Text2Tree: Aligning Text Representation to the Label Tree Hierarchy for Imbalanced Medical Classification.
Conference on Empirical Methods in Natural Language Processing(2023)
摘要
Deep learning approaches exhibit promising performances on various text
tasks. However, they are still struggling on medical text classification since
samples are often extremely imbalanced and scarce. Different from existing
mainstream approaches that focus on supplementary semantics with external
medical information, this paper aims to rethink the data challenges in medical
texts and present a novel framework-agnostic algorithm called Text2Tree that
only utilizes internal label hierarchy in training deep learning models. We
embed the ICD code tree structure of labels into cascade attention modules for
learning hierarchy-aware label representations. Two new learning schemes,
Similarity Surrogate Learning (SSL) and Dissimilarity Mixup Learning (DML), are
devised to boost text classification by reusing and distinguishing samples of
other labels following the label representation hierarchy, respectively.
Experiments on authoritative public datasets and real-world medical records
show that our approach stably achieves superior performances over classical and
advanced imbalanced classification methods.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要