Cross-lingual Argument Mining in the Medical Domain
arXiv (2023)
Abstract
Nowadays the medical domain is receiving more and more attention in
applications involving Artificial Intelligence, as clinicians' decision-making is
increasingly dependent on dealing with enormous amounts of unstructured textual
data. In this context, Argument Mining (AM) helps to meaningfully structure
textual data by identifying the argumentative components in the text and
classifying the relations between them. However, as is the case for many
tasks in Natural Language Processing in general and in medical text processing
in particular, the large majority of the work on computational argumentation
has focused only on the English language. In this paper, we investigate
several strategies to perform AM in medical texts for a language such as
Spanish, for which no annotated data is available. Our work shows that
automatically translating and projecting annotations (data-transfer) from
English to a given target language is an effective way to generate annotated
data without costly manual intervention. Furthermore, and contrary to
conclusions from previous work for other sequence labelling tasks, our
experiments demonstrate that data-transfer outperforms methods based on the
cross-lingual transfer capabilities of multilingual pre-trained language models
(model-transfer). Finally, we show how the automatically generated data in
Spanish can also be used to improve results in the original English monolingual
setting, thus providing a fully automatic data augmentation strategy.
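
To make the data-transfer idea concrete, the following is a minimal sketch of annotation projection: labels on English tokens are carried over to the tokens of a translated sentence through word alignments. It is an illustration under stated assumptions, not the authors' pipeline. The helper names `extract_spans` and `project_labels`, the BIO tagging scheme, the `Claim` label, and the hand-written alignment pairs are all hypothetical; in practice the translation and the alignments would come from a machine translation system and a word aligner.

```python
from typing import List, Tuple


def extract_spans(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Return (type, start, end_inclusive) for every BIO-labelled span."""
    spans = []
    start, kind = None, None
    for i, tag in enumerate(labels):
        # Close the open span when the current tag does not continue it.
        if start is not None and tag != f"I-{kind}":
            spans.append((kind, start, i - 1))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    if start is not None:
        spans.append((kind, start, len(labels) - 1))
    return spans


def project_labels(
    src_labels: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Project BIO labels from source tokens to target tokens.

    `alignment` holds (source_index, target_index) pairs. Each source span
    is mapped to the smallest contiguous target span that covers all of
    its aligned target tokens (a simplifying assumption).
    """
    tgt_labels = ["O"] * tgt_len
    for kind, start, end in extract_spans(src_labels):
        tgt_idx = [t for s, t in alignment if start <= s <= end]
        if not tgt_idx:
            continue  # no aligned target tokens: the span is dropped
        lo, hi = min(tgt_idx), max(tgt_idx)
        tgt_labels[lo] = f"B-{kind}"
        for j in range(lo + 1, hi + 1):
            tgt_labels[j] = f"I-{kind}"
    return tgt_labels


# Hypothetical example: "Aspirin reduced pain ." -> "La aspirina redujo el dolor ."
src_labels = ["B-Claim", "I-Claim", "I-Claim", "O"]
alignment = [(0, 1), (1, 2), (2, 4), (3, 5)]  # assumed aligner output
projected = project_labels(src_labels, alignment, tgt_len=6)
print(projected)
```

In this toy example the unaligned article "La" stays outside the projected span, which hints at why projection quality depends on the aligner and why projected data may need filtering or correction.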