Document-Level Relation Extraction with Uncertainty Pseudo-Label Selection and Hard-Sample Focal Loss

JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS (2024)


Abstract

Relation extraction is a fundamental task in natural language processing that aims to identify structured relation triples in unstructured text. In recent years, research on relation extraction has gradually advanced from the sentence level to the document level. Most existing document-level relation extraction (DocRE) models are fully supervised, so their performance is limited by dataset quality. However, existing DocRE datasets suffer from annotation omission, which makes fully supervised models unsuitable for real-world scenarios. To address this issue, we propose a DocRE method based on uncertainty pseudo-label selection. The method first trains a teacher model to annotate pseudo-labels for a dataset with incomplete annotations, then trains a student model on the pseudo-labeled dataset, and finally uses the trained student model to predict relations on the test set. To mitigate the confirmation bias problem in pseudo-label methods, we perform adversarial training on the teacher model and calculate the uncertainty of the model output to supervise the generation of pseudo-labels. In addition, to address the hard-easy sample imbalance problem, we propose an adaptive hard-sample focal loss. This loss guides the model to reduce attention to easy-to-classify samples and outliers and to pay more attention to hard-to-classify samples. We conducted experiments on two public datasets, and the results proved the effectiveness of our method.
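
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how uncertainty is computed, so the sketch assumes it is the standard deviation of the teacher's probability over several stochastic forward passes (for example, with dropout left on). The names `select_pseudo_labels`, `prob_threshold`, and `max_std` and their default values are hypothetical. The loss shown is the standard binary focal loss, not the adaptive hard-sample variant proposed in the paper, which additionally reduces attention to outliers.

```python
import math
from statistics import mean, pstdev


def select_pseudo_labels(prob_samples, prob_threshold=0.9, max_std=0.05):
    """Pick confident, low-uncertainty candidates as positive pseudo-labels.

    prob_samples[k][i] is the teacher's probability for candidate i on the
    k-th stochastic forward pass. Both thresholds are illustrative
    assumptions, not values taken from the paper.
    """
    n_candidates = len(prob_samples[0])
    selected = []
    for i in range(n_candidates):
        scores = [run[i] for run in prob_samples]
        # Keep a candidate only if it is both confident and stable.
        if mean(scores) >= prob_threshold and pstdev(scores) <= max_std:
            selected.append(i)
    return selected


def focal_loss(p, y, gamma=2.0):
    """Standard binary focal loss for a single prediction.

    p is the predicted probability of the positive class; y is 0 or 1.
    A larger gamma shrinks the loss of well-classified (easy) samples more.
    """
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Three stochastic passes over three candidate relation instances.
teacher_runs = [
    [0.95, 0.99, 0.30],
    [0.97, 0.80, 0.28],
    [0.96, 0.99, 0.31],
]
kept = select_pseudo_labels(teacher_runs)
```

In this toy input the second candidate has a high mean probability but varies strongly between passes, so an uncertainty filter of this kind would drop it even though a plain confidence threshold would keep it.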


Keywords

information extraction, relationship extraction, pseudo label
