Re-Examine Distantly Supervised NER: A New Benchmark and a Simple Approach
CoRR(2024)
摘要
This paper delves into Named Entity Recognition (NER) under the framework of
Distant Supervision (DS-NER), where the main challenge lies in the compromised
quality of labels due to inherent errors such as false positives, false
negatives, and positive type errors. We critically assess the efficacy of
current DS-NER methodologies using a real-world benchmark dataset named QTL,
revealing that their performance often does not meet expectations. To tackle
the prevalent issue of label noise, we introduce a simple yet effective
approach, Curriculum-based Positive-Unlabeled Learning CuPUL, which
strategically starts on "easy" and cleaner samples during the training process
to enhance model resilience to noisy samples. Our empirical results highlight
the capability of CuPUL to significantly reduce the impact of noisy labels and
outperform existing methods.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要