Cimind: A Phonetic-Based Tool for Multilingual Named Entity Recognition in Biomedical Texts

Journal of Biomedical Informatics (2019)

Abstract
BACKGROUND: Extracting concepts from biomedical texts is key to supporting many advanced applications such as biomedical information retrieval. However, in clinical notes, Named Entity Recognition (NER) has to deal with various types of errors such as spelling errors, grammatical errors, truncated sentences, and non-standard abbreviations. Moreover, in numerous countries, NER is hampered because many of the available resources were originally developed for, and are suitable only for, English texts. This paper presents the Cimind system, a multilingual system dedicated to named entity recognition in medical texts based on a phonetic similarity measure.

METHODS: Cimind performs entity recognition by combining phonetic recognition, using the DM phonetic algorithm to deal with spelling errors, with string similarity measures. It identifies terms in a controlled vocabulary in three main steps: normalization, candidate selection by phonetic similarity, and candidate ranking.

RESULTS: Cimind was evaluated in the 2016 and 2017 editions of the CLEF eHealth challenge in the CépiDC/CDC tasks. In 2017, it obtained the following results on each corpus: English dataset: 83.9% P, 78.3% R, 81.0% F1; French raw dataset: 85.7% P, 68.9% R, 76.4% F1; French aligned dataset: 83.5% P, 77.5% R, 80.4% F1. It ranked first in French and fourth in English in official runs.
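The three steps named in METHODS (normalization, candidate selection by phonetic similarity, candidate ranking) can be illustrated with a minimal Python sketch. It is not Cimind's implementation: a plain Soundex key stands in for the DM phonetic algorithm, candidates are selected by exact match of the phonetic key, and `difflib.SequenceMatcher` stands in for the string similarity measures, none of which the abstract specifies. The vocabulary, the concept identifiers (`T001` and so on), and all function names are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher

# Soundex digit classes (assumption: Soundex replaces the paper's DM algorithm).
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def normalize(text):
    """Step 1: lowercase, strip accents, drop punctuation, split into tokens."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c if c.isalnum() else " " for c in stripped).split()


def soundex(word):
    """Four-character Soundex key of one normalized token."""
    word = "".join(c for c in word if c.isalpha())
    if not word:
        return ""
    key = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        if ch not in "hw":  # h and w do not separate equal codes
            prev = code
    return (key + "000")[:4]


def phonetic_key(text):
    """Phonetic signature of a whole term: one Soundex key per token."""
    return tuple(soundex(token) for token in normalize(text))


def build_index(vocabulary):
    """Group vocabulary terms by phonetic signature for fast lookup."""
    index = {}
    for term, concept_id in vocabulary.items():
        index.setdefault(phonetic_key(term), []).append((term, concept_id))
    return index


def recognize(text, index):
    """Steps 2 and 3: select candidates by phonetic key, rank by similarity."""
    query = " ".join(normalize(text))
    candidates = index.get(phonetic_key(text), [])
    scored = [
        (term, cid,
         SequenceMatcher(None, query, " ".join(normalize(term))).ratio())
        for term, cid in candidates
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical controlled vocabulary with placeholder concept identifiers.
VOCABULARY = {
    "pneumonia": "T001",
    "myocardial infarction": "T002",
    "diabetes mellitus": "T003",
    "pneumonitis": "T004",
}
INDEX = build_index(VOCABULARY)

if __name__ == "__main__":
    for note in ("pnemonia", "miocardial infraction"):
        print(note, "->", recognize(note, INDEX))
```

In this sketch the misspelling "pnemonia" shares a phonetic key with both "pneumonia" and "pneumonitis", so the ranking step is what separates them. Exact key matching is the simplest possible selection rule; a real system would likely tolerate partial or fuzzy key matches.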
Keywords
Named Entity Recognition, Natural Language Processing, Controlled Vocabulary