Extracting Drug-drug Interactions from Biomedical Texts using Knowledge Graph Embeddings and Multi-focal Loss

Conference on Information and Knowledge Management (2022)

Abstract
Drug-drug interaction (DDI) extraction aims to detect descriptions of interactions between drugs in biomedical texts. Current approaches extract DDIs using pre-trained language models such as BERT, which often confuse two DDI types, "Effect" and "Int", on the DDIExtraction 2013 corpus because the two are expressed in highly similar ways. A knowledge graph can alleviate this problem because it encodes different relationships for each type, allowing them to be distinguished. We therefore propose a novel framework that integrates a neural network with a knowledge graph, where the features from the two components are complementary. Specifically, the neural network part takes text features at different levels into account: first, it obtains a word-level position feature using PubMedBERT together with a convolutional neural network; second, it obtains a phrase-level key path feature using a dependency parsing tree; third, it uses PubMedBERT with an attention mechanism to obtain a sentence-level language feature; finally, it fuses these three representations into a synthesized feature. We also extract a knowledge feature from a drug knowledge graph, which takes just a few minutes to construct, then concatenate the synthesized feature with the knowledge feature, feed the result into a multi-layer perceptron, and obtain the prediction from a softmax classifier. To integrate the synthesized feature and the knowledge feature well, we train the model using a novel multi-focal loss function, KGE-MFL, which is based on a knowledge graph embedding. We attain state-of-the-art results on the DDIExtraction 2013 dataset (micro F-score 86.24%) and on the ChemProt dataset (micro F-score 77.75%), which demonstrates that our framework is effective for biomedical relation extraction tasks. In particular, when predicting the "Int" type on the DDIExtraction 2013 corpus, we close the performance gap (more than 5.57%) between methods that rely on knowledge graph embeddings and methods that do not. The implementation code is available at https://github.com/NWU-IPMI/DDIE-KGE-MFL.
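The final classification step described in the abstract (concatenate the synthesized text feature with the knowledge feature, pass the result through a multi-layer perceptron, apply softmax) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration rather than the authors' implementation: the function names, the feature sizes (8 and 4), the single hidden layer of width 6, and the random untrained weights. The five output classes assume the four DDIExtraction 2013 interaction types plus a no-interaction class. The KGE-MFL loss is not sketched, since the abstract does not define it.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layer(n_out, n_in, rng):
    """Hypothetical helper: small random weights, zero bias (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(text_feature, knowledge_feature, layers):
    """Concatenate both features, apply the MLP, return class probabilities."""
    x = list(text_feature) + list(knowledge_feature)  # feature concatenation
    for weights, bias in layers[:-1]:
        x = relu(linear(x, weights, bias))            # hidden layers
    weights, bias = layers[-1]
    return softmax(linear(x, weights, bias))          # output layer + softmax


rng = random.Random(0)
# Stand-ins for the synthesized text feature and the knowledge feature.
text_feature = [rng.gauss(0.0, 1.0) for _ in range(8)]
knowledge_feature = [rng.gauss(0.0, 1.0) for _ in range(4)]
# Assumed shape: 12 inputs -> 6 hidden units -> 5 classes.
layers = [init_layer(6, 12, rng), init_layer(5, 6, rng)]

probs = classify(text_feature, knowledge_feature, layers)
predicted_class = probs.index(max(probs))
```

In this sketch `probs` is a probability distribution over the five assumed classes, and `predicted_class` is the index of the most likely one. In a trained model the weights would be learned rather than sampled at random.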
Keywords
knowledge graph embeddings, biomedical texts, drug-drug, multi-focal