Named entity recognition in the perovskite field based on convolutional neural networks and MatBERT

Jiaxin Zhang, Lingxue Zhang, Yuxuan Sun, Wei Li, Ruge Quhe

Computational Materials Science (2024)

Abstract
The rapid growth in materials science publications has created a bottleneck in organizing materials science knowledge and discovering new materials. The literature on the emerging field of perovskite materials has grown to a massive scale, and it is necessary to compile information on the structure, properties, synthesis methods, characterization techniques, and applications of these materials. To address this issue, we employed named entity recognition, a natural language processing technique, to extract important entities from perovskite material texts. In this paper, we propose a method based on convolutional neural networks (CNN) and MatBERT. First, we used MatBERT, which has been pre-trained on a large amount of materials science text, to generate contextualized word embeddings. Next, we extracted feature information using a CNN model. Finally, a conditional random field (CRF) layer was used to decode the tag sequences and to calculate the training and validation loss. Experimental results demonstrated that the performance of our model on a perovskite materials dataset improved by 1% to 6% compared with the BERT, SciBERT and MatBERT models. Using this model, we extracted entities from 2389 abstracts to obtain knowledge of perovskite materials.
Keywords
Named Entity Recognition, BERT, Convolutional Neural Network, Conditional Random Field
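
The abstract's final step, decoding a tag sequence with a CRF layer, can be illustrated with a minimal Viterbi sketch in plain Python. This is not the authors' implementation: the tag set (`O`, `B-MAT`, `I-MAT`), the example tokens, and every score below are hypothetical values chosen for illustration. In the described model, the per-token emission scores would presumably come from the CNN applied to MatBERT embeddings, and the transition scores would be learned during training.

```python
from typing import Dict, List

# Hypothetical BIO tag set for a single "material" entity type.
TAGS = ["O", "B-MAT", "I-MAT"]

# Hypothetical transition scores: TRANSITIONS[prev][cur].
# "O" -> "I-MAT" is heavily penalized because an entity cannot start with I-.
TRANSITIONS = {
    "O":     {"O": 0.0, "B-MAT": 0.0,  "I-MAT": -10.0},
    "B-MAT": {"O": 0.0, "B-MAT": -1.0, "I-MAT": 1.0},
    "I-MAT": {"O": 0.0, "B-MAT": -1.0, "I-MAT": 0.5},
}

# Hypothetical emission scores for the tokens
# "methylammonium", "lead", "iodide", "films".
EMISSIONS = [
    {"O": 0.1, "B-MAT": 2.0, "I-MAT": 0.5},
    {"O": 1.0, "B-MAT": 0.2, "I-MAT": 0.9},
    {"O": 0.3, "B-MAT": 0.1, "I-MAT": 1.5},
    {"O": 2.0, "B-MAT": 0.0, "I-MAT": 0.1},
]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF."""
    if not emissions:
        return []
    # best[t][tag]: best score of any tag path ending in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    # back[t - 1][tag]: previous tag on that best path.
    back = []
    for t in range(1, len(emissions)):
        scores = {}
        pointers = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
# prints ['B-MAT', 'I-MAT', 'I-MAT', 'O']
```

With these scores, picking the best tag for each token independently would label the second token `O`, since its emission score of 1.0 beats 0.9 for `I-MAT`. The transition scores pull it into the entity span instead, which is the kind of sequence-level consistency a CRF layer adds on top of per-token features.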