
Machine Learning Models for Automatic Gene Ontology Annotation of Biological Texts

Artificial Intelligence in Medicine (2023)

Abstract
Gene Ontology (GO) is a major source of biological knowledge that describes the functions of genes and gene products using a comprehensive controlled vocabulary of terms organized in a hierarchical structure. Automatic annotation of biological texts with GO terms has gained the attention of the scientific community because it helps to quickly identify relevant documents or passages related to specific biological functions or processes. In this paper, we propose and investigate a new GO-term annotation strategy that uses a non-parametric k-nearest neighbor model and relies on various vector-based representations of documents and of the GO terms linked to these documents. Our vector representations are based on machine learning and natural language processing (NLP) models, including singular value decomposition, Word2Vec and topic-based scoring. We evaluate the performance of our model on a large benchmark corpus using a variety of standard and hierarchical evaluation metrics.
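The k-nearest neighbor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names singular value decomposition, Word2Vec and topic-based scoring as document representations, whereas this example swaps in a plain bag-of-words vector to stay dependency-free. The function names (`embed`, `cosine`, `knn_annotate`), the toy corpus, the choice of GO terms attached to each text, and the similarity-weighted voting rule are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy document vector: lowercase bag-of-words counts.

    Assumption: a stand-in for the SVD / Word2Vec / topic-based vectors
    mentioned in the abstract.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_annotate(query, corpus, k=2, top_n=1):
    """Suggest GO terms for `query` from its k most similar annotated texts.

    corpus: list of (text, set_of_go_ids) pairs.
    Each neighbor votes for its GO terms with a weight equal to its
    similarity to the query (a hypothetical aggregation rule).
    """
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(text)), terms) for text, terms in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = defaultdict(float)
    for sim, terms in scored[:k]:
        for term in terms:
            votes[term] += sim
    # Rank by descending vote, breaking ties by GO id for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


# Illustrative corpus; the texts and their annotations are made up.
CORPUS = [
    ("dna repair of double strand breaks", {"GO:0006281"}),
    ("dna damage response and repair pathway", {"GO:0006281"}),
    ("apoptotic cell death signaling", {"GO:0006915"}),
    ("ribosome protein translation", {"GO:0006412"}),
]

print(knn_annotate("repair of dna damage", CORPUS, k=2, top_n=1))
```

In this sketch the two nearest neighbors of the query both carry the same GO term, so that term receives the highest vote. A hierarchical evaluation, as mentioned in the abstract, would additionally need the GO graph, which is outside the scope of this example.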
Keywords
Gene Ontology (GO), GO-term text annotation