Gold Standard Dataset For Alzheimer Genes

Data in Brief (2020)

Abstract
Alzheimer disease is a genetically complex, multigenic neurodegenerative disorder that results from the interaction of multiple genes. Most earlier studies reported only a few specific genes involved in Alzheimer disease. However, hundreds of susceptibility genes have since been observed to play a significant role in its development and progression. Among existing data resources, the Genetic Association Database is the most popular; it records genes, the class of each association (positive, negative, or neutral), and a supporting reference. However, it contains many false-positive and false-negative associations. Taking this data as a reference, we performed double-fold cross-validation to compile a comprehensive list of Alzheimer genes, the class of each gene's association with the disease (positive, negative, or ambiguous), and a reference sentence confirming the association. The resulting data can serve as a GOLD standard reference data set for training machine learning classifiers to predict the classification of published literature, not only for Alzheimer disease but for other diseases as well. In addition, the positively associated genes can be used for system-level modelling or meta-analysis of Alzheimer disease. © 2020 The Author(s). Published by Elsevier Inc.
Keywords
Alzheimer genes, Cross validation, GOLD standard, Meta analysis, System modeling, Text classification, Machine learning, Alzheimer gene association