[Identification of nucleosome positioning using support vector machine method based on comprehensive DNA sequence feature].

Sheng wu yi xue gong cheng xue za zhi = Journal of biomedical engineering = Shengwu yixue gongchengxue zazhi (2020)

Abstract
In this article, a model for nucleosome sequences was constructed based on z-curve theory and the position weight matrix (PWM). The nucleosome sequence dataset was transformed into three-dimensional coordinates, the PWM of the nucleosome sequences was calculated, and a similarity score was obtained. Integrating these yielded a nucleosome feature model based on comprehensive DNA sequence features, named CSeqFM. We calculated the Euclidean distance between each candidate nucleosome sequence or linker sequence and the CSeqFM model to form the feature dataset, which was then fed into a support vector machine (SVM) for training and testing with ten-fold cross-validation. For S. cerevisiae, the sensitivity, specificity, accuracy and Matthews correlation coefficient (MCC) of nucleosome positioning identification were 97.1%, 96.9%, 94.2% and 0.89, respectively, and the area under the receiver operating characteristic curve (AUC) was 0.9801. Compared with another z-curve-based method, our method identified nucleosome positioning more effectively and performed better on every evaluation metric. The CSeqFM method was also applied to identify nucleosome positioning in three other species: C. elegans, H. sapiens and D. melanogaster. The AUCs for all three species were higher than 0.90, and CSeqFM showed better stability and effectiveness than the iNuc-STNC and iNuc-PseKNC methods, which further demonstrates that CSeqFM is reliable and has good identification performance.
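The feature-construction steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the pseudocount, the uniform 0.25 background and the way the z-curve endpoint is concatenated with the PWM score in `features` are all assumptions made for illustration, since the abstract does not specify how the components are combined. Only the z-curve coordinate definition itself is the standard one.

```python
import math

BASES = "ACGT"

def z_curve(seq):
    """Cumulative z-curve points (x_n, y_n, z_n) for a DNA sequence.

    Standard definition from cumulative base counts:
      x = (A + G) - (C + T)   purine vs. pyrimidine
      y = (A + C) - (G + T)   amino vs. keto
      z = (A + T) - (C + G)   weak vs. strong hydrogen bonding
    """
    counts = {b: 0 for b in BASES}
    points = []
    for base in seq.upper():
        if base in counts:  # unknown symbols leave the counts unchanged
            counts[base] += 1
        a, c, g, t = (counts[b] for b in BASES)
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (c + g)))
    return points

def build_pwm(seqs, pseudocount=1.0):
    """Log-odds PWM from equal-length sequences (assumed 0.25 background)."""
    pwm = []
    for i in range(len(seqs[0])):
        col = {b: pseudocount for b in BASES}
        for s in seqs:
            col[s[i].upper()] += 1
        total = sum(col.values())
        pwm.append({b: math.log2((col[b] / total) / 0.25) for b in BASES})
    return pwm

def pwm_score(pwm, seq):
    """Similarity score: sum of per-position log-odds for the sequence."""
    return sum(col[b] for col, b in zip(pwm, seq.upper()))

def features(seq, pwm):
    """Hypothetical feature vector: z-curve endpoint plus PWM score."""
    x, y, z = z_curve(seq)[-1]
    return [x, y, z, pwm_score(pwm, seq)]

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Under these assumptions, the distance from each candidate's feature vector to a reference vector built from known nucleosome sequences would be the kind of value passed to an SVM classifier and evaluated with ten-fold cross-validation.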