Species homology analysis method based on amino acid location and physicochemical properties
Proceedings of the 2021 2nd International Conference on Artificial Intelligence and Information Systems (ICAIIS '21), 2021
Abstract
Similarity analysis of protein sequences is a method for analyzing the homology between species. In this paper, we first encode a protein sequence into a 49-dimensional feature vector consisting of a 20-dimensional content ratio vector, a 20-dimensional position ratio vector of the amino acids, and a 9-dimensional mean physicochemical properties vector. We then use the Euclidean distance between feature vectors to characterize the similarity distance between protein sequences. We tested our method on two datasets: (1) ND5 sequences from 9 species, and (2) influenza A virus sequences from 28 species. To verify the validity and practicability of the method, we compute the correlation between its results and those of ClustalW.
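The encoding and comparison steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the position ratio, so the sketch assumes it is the sum of the 1-based positions of each amino acid divided by the sum of all positions; it does not name the nine physicochemical properties, so the caller supplies them in a hypothetical `table` argument; and it does not say which correlation coefficient is used, so Pearson's is assumed. All function names are illustrative.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def content_ratio(seq):
    """Fraction of the sequence made up by each amino acid (20 values)."""
    n = len(seq)
    return [seq.count(aa) / n for aa in AMINO_ACIDS]


def position_ratio(seq):
    """ASSUMED definition: sum of the 1-based positions of each amino acid,
    divided by the sum of all positions n(n+1)/2 (20 values)."""
    n = len(seq)
    total = n * (n + 1) / 2
    sums = dict.fromkeys(AMINO_ACIDS, 0)
    for i, aa in enumerate(seq, start=1):
        if aa in sums:  # skip non-standard residue codes
            sums[aa] += i
    return [sums[aa] / total for aa in AMINO_ACIDS]


def mean_properties(seq, table):
    """Mean of each property over the residues of the sequence.
    `table` maps an amino acid to a tuple of property values; which
    properties to use is the caller's choice (hypothetical input)."""
    k = len(next(iter(table.values())))
    rows = [table[aa] for aa in seq if aa in table]
    return [sum(row[j] for row in rows) / len(rows) for j in range(k)]


def encode(seq, table):
    """Concatenate the three parts: 20 + 20 + k values (49 when k = 9)."""
    return content_ratio(seq) + position_ratio(seq) + mean_properties(seq, table)


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pearson(x, y):
    """Pearson correlation between two equally long lists of distances
    (ASSUMED choice of correlation coefficient)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

With a property table of nine columns, `encode` returns a 49-entry vector per sequence. Pairwise `euclidean` distances between species can then be collected into a list and compared, via `pearson`, with the corresponding pairwise distances obtained from a ClustalW alignment. The sketch assumes non-empty sequences containing at least one standard residue.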
Keywords
Protein sequence, Mean-physicochemical properties, Similarity analysis, Correlation coefficient