Citrullination Site Prediction by Incorporating Sequence Coupled Effects into PseAAC and Resolving Data Imbalance Issue

Current Bioinformatics (2020)

Abstract
Background: Post-translational modification is one of the biomolecular mechanisms in living organisms that introduce functional diversity into proteins and regulate cellular processes. The conversion of an arginine residue to citrulline in a protein is one such modification. Objective: Our objective is to identify citrullinated arginine residue sites quickly and accurately. Methods: In this study, a novel computational tool, predCitru-Site, has been developed to predict citrullination sites. The method incorporates the sequence-coupling effect of the amino acids surrounding arginine residues and optimizes the skewed (imbalanced) citrullination training dataset to improve prediction quality. For comparability with existing tools, the performance of predCitru-Site was measured as the average of 5 complete runs of the 10-fold cross-validation test. Results and Conclusion: predCitru-Site achieved 97.6% sensitivity, 98.9% specificity, and an overall accuracy of 98.5%, with a Matthews correlation coefficient of 0.967 and an area under the receiver operating characteristic curve (AUC) of 0.997. On the same benchmark dataset, predCitru-Site significantly outperforms existing tools. It also shows significant improvement on independent tests in all performance metrics (around 50% higher in AUC). These results suggest that the method is promising and can serve as a complementary technique for fast exploration of citrullination at arginine residues. A user-friendly web server has also been deployed at http://research.ru.ac.bd/predCitru-Site/ for the convenience of experimental scientists.
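The metrics reported in the abstract (sensitivity, specificity, accuracy, and the Matthews correlation coefficient) follow standard definitions over confusion-matrix counts. The sketch below shows those definitions and how per-run values might be averaged over repeated cross-validation runs. It is illustrative only and not the authors' code: the function names `binary_metrics` and `mean_over_runs` and the example counts are hypothetical, not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: citrullination sites predicted correctly / missed.
    tn/fp: non-sites predicted correctly / wrongly flagged as sites.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


def mean_over_runs(runs):
    """Average each metric over several runs (e.g. repeated cross-validation)."""
    return {key: sum(run[key] for run in runs) / len(runs) for key in runs[0]}


# Hypothetical counts for two runs; these are NOT results from the paper.
run_a = binary_metrics(tp=40, fn=10, tn=45, fp=5)
run_b = binary_metrics(tp=50, fn=0, tn=50, fp=0)
averaged = mean_over_runs([run_a, run_b])
```

Because MCC uses all four cells of the confusion matrix, it is often reported alongside accuracy when the classes are imbalanced, as is the case for the citrullination data described here.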
Keywords
Citrullination sites prediction, sequence-coupling model, general PseAAC, data imbalance issue, support vector machine, computational