Lysine Malonylation Identification in E. coli with Multiple Features

Current Proteomics (2019)

Abstract
Motivation: Lysine malonylation was discovered in eukaryotic proteins in 2011 through high-throughput proteomic analysis, but it has remained poorly understood in prokaryotes. Recent research has shown that malonylation in E. coli is significantly enriched in protein translation, energy metabolism pathways and fatty acid biosynthesis. Results: In this work we propose a predictor that identifies lysine malonylation sites in E. coli from physicochemical properties, binary encoding and sequence frequency, using a support vector machine. The experimentally determined lysine malonylation sites were retrieved from the first and, to date, largest malonylome dataset in prokaryotes. The combination of physicochemical properties and position-specific amino acid sequence propensity features gave the best results, with an AUC (area under the receiver operating characteristic curve) of 0.7994 and an MCC (Matthews correlation coefficient) of 0.4335 in 10-fold cross-validation. The AUC values were 0.7800, 0.7851 and 0.8050 in 6-fold, 8-fold and LOO (leave-one-out) cross-validation, respectively. All the ROC curves were close to one another, which illustrates the robustness of the proposed predictor. We also analyzed the sequence propensities with TwoSampleLogo and found some significant differences between peptides (t-test, p<0.01). The predictor gave better results than K-Nearest Neighbors, the C4.5 decision tree, Naive Bayes and Random Forest. Functional analysis showed that malonylated proteins are involved in many transcription activities and diverse biological processes. We also developed a package that can be freely downloaded from https://github.com/Sunmile/Malonylation E.coli.
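As an illustration of two ingredients named in the abstract, the following minimal sketch shows one common way to build a binary (one-hot) code for a peptide window and how the MCC is computed from confusion-matrix counts. It is not taken from the authors' package: the function names `binary_encode` and `mcc`, the 20-letter alphabet order, and the choice to map unknown or padding residues to all zeros are assumptions made only for this example.

```python
import math

# Assumed alphabet: the 20 standard amino acids in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_encode(window):
    """One-hot encode a peptide window centred on a candidate lysine.

    Each residue becomes a 20-dimensional 0/1 vector. Residues outside the
    assumed alphabet (for example a padding symbol) become all zeros.
    """
    features = []
    for residue in window:
        vec = [0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:
            vec[idx] = 1
        features.extend(vec)
    return features


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

A window of length n therefore yields 20 x n binary features, which could be concatenated with physicochemical or frequency features before being passed to a classifier such as a support vector machine.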
Keywords
Malonylation, support vector machine, post-translational modification, E. coli, receiver operating characteristic (ROC), prokaryotes