TIDD: tool-independent and data-dependent machine learning for peptide identification

Honglan Li, Seungjin Na, Kyu-Baek Hwang, Eunok Paek

BMC Bioinformatics (2022)

Abstract

Background: In shotgun proteomics, database search engines have been developed to assign peptides to tandem mass (MS/MS) spectra. At the same time, post-processing (or rescoring) approaches over the search results have been proposed to increase the number of confident peptide identifications. The most popular post-processing approaches, such as Percolator and PeptideProphet, have improved peptide identification rates by combining multiple scores from database search engines using machine learning techniques. Existing post-processing approaches, however, are limited when dealing with results from new search engines, because their machine learning features must be optimized specifically for each search engine.

Results: We propose a universal post-processing tool, called TIDD, which supports confident peptide identifications regardless of the search engine adopted. TIDD can work with any search engine, including newly developed ones, because it calculates universal features that assess peptide-spectrum match (PSM) quality, while also allowing additional features provided by search engines or users. Even though it relies on universal features independent of search tools, TIDD showed similar or better performance than Percolator in terms of peptide identification. For MSFragger, which is not supported by Percolator, TIDD identified 10.23–38.95% more PSMs than target-decoy estimation. TIDD offers a simple, easy-to-use graphical user interface.

Conclusions: TIDD successfully eliminated the need to optimize feature engineering for each database search tool, and thus can be applied directly to any database search results, including those from newly developed tools.
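The target-decoy estimation used as the baseline above can be illustrated with a minimal sketch. This is not TIDD's code: the function name `count_psms_at_fdr`, the `(score, is_decoy)` input layout, and the simple decoys/targets FDR estimator are assumptions made for illustration, and score ties are not handled specially.

```python
from typing import List, Tuple


def count_psms_at_fdr(psms: List[Tuple[float, bool]],
                      fdr_threshold: float = 0.01) -> int:
    """Count target PSMs accepted at a given estimated FDR.

    psms: (score, is_decoy) pairs, where a higher score means a better match.
    The FDR at a score cutoff is estimated as (#decoys / #targets) among
    the PSMs scoring at or above that cutoff (an assumed, simple estimator).
    """
    # Walk the PSMs from best to worst score.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the largest target count whose estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= fdr_threshold:
            accepted = targets
    return accepted


# Hypothetical toy data: four target hits and one decoy hit.
example = [(10.0, False), (9.0, False), (8.0, False), (7.0, True), (6.0, False)]
n_strict = count_psms_at_fdr(example, fdr_threshold=0.01)
n_loose = count_psms_at_fdr(example, fdr_threshold=0.25)
```

A rescoring tool would replace the search-engine score with a learned score before applying the same kind of cutoff; the gain reported in the abstract is the increase in accepted PSMs at a fixed FDR.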

Keywords

Mass spectrometry, Peptide identification, PSM rescoring, Tool-independent, Data-dependent, Machine learning
