AlphaViz: Visualization and validation of critical proteomics data directly at the raw data level

bioRxiv (2022)

Abstract
Although current mass spectrometry (MS)-based proteomics identifies and quantifies thousands of proteins and (modified) peptides, only a minority of them are subjected to in-depth downstream analysis. With the advent of automated processing workflows, biologically or clinically important results within a study are rarely validated by visualization of the underlying raw information. Current tools are often neither integrated into the overall analysis nor readily extendable with new approaches. To remedy this, we developed AlphaViz, an open-source Python package to superimpose output from common analysis workflows on the raw data for easy visualization and validation of protein and peptide identifications. AlphaViz takes advantage of recent breakthroughs in the deep learning-assisted prediction of experimental peptide properties to allow manual assessment of the expected versus measured peptide result. We focused on the visualization of the 4-dimensional data cuboid provided by Bruker timsTOF instruments, where the ion mobility dimension, besides intensity and retention time, can be predicted and used for verification. We illustrate how AlphaViz can quickly validate or invalidate peptide identifications regardless of the score given to them by automated workflows. Furthermore, we provide a ‘predict mode’ that can locate peptides present in the raw data but not reported by the search engine. This is illustrated by the recovery of missing values from experimental replicates. Applied to phosphoproteomics, we show how key signaling nodes can be validated to enhance confidence for downstream interpretation or follow-up experiments. AlphaViz follows standards for open-source software development and features an easy-to-install graphical user interface for end-users and a modular Python package for bioinformaticians. Validation of critical proteomics results should now become a standard feature in MS-based proteomics.
### Competing Interest Statement

The authors have declared no competing interest.

Abbreviations:

* BPI: base peak intensity
* CCS: collisional cross section
* DDA: data-dependent acquisition
* DIA: data-independent acquisition
* EGF: epidermal growth factor
* FDR: false discovery rate
* GOBP: Gene Ontology Biological Process
* GUI: graphical user interface
* IM: ion mobility
* IQR: interquartile range
* MS/MS (or MS2): tandem MS
* PASEF: parallel accumulation – serial fragmentation
* PEP: posterior error probability
* PTM: post-translational modification
* PyPI: Python Package Index
* RT: retention time
* TIC: total ion current
* TIMS: trapped ion mobility spectrometry
* TOF: time-of-flight
* XIC: extracted ion chromatogram
Keywords
critical proteomics data,visualization