Analyzing Histone ChIP-seq Data with a Bin-Based Probability of Being Signal

Vivian Hecht, Kevin Dong, Sreshtaa Rajesh, Polina Shpilker, Siddarth Wekhande, Noam Shoresh

PLOS Computational Biology (2023)

Abstract
Histone ChIP-seq is one of the primary methods for charting the cellular epigenomic landscape, the components of which play a critical regulatory role in gene expression. Analyzing the activity of regulatory elements across datasets and cell types can be challenging due to shifting peak positions and normalization artifacts resulting from, for example, differing read depths, ChIP efficiencies, and target sizes. Moreover, broad regions of enrichment seen in repressive histone marks often evade detection by commonly used peak callers. Here, we present a simple and versatile method for identifying enriched regions in ChIP-seq data that relies on estimating a gamma distribution fit to non-overlapping 5 kB genomic bins to establish a global background. We use this distribution to assign a probability of being signal (PBS) between zero and one to each 5 kB bin. This approach, while lower in resolution than typical peak-calling methods, provides a straightforward way to identify enriched regions and compare enrichments among multiple datasets by transforming the data to values that are universally normalized and can be readily visualized and integrated with downstream analysis methods. We demonstrate applications of PBS for both broad and narrow histone marks, and provide several illustrations of biological insights that can be gleaned by integrating PBS scores with downstream data types.

Histone modifications are key to epigenetic regulation of gene expression. Their genomic distributions are most commonly measured using ChIP-seq, a next-generation sequencing method that produces pileups of sequencing reads near regions with modified histones. Comparing the signal between histone modification profiles in different cellular contexts is often challenging due to the lack of a clear standard for interpreting signal strength, the inherent variability in the positions of histones, and the difficulty of detecting very broad regions that are only modestly enriched compared to the background. We present a method, which we call PBS, or probability of being signal, to simplify and streamline the process of identifying where true signal can be found in ChIP-seq datasets, regardless of whether the signal is broad or narrow. We demonstrate how our method can be used to visualize, quantify, and compare signal among ChIP-seq datasets, and to easily integrate ChIP-seq data with additional data types. We show how PBS can both stand on its own and add power to other commonly used analysis methods. We anticipate that its versatility and straightforward interpretation will prove useful in many applications.
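The binning and scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the gamma background is fit or how the probability is derived from it. The sketch assumes a method-of-moments gamma fit to the bins at or below a chosen quantile, and it scores each read-count value by the fraction of observed bins in excess of the background expectation. The function names (`bin_counts`, `fit_background_gamma`, `pbs_from_counts`) and the `background_quantile` parameter are hypothetical.

```python
import numpy as np
from scipy import stats


def bin_counts(read_positions, chrom_length, bin_size=5000):
    """Count reads in non-overlapping bins of bin_size bp (5 kB by default)."""
    n_bins = int(np.ceil(chrom_length / bin_size))
    idx = np.asarray(read_positions, dtype=np.int64) // bin_size
    return np.bincount(idx, minlength=n_bins).astype(float)


def fit_background_gamma(counts, background_quantile=0.9):
    """Method-of-moments gamma fit to the presumed background bins.

    Assumption (not from the paper): bins at or below background_quantile
    are treated as background. Truncating the upper tail biases the fit
    when real background bins are cut off.
    """
    counts = np.asarray(counts, dtype=float)
    cutoff = np.quantile(counts, background_quantile)
    bg = counts[counts <= cutoff]
    mean, var = bg.mean(), bg.var()
    if var == 0:
        raise ValueError("background bins have zero variance")
    shape = mean ** 2 / var   # gamma shape k = mean^2 / variance
    scale = var / mean        # gamma scale theta = variance / mean
    return shape, scale, bg.size


def pbs_from_counts(counts, shape, scale, n_background):
    """Illustrative per-bin score in [0, 1].

    For each distinct count value c, compare the observed number of bins
    with the number expected from the fitted background, then take
    1 - expected / observed, clipped to [0, 1]. This formula is an
    assumption made for the sketch.
    """
    counts = np.asarray(counts, dtype=float)
    values, inverse, observed = np.unique(
        counts, return_inverse=True, return_counts=True
    )
    # Probability mass the gamma assigns to the unit interval around c.
    mass = (stats.gamma.cdf(values + 0.5, a=shape, scale=scale)
            - stats.gamma.cdf(values - 0.5, a=shape, scale=scale))
    expected = n_background * mass
    score = np.clip(1.0 - expected / observed, 0.0, 1.0)
    return score[inverse]
```

Because each bin's score lies between zero and one regardless of read depth, scores from different datasets can be placed side by side without further normalization, which is the property the abstract emphasizes. In practice the background fit would need more care than the quantile cutoff assumed here.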