Accurate transcription start sites enable mining for the cis-regulatory determinants of tissue specific gene expression

bioRxiv (2020)

Abstract
Across tissues, gene expression is regulated by a combination of determinants, including the binding of transcription factors (TFs), along with other aspects of cellular state. Recent studies emphasize the importance of both genetic and epigenetic states – TF binding sites and binding site chromatin accessibility have emerged as potentially causal determinants of tissue specificity. To investigate the relative contributions of these determinants, we constructed three genome-scale datasets for both root and shoot tissues of the same Arabidopsis thaliana plants: TSS-seq data to identify Transcription Start Sites, OC-seq data to identify regions of Open Chromatin, and RNA-seq data to assess gene expression levels. For genes that are differentially expressed between root and shoot, we constructed a machine learning model predicting tissue of expression from chromatin accessibility and TF binding information upstream of TSS locations. The resulting model was highly accurate (over 90% auROC and auPRC), and our analysis of model contributions (feature weights) strongly suggests that patterns of TF binding sites within ∼500 nt TSS-proximal regions are predominant explainers of tissue of expression in most cases. Thus, in plants, cis-regulatory control of tissue-specific gene expression appears to be primarily determined by TSS-proximal sequences, and rarely by distal enhancer-like accessible chromatin regions. This study highlights the exciting future possibility of a native TF site-based design process for the tissue-specific targeting of plant gene promoters.

Competing Interest Statement
The authors have declared no competing interest.
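
The two accuracy figures quoted in the abstract, auROC and auPRC, can be illustrated with a minimal sketch. This is not the authors' code: the function names `au_roc` and `au_prc`, the toy labels, and the toy scores are all hypothetical, and auPRC is estimated here as average precision, which is one common estimator and may differ from the one used in the paper.

```python
def au_roc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; a tie counts one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def au_prc(labels, scores):
    """Area under the precision-recall curve, estimated as average precision.

    Tied scores are not treated specially in this simple version.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data (not from the study):
# label 1 = root-expressed gene, 0 = shoot-expressed gene;
# score = a classifier's predicted probability of "root".
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

print(f"auROC = {au_roc(labels, scores):.3f}")  # 0.889
print(f"auPRC = {au_prc(labels, scores):.3f}")  # 0.917
```

Both metrics depend only on how the classifier ranks genes, not on a fixed decision threshold, which is why they are often reported together for two-class problems such as root versus shoot expression.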
Keywords
gene expression,specific gene expression,accurate transcription,cis-regulatory