Biomedical Data Commons (BMDC) Prioritizes B-Lymphocyte Non-Coding Genetic Variants in Type 1 Diabetes

PLOS Computational Biology (2021)

Abstract
Author summary

The fragmentation of datasets hinders their repurposing because cleaning and joining them is time-intensive. This is especially true for Type 1 Diabetes, for which the genetic contributions from B-lymphocytes, a specific type of white blood cell, remain understudied. Here, we create Biomedical Data Commons (BMDC), a knowledge graph that maps datasets to common entities, making them easy to search using queries. We also built a genetic variant prioritization pipeline that uses multi-dimensional 'omics data, including three-dimensional connectome data. Using B-lymphocyte cell type-specific data as input, we prioritized variants associated with Type 1 Diabetes. The candidate variants identified are primarily of unknown clinical significance and lie in the non-coding genome. They are also connected with genes previously implicated in Type 1 Diabetes, suggesting that they affect cell type-specific gene regulation. Some variants in the HLA and IL2RA loci, which are important genomic regions for the regulation of immune function, have previously been validated in humans and mice. Other variants have been included in a well-established Type 1 Diabetes genetic risk scoring method. This validates our approach and highlights the novel variants identified, which should be prioritized for future clinical and experimental validation. BMDC is a community-based platform that increases the accessibility, reproducibility, and productivity of biomedical information for diverse applications, and our approach is widely applicable for prioritizing variants from other complex diseases.

The repurposing of biomedical data is inhibited by its fragmented and multi-formatted nature, which requires redundant investment of time and resources by data scientists. This is particularly true for Type 1 Diabetes (T1D), one of the most intensely studied common childhood diseases. The contributions of pancreatic beta-islets and T-lymphocytes to T1D have been investigated intensely. However, genetic contributions from B-lymphocytes, which are known to play a role in a subset of T1D patients, remain relatively understudied. We have addressed this issue through the creation of Biomedical Data Commons (BMDC), a knowledge graph that integrates data from multiple sources into a single queryable format. This increases the speed of analysis by multiple orders of magnitude. We develop a pipeline using B-lymphocyte multi-dimensional epigenome and connectome data and deploy BMDC to assess genetic variants in the context of T1D. Pipeline-identified variants are primarily common, non-coding, and poorly conserved, and are of unknown clinical significance. While variants and their chromatin connectivity are cell type-specific, they are associated with well-studied disease genes in T-lymphocytes. Candidates include established variants in the HLA-DQB1, HLA-DRB1, and IL2RA loci that have previously been demonstrated to protect against T1D in humans and mice, providing validation for this method. Others are included in the well-established T1D GRS2 genetic risk scoring method. More intriguingly, other prioritized variants are completely novel and form the basis for future mechanistic and clinical validation studies. The BMDC community-based platform can be expanded and repurposed to increase the accessibility, reproducibility, and productivity of biomedical information for diverse applications, including the prioritization of cell type-specific disease alleles from complex phenotypes.
