A human ensemble cell atlas (hECA) enables in data cell sorting

semanticscholar(2021)

Abstract
The significance of building atlases of human cells as references for future biological and medical studies of humans in health and disease has been well recognized. Compared with the rapid accumulation of single-cell data, there has been less published work on the information structure for assembling cell atlases, or on methods for using reference atlases once they are ready. Most existing cell atlas efforts organize single-cell gene expression data as a collection of individual files, allowing users to download selected data sheets or to annotate query cells using models pretrained on the collected data. These features cover the basic uses of cell atlases. More comprehensive uses of global cell atlases can be developed once data on cells from multiple organs across different studies are assembled into one orchestrated data repository rather than a collection of data files. For this purpose, we present a unified giant table (uGT) to store and organize single-cell data from multiple studies in a single huge data repository, and a unified hierarchical annotation framework (uHAF) to annotate cells from uncoordinated studies. Based on these technologies, we developed a system that enables users to design complex rules to recruit from the atlas cells that meet certain conditions, such as a desired expression range for one or multiple genes and required organs, tissue origins or developmental stages, across multiple datasets that were otherwise unconnected. The conditions can be expressed as sophisticated logic criteria to pinpoint specific cells that cannot be easily spotted with traditional in vivo or in vitro cell sorting or by traditional searching in published data. We call this technology in data cell sorting from cell atlases.
As the coverage of the cell atlas increases, this in data experiment paradigm will help scientists conduct investigations in the data space, beyond the restrictions of traditional in vivo and in vitro experiments. In the current work, we collected scRNA-seq data of more than 1 million human cells from scattered studies, assembled them into a human Ensemble Cell Atlas (hECA) using the proposed information structure, and provided comprehensive tools for in data experiments based on the atlas. Case examples on the agile construction of atlases of particular cell types and on off-target prediction for targeted therapy showed that in data cell sorting is an efficient and effective way to make comprehensive discoveries. hECA provides a powerful platform for assembling massive scattered single-cell data into a unified atlas, and can serve as a prototype for building future cell atlases.
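
To make the idea of in data cell sorting concrete, the following is a minimal sketch in plain Python, not the hECA implementation or its API. It assumes a toy table in which every cell from every study is one record with shared fields, loosely standing in for the uGT, and it expresses sorting rules as composable logic criteria. The `Cell` class, the helper functions (`expr_between`, `organ_in`, `all_of`, `negate`, `sort_cells`), the study labels and all expression values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Cell:
    """One row of a toy unified table (hypothetical schema)."""
    cell_id: str
    study: str
    organ: str
    cell_type: str
    expression: Dict[str, float] = field(default_factory=dict)


# A criterion is any function that accepts a cell and returns True or False.
Criterion = Callable[[Cell], bool]


def expr_between(gene: str, low: float, high: float) -> Criterion:
    """Keep cells whose expression of `gene` lies in [low, high]."""
    return lambda cell: low <= cell.expression.get(gene, 0.0) <= high


def organ_in(*organs: str) -> Criterion:
    """Keep cells that originate from one of the given organs."""
    allowed = set(organs)
    return lambda cell: cell.organ in allowed


def all_of(*criteria: Criterion) -> Criterion:
    """Logical AND over several criteria."""
    return lambda cell: all(c(cell) for c in criteria)


def negate(criterion: Criterion) -> Criterion:
    """Logical NOT of a criterion."""
    return lambda cell: not criterion(cell)


def sort_cells(table: List[Cell], criterion: Criterion) -> List[Cell]:
    """Recruit every cell in the table that satisfies the criterion."""
    return [cell for cell in table if criterion(cell)]


# Toy data: cells from three unrelated, made-up studies in one table.
table = [
    Cell("c1", "study_A", "Blood", "B cell", {"CD19": 2.5}),
    Cell("c2", "study_B", "Brain", "Pericyte", {"CD19": 1.2}),
    Cell("c3", "study_B", "Brain", "Neuron", {"CD19": 0.0}),
    Cell("c4", "study_C", "Heart", "Cardiomyocyte", {"CD19": 0.1}),
    Cell("c5", "study_A", "Bone marrow", "B cell", {"CD19": 3.0}),
]

# Rule: CD19 expression of at least 1.0, outside blood and bone marrow.
rule = all_of(
    expr_between("CD19", 1.0, math.inf),
    negate(organ_in("Blood", "Bone marrow")),
)

hits = sort_cells(table, rule)
for cell in hits:
    print(cell.cell_id, cell.organ, cell.cell_type, cell.study)
# prints: c2 Brain Pericyte study_B
```

Because each criterion is an ordinary function, rules on gene expression, organ, or any other annotation can be nested freely and applied across records that came from different studies, which is the property the abstract attributes to a single unified repository as opposed to a collection of separate files.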