Rare Copy Number Variant analysis in case-control studies using SNP Array Data: a scalable and automated data analysis pipeline

crossref(2024)

Abstract
Background: Rare copy number variants (CNVs) significantly influence the human genome and may contribute to disease susceptibility. High-throughput SNP genotyping platforms provide data that can be used for CNV detection, but doing so requires chaining several bioinformatic tools into a complex pipeline. Here, we propose a flexible bioinformatic pipeline for rare CNV analysis from human SNP array data.

Results: The pipeline performs two major tasks: (1) CNV detection and quality control, and (2) rare CNV analysis. It is implemented in Snakemake following a rule-based structure that enables automation and scalability while maintaining flexibility.

Conclusions: Our pipeline automates the detection and analysis of rare CNVs. It implements rigorous CNV quality control, assesses the frequencies of these rare CNVs in patients versus controls, and evaluates the impact of CNVs on specific genes or pathways. We hence aim to provide an efficient yet flexible bioinformatic framework to investigate rare CNVs in biomedical research.

Competing Interest Statement: The authors have declared no competing interest.

Abbreviations:
CNV: Copy number variation
SNP: Single nucleotide polymorphism
GWAS: Genome wide association study
LRR: Log R Ratio
BAF: B allele frequency
WF: Waviness Factor
NumCNVs: Number of called CNVs
PFB: Population frequency of B allele
PCA: Principal component analysis
MDS: Multidimensional scaling
IBD: Identity by descent
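The abstract states that the pipeline assesses the frequencies of rare CNVs in patients versus controls but gives no implementation details. The following is only a minimal sketch of what such a comparison could look like, not the authors' method. It assumes CNV calls have already been collapsed to (sample, region) pairs, that "rare" means a carrier frequency below a hypothetical 1% threshold (`max_freq`), and that carriers are compared with a two-sided Fisher's exact test. The function names and the input layout are illustrative assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def rare_cnv_carrier_table(calls, status, max_freq=0.01):
    """Build a carrier/non-carrier by case/control table for rare CNVs.

    calls    : iterable of (sample_id, region_id) pairs (assumed input layout)
    status   : dict mapping sample_id -> "case" or "control"
    max_freq : assumed rarity threshold on carrier frequency (hypothetical)
    """
    carriers_by_region = {}
    for sample, region in calls:
        carriers_by_region.setdefault(region, set()).add(sample)

    n_samples = len(status)
    rare_carriers = set()
    for samples in carriers_by_region.values():
        if len(samples) / n_samples < max_freq:
            rare_carriers |= samples

    n_cases = sum(1 for v in status.values() if v == "case")
    n_controls = n_samples - n_cases
    case_carriers = sum(1 for s in rare_carriers if status[s] == "case")
    control_carriers = len(rare_carriers) - case_carriers
    return (case_carriers, n_cases - case_carriers,
            control_carriers, n_controls - control_carriers)


# Toy data; the threshold is raised to 0.5 only because the cohort is tiny.
status = {"s1": "case", "s2": "case", "s3": "control", "s4": "control"}
calls = [("s1", "r1"), ("s2", "r2"), ("s1", "r3"), ("s2", "r3"), ("s3", "r3")]
table = rare_cnv_carrier_table(calls, status, max_freq=0.5)
p_value = fisher_exact_two_sided(*table)
```

Real analyses typically define CNV frequency by reciprocal overlap between calls rather than by identical region labels, so the region-matching step here is a deliberate simplification.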