Fast and accurate approximation of the joint site frequency spectrum of multiple populations

bioRxiv (2020)

Abstract
The site frequency spectrum (SFS) is a statistic that summarizes the distribution of derived allele frequencies in a sample of DNA sequences. The SFS provides useful information about genetic variation within and among populations, and it can be used to make population genetic inferences. Methods for computing the SFS based on the diffusion approximation are computationally efficient when all terms of the SFS are computed simultaneously, and they can handle complicated demographic scenarios. In practice, however, it is sometimes necessary to compute only a subset of the terms of the SFS, in which case coalescent-based methods can achieve greater computational efficiency. Here, we present simple and accurate approximate formulas for the expected joint SFS for multiple populations connected by migration. Compared with existing exact approaches, our approximate formulas have simple forms and greatly reduce the complexity of computing each entry of the SFS. The computational complexity of our method depends on the index of the entry to be computed, rather than on the sample size, and the accuracy of our approximation improves as the sample size increases.
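To make the definition concrete, the following is a minimal sketch of how an empirical SFS and a two-population joint SFS can be tabulated from sampled sequences. It illustrates only the definition of the statistic, not the approximate formulas of the paper. The function names `sfs` and `joint_sfs`, the 0/1 encoding (1 = derived allele), and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sfs(haplotypes):
    """Empirical SFS of one population (hypothetical helper).

    haplotypes: (n_samples, n_sites) array of 0/1, where 1 marks the
    derived allele. Entry i of the result counts the sites at which
    exactly i of the n sampled sequences carry the derived allele.
    """
    h = np.asarray(haplotypes, dtype=int)
    n = h.shape[0]
    derived_counts = h.sum(axis=0)  # derived-allele count per site
    return np.bincount(derived_counts, minlength=n + 1)

def joint_sfs(hap_a, hap_b):
    """Empirical joint SFS of two populations typed at the same sites.

    Entry (i, j) counts the sites with i derived copies in the sample
    from population A and j derived copies in the sample from B.
    """
    a = np.asarray(hap_a, dtype=int)
    b = np.asarray(hap_b, dtype=int)
    if a.shape[1] != b.shape[1]:
        raise ValueError("both samples must cover the same sites")
    counts_a = a.sum(axis=0)
    counts_b = b.sum(axis=0)
    table = np.zeros((a.shape[0] + 1, b.shape[0] + 1), dtype=int)
    # np.add.at accumulates correctly when an (i, j) pair repeats.
    np.add.at(table, (counts_a, counts_b), 1)
    return table

# Toy data (assumed): 3 sequences from A, 2 from B, 4 shared sites.
pop_a = [[1, 0, 0, 1],
         [1, 0, 1, 0],
         [0, 0, 1, 0]]
pop_b = [[1, 0, 0, 0],
         [1, 1, 0, 0]]

print(sfs(pop_a))
print(joint_sfs(pop_a, pop_b))
```

Each site contributes to exactly one entry of the table, so a sample of sizes n_A and n_B yields (n_A + 1)(n_B + 1) entries. This is why methods that compute individual entries can be attractive when only a few of them are needed.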
Keywords
site frequency spectrum, complex demography, approximation, coalescent