A K-Mer Based Sequence Similarity for Pangenomic Analyses

Vincenzo Bonnici, Andrea Cracco, Giuditta Franco

Machine Learning, Optimization, and Data Science 7th International Conference, LOD 2021, Grasmere, UK, October 4–8, 2021, Revised Selected Papers, Part II (2021)

Abstract

In this work we propose an approach that improves the performance of a current methodology for pangenomic analyses, which computes k-mer based sequence similarity via the Jaccard index. Recent studies have shown that this measure performs well at retrieving homology among genetic sequences belonging to a group of genomes. Our improvement is obtained by exploiting a suitable k-mer representation, which enables fast and memory-efficient computation of sequence similarity. Experimental results on genomes of living organisms of different species provide evidence that the approach improves on a state-of-the-art methodology in terms of running time and memory requirements.
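
As a rough illustration of the measure named in the abstract, the following minimal sketch computes a Jaccard index over the k-mer content of two sequences. It is not the authors' implementation and does not reproduce the optimized k-mer representation the paper proposes. The function names `kmer_set` and `jaccard_similarity` are hypothetical, and treating k-mer content as a plain set (ignoring multiplicities) is an assumption made here for simplicity.

```python
def kmer_set(seq: str, k: int) -> set:
    """Return the set of distinct k-mers (substrings of length k) in seq.

    Assumption: k-mer content is treated as a set, so repeated k-mers
    count once. Sequences shorter than k yield an empty set.
    """
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_similarity(a: str, b: str, k: int) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| over the k-mer sets of a and b."""
    kmers_a = kmer_set(a, k)
    kmers_b = kmer_set(b, k)
    union = kmers_a | kmers_b
    if not union:
        # Neither sequence contains a k-mer; define the similarity as 0.
        return 0.0
    return len(kmers_a & kmers_b) / len(union)


if __name__ == "__main__":
    # Toy sequences, chosen only for illustration.
    print(jaccard_similarity("ACGTA", "CGTAC", 3))
```

Building an explicit set of substrings per sequence is the naive baseline; a compact k-mer representation of the kind the abstract refers to would aim to avoid exactly this time and memory cost.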

Keywords

AF sequence similarity, Genomic dictionary, Jaccard index, k-mer content, Pangenome
