A Space-Saving Based MLCS Algorithm.

Botian Jiang, Chunyang Wang, Yuanyuan Fu, Hai-Lin Liu, Ping Guo, Yuping Wang

2023 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech) (2023)

Abstract
In this paper, we propose a space-saving branch-and-bound method for finding all longest common subsequences of multiple sequences (MLCS). First, a more reasonable lower-bound estimation strategy is designed: when constructing a local directed acyclic graph (DAG) for the sequences, both the size and the distribution of the matching-point coordinates are considered when choosing which nodes to retain in each layer, so that the real common subsequence obtained is as long as possible. Second, a simpler upper-bound estimation strategy is devised: a new concept, the “boundary point”, is defined, and a new upper-bound formula is derived from its properties. The upper bound on the length of any branch through the current node can thus be estimated quickly, and the node is added to the graph only when this upper bound is not less than the lower bound. In addition, a storage-saving graph model is designed that retains only the nodes with in-degree 0 at any time, with each node storing the set of strings corresponding to all paths from the source to that node. Comparison experiments against three state-of-the-art algorithms on a standard biological data set indicate that the proposed algorithm is more effective and efficient.
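The pruning rule and the storage model described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the boundary-point formula or the node-retention rule, so the sketch substitutes a loose upper bound (path length so far plus the shortest remaining suffix) and a simple greedy lower bound (always take the successor with the smallest coordinate sum). The names `mlcs`, `successors`, `upper_bound`, and `greedy_lower_bound` are hypothetical, and the layer-by-layer `frontier` dictionary only approximates the idea of keeping nodes with in-degree 0. At least one input sequence is assumed.

```python
from bisect import bisect_right


def build_positions(seqs):
    """For each sequence, map every character to the sorted list of its indices."""
    tables = []
    for s in seqs:
        table = {}
        for i, ch in enumerate(s):
            table.setdefault(ch, []).append(i)
        tables.append(table)
    return tables


def successors(point, tables, alphabet):
    """Yield (char, next_point): the earliest match of char strictly after point."""
    for ch in alphabet:
        nxt = []
        for p, table in zip(point, tables):
            idxs = table.get(ch, [])
            k = bisect_right(idxs, p)  # first occurrence with index > p
            if k == len(idxs):
                break  # char no longer occurs in this sequence
            nxt.append(idxs[k])
        else:
            yield ch, tuple(nxt)


def upper_bound(depth, point, seqs):
    """Assumed stand-in bound: matches so far plus the shortest remaining suffix."""
    return depth + min(len(s) - 1 - p for s, p in zip(seqs, point))


def greedy_lower_bound(seqs, tables, alphabet):
    """Length of one real common subsequence found greedily (assumed heuristic)."""
    point, length = tuple(-1 for _ in seqs), 0
    while True:
        candidates = [q for _, q in successors(point, tables, alphabet)]
        if not candidates:
            return length
        point = min(candidates, key=sum)
        length += 1


def mlcs(seqs):
    """Return all longest common subsequences of seqs as a sorted list."""
    alphabet = sorted(set.intersection(*(set(s) for s in seqs)))
    tables = build_positions(seqs)
    lower = greedy_lower_bound(seqs, tables, alphabet)
    # Only the current layer is kept; each node stores the strings of all
    # surviving paths from the source to it.
    frontier = {tuple(-1 for _ in seqs): {""}}
    depth = 0
    while True:
        next_layer = {}
        for point, strings in frontier.items():
            for ch, q in successors(point, tables, alphabet):
                # Add the node only if its upper bound is not below the lower bound.
                if upper_bound(depth + 1, q, seqs) < lower:
                    continue
                next_layer.setdefault(q, set()).update(s + ch for s in strings)
        if not next_layer:
            break
        frontier = next_layer
        depth += 1
    return sorted(set().union(*frontier.values()))


print(mlcs(["ABCBDAB", "BDCABA"]))  # ['BCAB', 'BCBA', 'BDAB']
```

Because the lower bound is the length of a real common subsequence, it never exceeds the true MLCS length, so no node on a longest path is pruned. A tighter upper bound prunes more nodes without changing the result.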
Keywords
MLCS,data mining,DAG,sequence mining,graph optimization