Dynamic-MLCS: Fast searching for dynamic multiple longest common subsequences in sequence stream data

Yuanyuan Fu,Chunyang Wang, Jixin Zhu, Qun Zhang,Yiuming Cheung,Yuping Wang

Knowledge-Based Systems(2024)

引用 0|浏览0
暂无评分
摘要
In recent years, data stream mining has become a popular research topic, wherein the dynamic multiple longest common subsequence problem for sequence streams (i.e., D-MLCS) is an important issue with a vast range of applications. However, the existing methods are only for static multiple longest common subsequence problem (i.e., S-MLCS) and there are no algorithms for the D-MLCS problem. Using the existing algorithms for the S-MLCS problem to solve the D-MLCS problem often leads to an overload of the main memory and is extremely time consuming as all the sequence segments should be stored into the main memory when the new segments flow in, and all updated problems should be resolved. To overcome this shortcoming, we designed a scheme for constructing dynamic Directed Acyclic Graphs (DAGs). It only needs to update the current DAG post each sequence segment flows in, without constructing the DAG from the beginning. We propose a novel algorithm for the D-MLCS problem called Dynamic-MLCS, which only needs to store a part of the key information of each streaming sequence segments, instead of storing all streaming sequence segments and scanning them repeatedly. Theoretical analyses demonstrate that the time and space complexities of the proposed algorithm are only relevant and linear to the number of the incomplete points and newly generated nodes, and the experiments conducted on the datasets of sequence streams indicate that the proposed Dynamic-MLCS is efficient and performs better than the compared state-of-the-art algorithms.
更多
查看译文
关键词
Multiple Longest Common Subsequences (MLCS),Sequence stream,Directed Acyclic Graphs (DAG),Incomplete point,Complete point
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要