Contextual Pattern Matching in Less Space

2023 Data Compression Conference (DCC)(2023)

Abstract
We revisit the Contextual Pattern Matching Problem, defined as follows: preprocess a text $T[1,n]$ so that, given a query consisting of a string $P$ and a length $\ell$, the occurrences of all distinct strings $XPY$ with $|X|=|Y|=\ell$ can be reported. This problem was introduced by Navarro, who presented an $O(\overline{r}\log(n/\overline{r}))$ space data structure, where $\overline{r}$ is the maximum of the number of runs in the BWT of the text $T[1,n]$ and the number of runs in the BWT of its reverse. His solution reports all $c$ contextual occurrences in $O(|P|+c\log n)$ time. However, the only known bound on $\overline{r}$ is $\overline{r}=O(r\log^{2}n)$, where $r$ is the number of runs in the BWT of $T$, making it desirable to avoid structures whose space depends on $\overline{r}$. We demonstrate that this is possible without a significant sacrifice in query time by providing an $O(r\log(n/r))$ space solution that answers queries in $O(|P|+c\log \ell\cdot\log(n/r))$ time.
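To make the problem statement concrete, the following is a naive brute-force sketch of a contextual query. It is not the compressed data structure from the paper or from Navarro's work: it scans the text directly and uses space and time linear in the text. The function name `contextual_occurrences`, the parameter name `ell` for the context length, the 0-based positions, and the choice to keep one witness position per distinct context are all assumptions made for illustration.

```python
def contextual_occurrences(text: str, pattern: str, ell: int) -> dict[str, int]:
    """Brute-force contextual pattern matching (illustrative only).

    Returns a dict mapping each distinct string X + pattern + Y, with
    len(X) == len(Y) == ell, to the 0-based start of one occurrence of
    pattern in text that has that context. Occurrences too close to
    either end of the text to have a full context are skipped.
    """
    found: dict[str, int] = {}
    m = len(pattern)
    start = text.find(pattern)
    while start != -1:
        left = start - ell          # where X would begin
        right = start + m + ell     # one past where Y would end
        if left >= 0 and right <= len(text):
            # Keep only the first witness for each distinct context.
            found.setdefault(text[left:right], start)
        # Advance by one so overlapping occurrences are also seen.
        start = text.find(pattern, start + 1)
    return found
```

For example, in the text `abcabcabc` the pattern `b` occurs three times, but with `ell = 1` every occurrence has the same context `abc`, so a single contextual occurrence is reported. This is the sense in which the number of contextual occurrences $c$ can be much smaller than the number of plain occurrences.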
Keywords
Pattern Matching, Compact Data Structures, Suffix Trees, Contextual Pattern Matching