Contextual Pattern Matching in Less Space

Paniz Abedin, Oliver Chubet, Daniel Gibney, Sharma V. Thankachan

2023 Data Compression Conference (DCC), 2023

Abstract

We revisit the Contextual Pattern Matching Problem, defined as follows: preprocess a text $T[1,n]$ so that, given a query consisting of a string $P$ and a length $\ell$, the occurrences of all distinct strings $XPY$ with $|X|=|Y|=\ell$ can be reported. This problem was introduced by Navarro, who presented an $O(\overline{r}\log(n/\overline{r}))$ space data structure, where $\overline{r}$ is the maximum of the number of runs in the BWT of the text $T[1,n]$ and the number of runs in the BWT of its reverse. His solution reports all $c$ contextual occurrences in $O(|P|+c\log n)$ time. However, the only known bound on $\overline{r}$ is $\overline{r}=O(r\log^{2}n)$, where $r$ is the number of runs in the BWT of $T$, making it desirable to avoid structures whose space depends on $\overline{r}$. We demonstrate that this is possible without a significant sacrifice in query time by providing an $O(r\log(n/r))$ space solution that answers queries in $O(|P|+c\log\ell\cdot\log(n/r))$ time.
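To make the problem statement concrete, the following is a minimal brute-force sketch of what a query is meant to report. It is not the data structure from the paper and uses no BWT machinery; it scans the text directly. The function name `contextual_occurrences`, the parameter name `ell` for the context length, the 0-based positions, and the choice to skip occurrences that lack a full context on either side are all assumptions made for illustration.

```python
from collections import defaultdict


def contextual_occurrences(text, pattern, ell):
    """Naive baseline: group occurrences of `pattern` by their context.

    Returns a dict mapping each distinct string X + pattern + Y, with
    len(X) == len(Y) == ell, to the 0-based start positions of `pattern`
    inside `text`. Occurrences too close to either end of the text to
    have a full context are skipped (an assumption of this sketch).
    """
    groups = defaultdict(list)
    m = len(pattern)
    start = text.find(pattern)
    while start != -1:
        lo, hi = start - ell, start + m + ell
        if lo >= 0 and hi <= len(text):
            groups[text[lo:hi]].append(start)
        # Advance by one position so overlapping occurrences are found too.
        start = text.find(pattern, start + 1)
    return dict(groups)
```

For example, `contextual_occurrences("xabyxabyzabz", "ab", 1)` returns `{"xaby": [1, 5], "zabz": [9]}`: three occurrences of the pattern, but only two distinct contextual strings. The scan takes time proportional to the text length per query, which is the cost a compressed index of the kind described in the abstract is designed to avoid.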

Keywords

Pattern Matching, Compact Data Structures, Suffix Trees, Contextual Pattern Matching
