Adaptive Succinctness

SPIRE(2021)

Abstract
Representing a static set of integers $S$, $|S| = n$, from a finite universe $U = [1..u]$ is a fundamental task in computer science. Our concern is to represent $S$ in small space while supporting the operations of $\mathsf{rank}$ and $\mathsf{select}$ on $S$; if $S$ is viewed as its characteristic vector, the problem becomes that of representing a bit-vector, which is arguably the most fundamental building block of succinct data structures. Although there is an information-theoretic lower bound of $\mathcal{B}(n,u) = \lg \binom{u}{n}$ bits on the space needed to represent $S$, this applies to worst-case (random) sets $S$, and sets found in practical applications are compressible. We focus on the case where the elements of $S$ contain runs of $\ell > 1$ consecutive elements, a case that occurs in many practical situations. Let $\mathcal{C}^{(n)}$ denote the class of $\binom{u}{n}$ distinct sets of $n$ elements over the universe $[1..u]$. Let also $\mathcal{C}^{(n)}_{g} \subset \mathcal{C}^{(n)}$ contain the sets whose $n$ elements are arranged in $g \le n$ runs of $\ell_i \ge 1$ consecutive elements from $U$ for $i = 1, \ldots, g$, and let $\mathcal{C}^{(n)}_{g,r} \subset \mathcal{C}^{(n)}_{g}$ contain all sets that consist of $g$ runs, such that $r \le g$ of them have at least 2 elements. This paper yields the following insights and contributions related to $\mathsf{rank}$/$\mathsf{select}$ succinct data structures: We introduce new compressibility measures for sets, including $\mathcal{B}_1(g,n,u) = \lg |\mathcal{C}^{(n)}_{g}| = \lg \binom{u-n+1}{g} + \lg \binom{n-1}{g-1}$ and $\mathcal{B}_2(r,g,n,u) = \lg |\mathcal{C}^{(n)}_{g,r}| = \lg \binom{u-n+1}{g} + \lg \binom{n-g-1}{r-1} + \lg \binom{g}{r}$, such that $\mathcal{B}_2(r,g,n,u) \le \mathcal{B}_1(g,n,u) \le \mathcal{B}(n,u)$. We give data structures that use space close to the bounds $\mathcal{B}_1(g,n,u)$ and $\mathcal{B}_2(r,g,n,u)$ and support $\mathsf{rank}$ and $\mathsf{select}$ in $O(1)$ time. We provide additional measures involving entropy-coding run lengths and gaps between items, and data structures to support $\mathsf{rank}$ and $\mathsf{select}$ using space close to these measures.
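The counting behind the measures in the abstract can be checked on a small universe. The following is a minimal Python sketch, not the paper's data structure: it only evaluates the class sizes and compares them against brute-force enumeration. The function names (`runs`, `count_g`, `count_g_r`, `measures`, `brute_force_counts`) are hypothetical, and the sketch assumes that runs are maximal, that "r of them have at least 2 elements" means exactly r such runs, and that lg is the base-2 logarithm. The handling of r = 0 and n = g is also an assumption added here, since the closed form does not cover those cases directly.

```python
from collections import Counter
from itertools import combinations
from math import comb, log2


def runs(s):
    """Split a set of distinct integers into maximal runs of consecutive values."""
    out = []
    for x in sorted(s):
        if out and x == out[-1][-1] + 1:
            out[-1].append(x)
        else:
            out.append([x])
    return out


def count_g(g, n, u):
    """Number of n-subsets of [1..u] forming exactly g maximal runs."""
    return comb(u - n + 1, g) * comb(n - 1, g - 1)


def count_g_r(r, g, n, u):
    """Number of such sets in which exactly r runs have length >= 2."""
    if n == g:
        # Assumption: every run is a singleton, so only r == 0 is possible.
        return comb(u - n + 1, g) if r == 0 else 0
    if r == 0:
        # Assumption: n > g forces at least one run of length >= 2.
        return 0
    return comb(u - n + 1, g) * comb(n - g - 1, r - 1) * comb(g, r)


def measures(r, g, n, u):
    """Return (B2, B1, B) in bits; assumes the classes are non-empty."""
    return (
        log2(count_g_r(r, g, n, u)),
        log2(count_g(g, n, u)),
        log2(comb(u, n)),
    )


def brute_force_counts(n, u):
    """Tally all n-subsets of [1..u] by (number of runs, number of runs of length >= 2)."""
    tally = Counter()
    for s in combinations(range(1, u + 1), n):
        rs = runs(s)
        tally[(len(rs), sum(1 for run in rs if len(run) >= 2))] += 1
    return tally


if __name__ == "__main__":
    n, u = 4, 8
    tally = brute_force_counts(n, u)
    for (g, r), observed in sorted(tally.items()):
        print(g, r, observed, count_g_r(r, g, n, u))
    b2, b1, b = measures(2, 3, 6, 20)
    print(f"B2={b2:.2f}  B1={b1:.2f}  B={b:.2f} bits")
```

Running the script prints, for each pair (g, r) that occurs, the enumerated count next to the closed-form count, followed by the three measures for one sample parameter choice. Brute-force enumeration is only feasible for very small u; it is used here purely as a sanity check on the formulas.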
Keywords
Succinct and compressed data structures, Succinct bit vectors, Succinct rank/select data structures, Integer-set compression