A De-Compositional Approach to Regular Expression Matching for Network Security

IEEE/ACM Transactions on Networking(2019)

引用 5|浏览13
暂无评分
摘要
Regular Expression (RegEx) matching is the industry standard for Deep Packet Inspection (DPI) because RegExes are significantly more expressive than strings. To achieve high matching speed, we need to convert the RegExes to Deterministic Finite State Automata (DFA). However, DFA has the state explosion problem, that is, the number of DFA states and transitions can be exponential with the number of RegExes. Much work has addressed the DFA state explosion problem; however, none has met all the requirements of fast and automated construction, small memory image, and high matching speed. In this paper, we propose a decompositional approach, with fast and automated construction, small memory image, and high matching speed, to DFA state explosion. The first key idea is to decompose a complex RegEx that cause exponential state increases into a set of simpler RegExes that do not cause exponential state increases, where any character string that matches the complex RegEx also matches all the RegExes in the set of simpler RegExes; that is, the set of strings that match the complex RegEx is a subset of strings that match the set of simpler RegExes. The second key idea is to use a stateful post-processing engine to filter the matches that are actually the matches of the complex RegEx. Given an input string for matching, instead of using the large DFA constructed from the original complex RegEx to perform the matching, we first use the small DFA constructed from the set of simpler RegExes to perform the matching, and then, if the small DFA reports a match, we use the post-processing engine to determine whether it is a true match to the original complex RegEx. Because the pre-processing is simple, automaton construction can be automated and fast, and because most on-line processing is done by a DFA, its matching speed is close to that of a DFA alone. Our experimental results show that our decompositional approach achieves orders of magnitude faster DFA construction (in terms of seconds instead of minutes), 30 times smaller memory image, and 43% faster matching speeds, than state-of-the-art software based RegEx matching algorithms.
更多
查看译文
关键词
Engines,Automata,Security,Standards,Pattern matching,Explosions,Random access memory
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要