A novel pattern-based edit distance for automatic log parsing.

ICPR (2022)

Abstract
This work aims to infer a set of regular expressions that parse a text file, such as a system log. To this end, we propose a novel edit distance that builds on pattern matching. Edit distances are commonly used for fuzzy search and in bioinformatics, and compare two strings at the character level. As a result, they do not consider the nature of the data conveyed by the strings. To address this problem, we make the following contributions. First, we model strings at the pattern level using a dedicated data structure, called a pattern automaton. Second, we design a novel edit distance that operates at the pattern level. Third, we derive a clustering algorithm optimized for this distance. Finally, we evaluate our proposal experimentally.
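To make the contrast between character-level and pattern-level comparison concrete, here is a minimal Python sketch. It is not the paper's pattern automaton or its distance: the pattern classes (`ipv4`, `int`, `word`), the whitespace tokenization, and the helper `to_patterns` are all assumptions chosen for illustration, and the pattern-level comparison simply reuses the classic Levenshtein distance on sequences of pattern labels.

```python
import re


def levenshtein(a, b):
    """Classic edit distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical pattern classes (an assumption, not taken from the paper).
# They are tried in order, so the most specific pattern comes first.
PATTERNS = [
    ("ipv4", re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")),
    ("int", re.compile(r"\d+")),
    ("word", re.compile(r"[A-Za-z]+")),
]


def to_patterns(line):
    """Map each whitespace-separated token to the first pattern it fully matches."""
    labels = []
    for token in line.split():
        for name, rx in PATTERNS:
            if rx.fullmatch(token):
                labels.append(name)
                break
        else:
            labels.append("other")
    return labels


a = "Accepted connection from 10.0.0.1 port 22"
b = "Accepted connection from 192.168.1.77 port 51432"

char_distance = levenshtein(a, b)                               # character level
pattern_distance = levenshtein(to_patterns(a), to_patterns(b))  # pattern level
```

Under these assumptions, the two log lines differ at the character level (`char_distance` is positive) but map to the same label sequence, so `pattern_distance` is 0. This is the kind of behavior that would let lines sharing a template fall into the same cluster.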
Keywords
automatic log, edit distance, pattern-based