LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
CoRR (2024)
Abstract
Greybox fuzzing has achieved success in revealing bugs and vulnerabilities in
programs. However, randomized mutation strategies limit a fuzzer's performance
on structured data. Specialized fuzzers can handle complex structured data, but
they require additional grammar-engineering effort and suffer from low
throughput.
In this paper, we explore the potential of utilizing Large Language Models
(LLMs) to enhance greybox fuzzing for structured data. We leverage the LLM's
pre-trained knowledge of data conversion and formats to generate new valid
inputs, and we further fine-tune it on paired mutation seeds so that it learns
structured formats and mutation strategies effectively (see the sketch below).
Our LLM-based fuzzer, LLAMAFUZZ, integrates the LLM's ability to understand and
mutate structured data into the fuzzing process. We conduct experiments on the
standard bug-based benchmark Magma and a
wide variety of real-world programs. LLAMAFUZZ outperforms our top competitor
by 41 bugs on average. We also identified 47 unique bugs across all trials.
Moreover, LLAMAFUZZ demonstrated consistent performance in both triggering and
reaching bugs. Compared to AFL++, LLAMAFUZZ achieved 27.19% more branches on
the real-world program sets on average. We also present a case study explaining
how LLMs enhance the fuzzing process in terms of code coverage.
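
To make the mutation step concrete, the following is a minimal sketch of how a
fine-tuned LLM could be queried to mutate a structured seed inside a greybox
fuzzing loop. All names here (query_llm, llm_mutate, the hex serialization, and
the byte-flip fallback) are illustrative assumptions, not the paper's actual
interface.

# Hypothetical sketch of an LLM-guided mutation step (not LLAMAFUZZ's
# actual interface): query_llm stands in for a call to the fine-tuned
# model and is assumed to map a prompt string to a completion string.
import binascii
import random

def seed_to_hex(seed: bytes) -> str:
    # Serialize the binary seed as hex so the LLM can read and emit it.
    return binascii.hexlify(seed).decode("ascii")

def hex_to_seed(text: str):
    # Parse the LLM's hex output back into bytes; reject invalid output.
    try:
        return binascii.unhexlify("".join(text.split()))
    except binascii.Error:
        return None

def llm_mutate(seed: bytes, query_llm) -> bytes:
    # Ask the fine-tuned LLM for a structure-aware mutation of the seed.
    prompt = (
        "Mutate the following seed while keeping its file format valid.\n"
        "Seed (hex): " + seed_to_hex(seed) + "\n"
        "Mutated seed (hex):"
    )
    mutated = hex_to_seed(query_llm(prompt))
    if mutated is None or mutated == seed:
        # Fallback: a conventional random byte flip, as a plain greybox
        # fuzzer would perform when the LLM output is unusable.
        buf = bytearray(seed)
        if buf:
            buf[random.randrange(len(buf))] ^= 0xFF
        return bytes(buf)
    return mutated

Per the abstract, the model is fine-tuned on paired mutation seeds; the hex
round-trip and the fallback above are design assumptions for illustration only.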