All You Need Is A Fuzzing Brain: An LLM-Powered System for Automated Vulnerability Detection and Patching
Abstract
An LLM-powered Cyber Reasoning System that autonomously discovered and patched security vulnerabilities in real-world open-source projects, accompanied by a public leaderboard for benchmarking LLMs on these tasks.
Our team, All You Need Is A Fuzzing Brain, was one of seven finalists in DARPA's Artificial Intelligence Cyber Challenge (AIxCC), placing fourth in the final round. During the competition, we developed a Cyber Reasoning System (CRS) that autonomously discovered 28 security vulnerabilities, including six previously unknown zero-days, in real-world open-source C and Java projects, and successfully patched 14 of them. The complete CRS is open source at https://github.com/o2lab/afc-crs-all-you-need-is-a-fuzzing-brain. This paper provides a detailed technical description of our CRS, with an emphasis on its LLM-powered components and strategies. Building on AIxCC, we further introduce a public leaderboard, derived from the AIxCC dataset, for benchmarking state-of-the-art LLMs on vulnerability detection and patching tasks. The leaderboard is available at https://o2lab.github.io/FuzzingBrain-Leaderboard/.
Community
Our team, “All You Need Is A Fuzzing Brain,” placed 4th in DARPA’s AIxCC finals with a Cyber Reasoning System (CRS) that autonomously discovered 28 vulnerabilities (including 6 zero-days) and patched 14 of them in real-world C and Java projects. This paper presents a detailed technical report of our CRS, with a focus on its LLM-integrated components and autonomous vulnerability triage and patching strategies.
💻 Code: https://github.com/o2lab/afc-crs-all-you-need-is-a-fuzzing-brain
📊 Leaderboard: https://o2lab.github.io/FuzzingBrain-Leaderboard
The following similar papers were recommended by the Semantic Scholar API:
- VulnRepairEval: An Exploit-Based Evaluation Framework for Assessing Large Language Model Vulnerability Repair Capabilities (2025)
- LibLMFuzz: LLM-Augmented Fuzz Target Generation for Black-box Libraries (2025)
- A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code (2025)
- Multi-Agent Penetration Testing AI for the Web (2025)
- AI Agentic Vulnerability Injection And Transformation with Optimized Reasoning (2025)
- Adversarial Bug Reports as a Security Risk in Language Model-Based Automated Program Repair (2025)
- Shell or Nothing: Real-World Benchmarks and Memory-Activated Agents for Automated Penetration Testing (2025)
Datasets citing this paper: 2