Fuzzing llama.cpp

This guide walks through a complete fuzzing campaign targeting llama.cpp's gguf_init_from_file() function, from setup to CVE-ready reports.

Prerequisites

  • Crucible built (make build)
  • clang with sanitizer support
  • llama.cpp source code

Step 1: Get llama.cpp

git clone https://github.com/ggml-org/llama.cpp.git ~/src/llama.cpp
cd ~/src/llama.cpp

Testing against known-vulnerable versions

To reproduce known CVEs, pin to a vulnerable release:

git checkout b3561  # CVE-2024-23496

Step 2: Build the libFuzzer Harness

make harness-libfuzzer LLAMA_CPP=~/src/llama.cpp

This compiles harness/libfuzzer/crucible-libfuzzer with:

  • -fsanitize=fuzzer — libFuzzer instrumentation
  • -fsanitize=address — AddressSanitizer (detects heap/stack overflows)
  • -fsanitize=undefined — UndefinedBehaviorSanitizer (integer overflows, etc.)
AFL++ harness

make harness-afl LLAMA_CPP=~/src/llama.cpp

Requires afl-clang-fast from an AFL++ installation.

Go native fuzzing

No separate compilation step; this uses go test -fuzz:

make fuzz-go

Step 3: Generate Seed Corpus

make generate

This produces ~50 structural seeds and mutated variants in corpus/generated/:

  • Minimal valid GGUF files
  • One seed per metadata value type
  • One seed per tensor quantization type
  • Edge-case dimensions and alignments
  • Mutated variants with 1-3 structure-aware mutations each
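To see what the most minimal seed looks like: per the GGUF specification, a file with zero tensors and zero metadata entries is just a 24-byte header (magic, version, and two counts). The script below is an illustrative sketch, not Crucible's actual generator:

```python
import struct

def minimal_gguf() -> bytes:
    """Build the smallest valid GGUF file: header only, no tensors, no metadata."""
    magic = b"GGUF"                       # 4-byte magic
    version = struct.pack("<I", 3)        # uint32 LE, current spec version
    tensor_count = struct.pack("<Q", 0)   # uint64 LE tensor count
    kv_count = struct.pack("<Q", 0)       # uint64 LE metadata key/value count
    return magic + version + tensor_count + kv_count

if __name__ == "__main__":
    with open("minimal.gguf", "wb") as f:
        f.write(minimal_gguf())
```

Every other generated seed builds on this header by appending metadata key/value pairs or tensor descriptors.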

Optional: Add real models

make download-corpus

Downloads small quantized models from HuggingFace for realistic structural variety.

Step 4: Run the Campaign

With Crucible's orchestrator driving libFuzzer:

crucible run \
  --harness ./harness/libfuzzer/crucible-libfuzzer \
  --corpus ./corpus \
  --output ./crashes \
  --jobs 8 \
  --timeout 30s \
  --max-len 10485760

Or with AFL++ directly:

afl-fuzz \
  -i corpus/ \
  -o crashes/afl \
  -m none \
  -t 30000 \
  -x corpus/gguf.dict \
  -- ./harness/aflpp/crucible-afl

Convenience targets combine corpus generation and launch:

make run-libfuzzer  # Generates corpus + starts libFuzzer
make run-afl        # Generates corpus + starts AFL++
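The -x flag above feeds AFL++ a token dictionary so its mutators can splice in format keywords. A few illustrative entries in AFL++ dictionary syntax (the contents of Crucible's actual corpus/gguf.dict may differ):

```
# GGUF magic and spec version
magic_gguf="GGUF"
version_v3="\x03\x00\x00\x00"
# A metadata key commonly present in real models
key_arch="general.architecture"
```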

What to expect

Metric               Target
Executions/sec       1,000+ (libFuzzer), 500+ (AFL++)
Time to first crash  < 1 hour on known-vulnerable versions
Corpus growth        ~100 new paths/hour initially

Resource usage

With 8 jobs, expect 8 CPU cores at 100% and 2-4 GB of RAM. Because gguf_init_from_file() takes a file path rather than an in-memory buffer, each fuzzer instance writes its current input to a temporary file under /tmp/ before every parse attempt.
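The temp-file approach exists because gguf_init_from_file() consumes a path, not a memory buffer, so each fuzz input must be materialized on disk before the call. A Python sketch of the pattern, with a stand-in checker in place of the real C parser:

```python
import os
import tempfile

def parse_gguf_path(path: str) -> bool:
    """Stand-in for gguf_init_from_file(); a real harness calls into llama.cpp."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

def fuzz_one_input(data: bytes) -> bool:
    # Materialize the input in the default temp dir (typically /tmp).
    fd, path = tempfile.mkstemp(prefix="crucible-", suffix=".gguf")
    try:
        with os.fdopen(fd, "wb") as f:   # closes fd when done
            f.write(data)
        return parse_gguf_path(path)
    finally:
        os.unlink(path)                  # avoid filling the temp dir
```

The per-execution file I/O is a large part of why file-based targets run slower than buffer-based ones.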

Step 5: Monitor Progress

crucible status

Watch for crash artifacts appearing in ./crashes/. libFuzzer names them crash-<hash> or oom-<hash>.

Step 6: Triage Crashes

crucible triage --crashes ./crashes --output ./reports \
  --harness ./harness/libfuzzer/crucible-libfuzzer \
  --replay-timeout 30s
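Deduplication during triage is typically done by hashing the top frames of each crash's stack trace, so dozens of crashing inputs collapse into a handful of unique bugs. A conceptual sketch of that technique (hypothetical helper, not Crucible's actual algorithm):

```python
import hashlib
import re

def crash_bucket(asan_report: str, top_n: int = 3) -> str:
    """Bucket a crash by a short hash of its top N stack frames.

    ASan frames look like: "#0 0x4f2a1b in gguf_init_from_file gguf.c:342".
    Addresses vary run to run, so only function + file:line are hashed.
    """
    frames = re.findall(r"#\d+ 0x[0-9a-f]+ in (\S+ \S+)", asan_report)
    key = "|".join(frames[:top_n])
    return hashlib.sha256(key.encode()).hexdigest()[:8]
```

Two crashes with identical top frames but different load addresses land in the same bucket; a crash in a different function lands in a new one.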

Output:

╔══════════════════════════════════════════╗
║         CRUCIBLE CRASH TRIAGE            ║
╠══════════════════════════════════════════╣
║ Files processed: 47                      ║
║ Unique crashes:  5                       ║
╚══════════════════════════════════════════╝

  [a3f8b2c1] heap-buffer-overflow  CVSS 9.8 (Critical)
         Location: gguf.c:342 in gguf_init_from_file
         Report:   reports/a3f8b2c1d4e5f678.md

Step 7: Review Reports

Each unique crash gets a markdown report:

# Vulnerability Report: CRASH-0001

**Type:** Heap Buffer Overflow
**CVSS Score:** 9.8 (Critical)
**Location:** gguf_init_from_file (gguf.c:342)
**Target:** gguf
**Reproducer:** crashes/crash-a3f8b2c1

Before publishing

Follow the Responsible Disclosure workflow. Report to llama.cpp maintainers via GitHub Security Advisory and allow 90 days for a fix before public disclosure.

Target Build Matrix

Target              Command                  What's fuzzed
llama.cpp (latest)  make harness-libfuzzer   gguf_init_from_file() + getters
llama.cpp (b3561)   Pin version, same build  Regression baseline
Ollama              make -C targets/ollama   Ollama's vendored ggml