Quick Start
This guide walks through a complete fuzzing workflow — from seed generation to crash triage — in under five minutes.
What you will do
- Generate a seed corpus of valid GGUF files
- Build a native harness linked against llama.cpp
- Run a fuzzing campaign
- Triage any crashes that are found
1. Generate a seed corpus
Use crucible-gen to create structurally valid GGUF files and mutated variants:
This produces 50 seed files in corpus/generated/. Each seed is a well-formed GGUF binary that exercises different combinations of:
- Header versions and metadata key/value types
- Tensor descriptors with varying dtypes and dimensions
- Alignment padding and offset layouts
The generator also emits mutated variants that deliberately bend structural rules (truncated tensors, invalid enum values, overlapping offsets) to give the fuzzer a head start on interesting code paths.
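To make "well-formed seed" and "mutated variant" concrete, here is a minimal sketch of a header-only GGUF file built by hand. It is not how crucible-gen works internally; the function names and the choice of a single general.alignment entry are assumptions made for illustration. The layout follows the public GGUF format: a 4-byte magic, a uint32 version, a uint64 tensor count, a uint64 metadata count, then the key/value entries.

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_UINT32 = 4  # value-type id for uint32 in the GGUF format


def gguf_string(text: str) -> bytes:
    """Encode a GGUF string: uint64 length followed by the UTF-8 bytes."""
    raw = text.encode("utf-8")
    return struct.pack("<Q", len(raw)) + raw


def build_seed(version: int = 3, alignment: int = 32,
               value_type: int = GGUF_TYPE_UINT32) -> bytes:
    """Build a header-only GGUF file: zero tensors, one metadata entry."""
    # Little-endian header fields: version, tensor_count, metadata_kv_count.
    header = GGUF_MAGIC + struct.pack("<IQQ", version, 0, 1)
    # One key/value entry: key string, value-type id, then the value itself.
    kv = (gguf_string("general.alignment")
          + struct.pack("<I", value_type)
          + struct.pack("<I", alignment))
    return header + kv


valid_seed = build_seed()
# Mutated variant: identical layout, but the value-type field holds an
# undefined enum value, which a parser should reject rather than trust.
mutated_seed = build_seed(value_type=0xFFFF)
```

Writing both byte strings to files in a corpus directory gives the fuzzer one input that parses cleanly and one that exercises the parser's error handling for unknown value types.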
More seeds, better coverage
A count of 50 is enough to get started. For longer campaigns, consider --count 200 or higher to widen the initial input space.
2. Download real models (optional)
For additional coverage, pull small real-world GGUF models from HuggingFace:
Downloaded models are saved to corpus/real/. These complement the generated seeds by providing production layouts that the generator may not synthesize on its own.
Warning
Downloaded models can be several hundred megabytes. The Makefile target selects the smallest quantised variants available, but make sure you have adequate disk space.
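Interrupted or redirected downloads sometimes leave HTML error pages or partial files behind. The sketch below is one way to spot files that do not start with the GGUF magic bytes; the helper names are hypothetical and this check is not part of the project's tooling.

```python
from pathlib import Path


def looks_like_gguf(path: Path) -> bool:
    """Return True if the file starts with the 4-byte GGUF magic."""
    with path.open("rb") as fh:
        return fh.read(4) == b"GGUF"


def non_gguf_files(directory: Path) -> list[Path]:
    """List top-level files in a directory that lack the GGUF magic."""
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and not looks_like_gguf(p)
    )
```

For example, non_gguf_files(Path("corpus/real")) returns the paths worth re-downloading or deleting. The check only reads four bytes per file, so it stays fast even for large models.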
3. Build a harness
Build a fuzzing harness linked against llama.cpp. Choose the engine that matches your setup:
The libFuzzer build produces crucible-libfuzzer, a single binary with built-in coverage feedback.
The AFL++ build produces crucible-afl, an instrumented binary for use with afl-fuzz.
4. Run a campaign
Launch a fuzzing campaign with crucible run:
| Flag | Description |
|---|---|
| --harness | Path to the compiled fuzzer binary |
| --corpus | Directory (or directories) containing seed inputs |
| --output | Where to write crashing inputs |
| --jobs | Number of parallel fuzzing workers |
| --engine | Fuzzing engine: libfuzzer (default) or afl |
| --dict | Path to a fuzzer dictionary (auto-detected if omitted) |
Corpus directory
The --corpus flag passes the directory directly to the fuzzing engine. libFuzzer reads top-level files in the given directory; it does not recurse into subdirectories. Point --corpus at a flat directory containing your seed files (e.g., corpus/minimal/).
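If your seeds are spread across nested directories, one option is to copy them into a single flat directory first. The sketch below is an illustration only: flatten_corpus is a hypothetical helper, and naming files by content hash is a design choice made here, not something the project prescribes.

```python
import hashlib
import shutil
from pathlib import Path


def flatten_corpus(sources: list[Path], dest: Path) -> int:
    """Copy every file under the source trees into one flat directory.

    Files are named by the SHA-1 of their contents, so duplicates collapse
    and same-named files from different directories cannot collide.
    Returns the number of files copied.
    """
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for root in sources:
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            # Reads the whole file into memory; fine for small seeds.
            digest = hashlib.sha1(path.read_bytes()).hexdigest()
            target = dest / digest
            if not target.exists():
                shutil.copyfile(path, target)
                copied += 1
    return copied
```

Keep the destination outside the source trees so that copied files are not picked up again on a later run. For multi-hundred-megabyte models, hashing in chunks would be kinder to memory than read_bytes.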
5. Check status
While a campaign is running, open a second terminal and query progress:
Sample output:
Engine telemetry
For detailed metrics (executions, exec/s, corpus size), check the fuzzing engine's own output — libFuzzer prints live stats to stderr, and AFL++ writes to <output>/main/fuzzer_stats.
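AFL++ writes fuzzer_stats as plain text, one "key : value" pair per line, so it is easy to read from a script. The parser below is a minimal sketch; the sample values are made up for illustration, and the exact field names can differ between AFL++ versions.

```python
def parse_fuzzer_stats(text: str) -> dict[str, str]:
    """Parse AFL++ fuzzer_stats content into a dict of raw string values."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon only, since values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


# Illustrative sample, not output from a real campaign.
sample = """\
execs_done        : 1482032
execs_per_sec     : 2117.45
corpus_count      : 312
"""

stats = parse_fuzzer_stats(sample)
executions = int(stats["execs_done"])
speed = float(stats["execs_per_sec"])
```

In practice you would pass the contents of the fuzzer_stats file under your output directory instead of the inline sample.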
6. Triage crashes
Once the campaign finishes (or you stop it), analyze the crashing inputs:
This replays binary crash files through the harness with sanitizers enabled, deduplicates by stack hash, classifies each root cause, and writes a report for each unique finding.
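Deduplicating by stack hash generally means hashing the function names of the top few frames, so that the same bug reached through different inputs lands in one bucket. The sketch below shows the idea under stated assumptions: the frame depth, the regular expression, and the SHA-256 truncation are choices made for illustration, and crucible's actual hashing scheme may differ.

```python
import hashlib
import re

# Matches sanitizer-style frames such as
#   "#0 0x4f5e21 in gguf_load_tensor_data gguf.c:482"
# and the shorter form "#0 gguf_load_tensor_data (gguf.c:482)".
FRAME_RE = re.compile(r"^\s*#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)")


def stack_hash(trace: str, depth: int = 3) -> str:
    """Hash the function names of the top `depth` frames of a stack trace."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
    top = "\n".join(frames[:depth])
    return hashlib.sha256(top.encode("utf-8")).hexdigest()[:8]


def dedupe(traces: dict[str, str]) -> dict[str, list[str]]:
    """Group crash names by stack hash; each key is one unique finding."""
    buckets: dict[str, list[str]] = {}
    for name, trace in traces.items():
        buckets.setdefault(stack_hash(trace), []).append(name)
    return buckets
```

Addresses are deliberately excluded from the hash because they change between builds and with address space layout randomisation, while function names stay stable.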
Text-log triage
If you are using the Go native harness (which produces sanitizer text logs rather than binary reproducers), use make triage instead — no --harness is needed.
7. Read reports
Reports are written as individual files under ./reports/:
Each report contains:
# Crash Report: heap-buffer-overflow
## Summary
Type: heap-buffer-overflow
Stack hash: a4f9c31e
Target: gguf
Harness: ./crucible-libfuzzer
Reproducer: crashes/crash-abc123def456
## Stack Trace
#0 gguf_load_tensor_data (gguf.c:482)
#1 gguf_init_from_file (gguf.c:310)
#2 LLVMFuzzerTestOneInput (harness.c:24)
## Reproduction
./crucible-libfuzzer crashes/crash-abc123def456
## Notes
The input contains a tensor descriptor whose offset extends past the
end of the data section, triggering an out-of-bounds read during
tensor loading.
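The class of bug described in the sample notes comes down to a missing bounds check on tensor descriptors. The sketch below illustrates that check in isolation; TensorDescriptor and out_of_bounds are hypothetical names, not types from llama.cpp or crucible.

```python
from dataclasses import dataclass


@dataclass
class TensorDescriptor:
    name: str
    offset: int  # byte offset relative to the start of the data section
    size: int    # byte length of the tensor payload


def out_of_bounds(tensors: list[TensorDescriptor], data_section_size: int) -> list[str]:
    """Return the names of tensors whose payload would extend past the data section."""
    return [
        t.name
        for t in tensors
        if t.offset < 0 or t.size < 0 or t.offset + t.size > data_section_size
    ]
```

Python integers do not overflow, so the sum is safe here. An equivalent check in C also has to guard against offset + size wrapping around before the comparison.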
What next?
- Minimise the reproducer: crucible triage --minimize --crashes ./crashes --harness ./crucible-libfuzzer
- File upstream bugs with the reproducer and stack trace
- Add the minimised input to corpus/minimal/ so future runs detect regressions
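Once minimised reproducers live in corpus/minimal/, a small script can replay them after each upstream update. The sketch below assumes the harness accepts an input file path as its last argument and exits non-zero on a crash, as in the reproduction command from the sample report; replay is a hypothetical helper, not a crucible command.

```python
import subprocess
from pathlib import Path


def replay(harness_cmd: list[str], corpus_dir: Path, timeout: float = 30.0) -> list[Path]:
    """Run the harness once per corpus file; return inputs that crash or hang."""
    failures = []
    for path in sorted(corpus_dir.iterdir()):
        if not path.is_file():
            continue
        try:
            result = subprocess.run(
                harness_cmd + [str(path)],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            failed = result.returncode != 0
        except subprocess.TimeoutExpired:
            # Treat a hang as a failure so it is not silently ignored.
            failed = True
        if failed:
            failures.append(path)
    return failures
```

A call such as replay(["./crucible-libfuzzer"], Path("corpus/minimal")) returns an empty list when every known reproducer is handled cleanly, which makes it easy to wire into a CI job.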
Next: Configuration — Makefile targets, environment variables, and directory layout.