Quick Start

This guide walks through a complete fuzzing workflow — from seed generation to crash triage — in under five minutes.

What you will do

Generate a seed corpus of valid GGUF files
Build a native harness linked against llama.cpp
Run a fuzzing campaign
Triage any crashes that are found

1. Generate a seed corpus

Use crucible-gen to create structurally valid GGUF files and mutated variants:

crucible-gen --output corpus/generated --count 50

This produces 50 seed files in corpus/generated/. Each seed is a well-formed GGUF binary that exercises different combinations of:

Header versions and metadata key/value types
Tensor descriptors with varying dtypes and dimensions
Alignment padding and offset layouts

The generator also emits mutated variants that deliberately bend structural rules (truncated tensors, invalid enum values, overlapping offsets) to give the fuzzer a head start on interesting code paths.
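To make "well-formed GGUF binary" concrete, the sketch below writes the smallest possible header by hand. It assumes the publicly documented GGUF layout (the 4-byte magic GGUF, then a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count) and format version 3. The helper name write_min_gguf is hypothetical, and this is an illustration of the file layout, not a description of how crucible-gen is implemented.

```shell
# Sketch: write a 24-byte GGUF header with no tensors and no metadata.
# Assumes GGUF version 3 and little-endian byte order; write_min_gguf is a
# hypothetical helper, not part of Crucible.
write_min_gguf() {
  {
    printf 'GGUF'                              # magic bytes
    printf '\003\000\000\000'                  # version = 3 (uint32)
    printf '\000\000\000\000\000\000\000\000'  # tensor count = 0 (uint64)
    printf '\000\000\000\000\000\000\000\000'  # metadata KV count = 0 (uint64)
  } > "$1"
}

# Demo: write the header to a scratch file, dump it as hex bytes, clean up.
demo_dir=$(mktemp -d)
write_min_gguf "$demo_dir/min-header.gguf"
od -A d -t x1 "$demo_dir/min-header.gguf"
rm -rf "$demo_dir"
```

A file like this is only a header. The seeds produced by crucible-gen also carry metadata entries and tensor descriptors, which is where most of the interesting parser paths are.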

More seeds, better coverage

A count of 50 is enough to get started. For longer campaigns, consider --count 200 or higher to widen the initial input space.

2. Download real models (optional)

For additional coverage, pull small real-world GGUF models from Hugging Face:

make download-corpus

Downloaded models are saved to corpus/real/. These complement the generated seeds by providing production layouts that the generator may not synthesize on its own.

Warning

Downloaded models can be several hundred megabytes. The Makefile target selects the smallest quantised variants available, but make sure you have adequate disk space.
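Before adding downloaded files to a campaign, a quick sanity check can catch anything that is not GGUF at all, such as an error page saved under a .gguf name. The sketch below tests only the 4-byte GGUF magic and nothing deeper. is_gguf is a hypothetical helper, and the loop assumes the corpus/real/ location described above.

```shell
# Sketch: check that a file starts with the ASCII magic "GGUF".
# is_gguf is a hypothetical helper, not a Crucible command.
is_gguf() {
  # Compare the first four bytes as hex: G=47 G=47 U=55 F=46.
  [ "$(head -c 4 "$1" 2>/dev/null | od -A n -t x1 | tr -d ' \n')" = "47475546" ]
}

# Report regular files in the download directory that fail the check.
check_dir=corpus/real
for f in "$check_dir"/*; do
  [ -f "$f" ] || continue
  is_gguf "$f" || echo "not GGUF: $f"
done
```

Passing this check says nothing about whether the rest of the file is valid; it only filters out obvious download failures.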

3. Build a harness

Build a fuzzing harness linked against llama.cpp. Choose the engine that matches your setup:

libFuzzer:

make harness-libfuzzer LLAMA_CPP=/path/to/llama.cpp

Produces crucible-libfuzzer, a single binary with built-in coverage feedback.

AFL++:

make harness-afl LLAMA_CPP=/path/to/llama.cpp

Produces crucible-afl, an instrumented binary for use with afl-fuzz.

Go native:

No harness build needed. The Go native fuzzer runs directly:

make fuzz-go

This uses go test -fuzz under the hood and is useful for testing Crucible's own Go parsing and mutation logic.

4. Run a campaign

Launch a fuzzing campaign with crucible run:

libFuzzer:

crucible run \
  --harness ./crucible-libfuzzer \
  --corpus ./corpus/generated \
  --output ./crashes \
  --dict corpus/gguf.dict \
  --jobs 8

AFL++:

crucible run \
  --harness ./crucible-afl \
  --corpus ./corpus/generated \
  --output ./crashes \
  --dict corpus/gguf.dict \
  --jobs 8 \
  --engine afl

Go native:

make fuzz-go FUZZ_TIMEOUT=300

| Flag | Description |
|------|-------------|
| `--harness` | Path to the compiled fuzzer binary |
| `--corpus` | Directory (or directories) containing seed inputs |
| `--output` | Where to write crashing inputs |
| `--jobs` | Number of parallel fuzzing workers |
| `--engine` | Fuzzing engine: `libfuzzer` (default) or `afl` |
| `--dict` | Path to a fuzzer dictionary (auto-detected if omitted) |
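A common starting point for --jobs is one worker per available CPU core. The sketch below derives that number with getconf and prints the resulting libFuzzer command rather than running it. pick_jobs is a hypothetical helper, and the one-worker-per-core rule is a general heuristic, not a documented Crucible recommendation.

```shell
# Sketch: derive a worker count from the number of online CPUs.
# pick_jobs is a hypothetical helper, not a Crucible command.
pick_jobs() {
  # Fall back to 1 if getconf fails or returns something non-numeric.
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
  case "$n" in
    ''|*[!0-9]*) n=1 ;;
  esac
  echo "$n"
}

# Print (do not execute) the libFuzzer command with the derived value.
echo "crucible run --harness ./crucible-libfuzzer --corpus ./corpus/generated --output ./crashes --jobs $(pick_jobs)"
```

On a shared machine it may be kinder to leave a core or two free and pass a smaller number by hand.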

Corpus directory

The --corpus flag passes the directory directly to the fuzzing engine. libFuzzer reads top-level files in the given directory; it does not recurse into subdirectories. Point --corpus at a flat directory containing your seed files (e.g., corpus/minimal/).
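If your seeds live in nested subdirectories, one way to satisfy the flat-directory requirement is to copy them into a single directory first. The sketch below names each copy after a checksum of its content, so same-named files from different subdirectories do not overwrite each other. flatten_corpus and the destination name are hypothetical, and cksum (CRC-32 plus size) is a portable but weak stand-in that you could swap for a stronger hash.

```shell
# Sketch: copy every file under a source tree into one flat directory.
# flatten_corpus is a hypothetical helper, not a Crucible command.
flatten_corpus() {
  # $1 = source tree, $2 = flat destination (keep it outside the source tree)
  src="$1"; dst="$2"
  mkdir -p "$dst"
  find "$src" -type f | while IFS= read -r f; do
    # "CRC SIZE" from cksum becomes "CRC-SIZE"; identical files collapse
    # to a single copy, which also deduplicates the corpus.
    sum=$(cksum < "$f" | tr ' ' '-')
    cp "$f" "$dst/$sum"
  done
}
```

For example, flatten_corpus corpus/real corpus-flat followed by --corpus ./corpus-flat, where corpus-flat is an illustrative name. File names containing newlines are not handled by this loop.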

5. Check status

While a campaign is running, open a second terminal and query progress:

crucible status

Sample output:

Campaign Status:
  Crashes found: 3
  Total size:    48KB
  Crash dir:     ./crashes

Engine telemetry

For detailed metrics (executions, exec/s, corpus size), check the fuzzing engine's own output — libFuzzer prints live stats to stderr, and AFL++ writes to <output>/main/fuzzer_stats.
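The AFL++ fuzzer_stats file is plain text with one "key : value" pair per line, so individual metrics are easy to pull out in a script. The sketch below is a minimal lookup; afl_stat is a hypothetical helper. Key names such as execs_done, execs_per_sec, corpus_count and saved_crashes are the ones used by recent AFL++ releases (older versions use names like paths_total and unique_crashes), so check your own file for the exact spelling.

```shell
# Sketch: print the value of one key from an AFL++ fuzzer_stats file.
# afl_stat is a hypothetical helper, not a Crucible or AFL++ command.
afl_stat() {
  # $1 = path to fuzzer_stats, $2 = key to look up
  # Split on the colon and its surrounding spaces; stop at the first match.
  awk -F' *: *' -v key="$2" '$1 == key { print $2; exit }' "$1"
}
```

Usage would look like afl_stat "<output>/main/fuzzer_stats" execs_per_sec, with <output> replaced by your campaign's output directory. Values that themselves contain a colon (the recorded command line, for instance) are cut at the first colon by this simple split.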

6. Triage crashes

Once the campaign finishes (or you stop it), analyse the crashing inputs:

make triage-libfuzzer

This replays binary crash files through the harness with sanitizers enabled, deduplicates by stack hash, classifies each root cause, and writes a report for each unique finding.
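Deduplicating by stack hash means that two crashes count as the same finding when their top stack frames match, even if addresses or input bytes differ. The sketch below shows one simple way to compute such a hash from a sanitizer log: take the function names of the top few frames and checksum them. It assumes AddressSanitizer-style frame lines of the form "#N 0xADDR in function file:line". stack_hash is hypothetical, and both the frame count and the use of CRC-32 are assumptions; this is not Crucible's actual algorithm.

```shell
# Sketch: hash the function names of the top frames in a sanitizer log.
# stack_hash is a hypothetical helper; Crucible's real hashing may differ.
stack_hash() {
  # $1 = sanitizer log, $2 = number of top frames to include (default 3)
  crc=$(awk -v max="${2:-3}" '
    # Frame lines look like: "#0 0x55d5 in some_function /path/file.c:12"
    $1 ~ /^#[0-9]+$/ && $3 == "in" && n < max { print $4; n++ }
  ' "$1" | cksum | cut -d' ' -f1)
  # Render the CRC-32 as eight hex digits.
  printf '%08x\n' "$crc"
}
```

Because addresses are ignored, two runs of the same bug under address-space randomisation produce the same hash. Function names containing spaces (some C++ signatures) would be truncated by this simple field split.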

Text-log triage

If you are using the Go native harness (which produces sanitizer text logs rather than binary reproducers), use make triage instead — no --harness is needed.

7. Read reports

Reports are written as individual files under ./reports/:

reports/
  crash-001-heap-buffer-overflow.md
  crash-002-null-deref.md
  crash-003-assertion-failure.md

Each report contains:

# Crash Report: heap-buffer-overflow

## Summary
  Type:       heap-buffer-overflow
  Stack hash: a4f9c31e
  Target:     gguf
  Harness:    ./crucible-libfuzzer
  Reproducer: crashes/crash-abc123def456

## Stack Trace
  #0 gguf_load_tensor_data (gguf.c:482)
  #1 gguf_init_from_file  (gguf.c:310)
  #2 LLVMFuzzerTestOneInput (harness.c:24)

## Reproduction
  ./crucible-libfuzzer crashes/crash-abc123def456

## Notes
  The input contains a tensor descriptor whose offset extends past the
  end of the data section, triggering an out-of-bounds read during
  tensor loading.

What next?

Minimise the reproducer: crucible triage --minimize --crashes ./crashes --harness ./crucible-libfuzzer
File upstream bugs with the reproducer and stack trace
Add the minimised input to corpus/minimal/ so future runs detect regressions
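To check for regressions outside a full campaign, the saved inputs can be replayed one by one. A libFuzzer binary that is given file paths as arguments runs those inputs once and exits instead of fuzzing, which is also what the Reproduction line in the report relies on. The sketch below loops over a directory and reports every input that makes the harness exit non-zero; replay_corpus is a hypothetical helper.

```shell
# Sketch: replay each saved input through a harness and flag failures.
# replay_corpus is a hypothetical helper, not a Crucible command.
replay_corpus() {
  # $1 = harness binary, $2 = directory of saved inputs
  harness="$1"; dir="$2"; failed=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # A non-zero exit (sanitizer report, abort, crash) marks a regression.
    if ! "$harness" "$f" >/dev/null 2>&1; then
      echo "regression: $f"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, replay_corpus ./crucible-libfuzzer corpus/minimal returns non-zero if any saved input still crashes the harness, which makes it easy to wire into a CI step.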

Next: Configuration — Makefile targets, environment variables, and directory layout.