Quick Start

This guide walks through a complete fuzzing workflow — from seed generation to crash triage — in under five minutes.

What you will do

  1. Generate a seed corpus of valid GGUF files
  2. Build a native harness linked against llama.cpp
  3. Run a fuzzing campaign
  4. Triage any crashes that are found

1. Generate a seed corpus

Use crucible-gen to create structurally valid GGUF files and mutated variants:

crucible-gen --output corpus/generated --count 50

This produces 50 seed files in corpus/generated/. Each seed is a well-formed GGUF binary that exercises different combinations of:

  • Header versions and metadata key/value types
  • Tensor descriptors with varying dtypes and dimensions
  • Alignment padding and offset layouts

The generator also emits mutated variants that deliberately bend structural rules (truncated tensors, invalid enum values, overlapping offsets) to give the fuzzer a head start on interesting code paths.
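
To make the seed structure concrete, here is a minimal Go sketch of the fixed-size header that starts a GGUF file, plus one rule-bending variant. It assumes the GGUF v3 layout (4-byte magic "GGUF", then a little-endian uint32 version, uint64 tensor count, and uint64 metadata key/value count). The names buildHeader and truncate are hypothetical and are not taken from crucible-gen.

```go
package main

import "encoding/binary"

// ggufMagic is the four-byte signature at the start of a GGUF file.
const ggufMagic = "GGUF"

// buildHeader returns the 24-byte fixed header: magic, version,
// tensor count, and metadata key/value count, all little-endian.
// Hypothetical helper for illustration; not crucible-gen's code.
func buildHeader(version uint32, tensorCount, kvCount uint64) []byte {
	h := make([]byte, 24)
	copy(h[0:4], ggufMagic)
	binary.LittleEndian.PutUint32(h[4:8], version)
	binary.LittleEndian.PutUint64(h[8:16], tensorCount)
	binary.LittleEndian.PutUint64(h[16:24], kvCount)
	return h
}

// truncate returns a copy of seed cut to at most n bytes. Cutting a
// file short is one simple way to produce a structurally broken variant.
func truncate(seed []byte, n int) []byte {
	if n < 0 {
		n = 0
	}
	if n > len(seed) {
		n = len(seed)
	}
	out := make([]byte, n)
	copy(out, seed)
	return out
}
```

Calling buildHeader(3, 0, 0) yields a header that declares zero tensors and zero metadata entries. Passing that result to truncate with n = 10 yields an input that ends in the middle of the tensor count field.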

More seeds, better coverage

A count of 50 is enough to get started. For longer campaigns, consider --count 200 or higher to widen the initial input space.


2. Download real models (optional)

For additional coverage, pull small real-world GGUF models from HuggingFace:

make download-corpus

Downloaded models are saved to corpus/real/. These complement the generated seeds by providing production layouts that the generator may not synthesize on its own.

Warning

Downloaded models can be several hundred megabytes. The Makefile target selects the smallest quantised variants available, but make sure you have adequate disk space.


3. Build a harness

Build a fuzzing harness linked against llama.cpp. Choose the engine that matches your setup:

libFuzzer

make harness-libfuzzer LLAMA_CPP=/path/to/llama.cpp

Produces crucible-libfuzzer, a single binary with built-in coverage feedback.

AFL++

make harness-afl LLAMA_CPP=/path/to/llama.cpp

Produces crucible-afl, an instrumented binary for use with afl-fuzz.

Go native

No harness build needed. The Go native fuzzer runs directly:

make fuzz-go

This uses go test -fuzz under the hood and is useful for testing Crucible's own Go parsing and mutation logic.
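
For readers new to Go's native fuzzing, the sketch below shows the general shape of a target that go test -fuzz can drive. It is an illustration under stated assumptions: parseHeader is a hypothetical stand-in for Crucible's real parsing code, and the 24-byte header layout is the GGUF v3 layout. In a real project the FuzzParseHeader function would live in a file ending in _test.go.

```go
package main

import (
	"encoding/binary"
	"errors"
	"testing"
)

// header holds the fixed GGUF header fields (assumed v3 layout).
type header struct {
	Version     uint32
	TensorCount uint64
	KVCount     uint64
}

var errBadHeader = errors.New("gguf: bad header")

// parseHeader is a hypothetical parser used only for this sketch.
// It rejects inputs that are too short or lack the "GGUF" magic.
func parseHeader(data []byte) (header, error) {
	if len(data) < 24 || string(data[:4]) != "GGUF" {
		return header{}, errBadHeader
	}
	return header{
		Version:     binary.LittleEndian.Uint32(data[4:8]),
		TensorCount: binary.LittleEndian.Uint64(data[8:16]),
		KVCount:     binary.LittleEndian.Uint64(data[16:24]),
	}, nil
}

// FuzzParseHeader seeds the fuzzer with one well-formed header and
// checks that the parser never panics and never accepts short input.
func FuzzParseHeader(f *testing.F) {
	seed := append([]byte("GGUF"), make([]byte, 20)...)
	seed[4] = 3 // version 3, little-endian
	f.Add(seed)
	f.Fuzz(func(t *testing.T, data []byte) {
		h, err := parseHeader(data)
		if err == nil && len(data) < 24 {
			t.Fatalf("accepted short input: %+v", h)
		}
	})
}
```

With the target in a _test.go file, go test -fuzz=FuzzParseHeader would run it directly. Any input that makes the parser panic is saved by the Go toolchain under testdata/fuzz/ for replay.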


4. Run a campaign

Launch a fuzzing campaign with crucible run:

libFuzzer

crucible run \
  --harness ./crucible-libfuzzer \
  --corpus ./corpus/generated \
  --output ./crashes \
  --dict corpus/gguf.dict \
  --jobs 8

AFL++

crucible run \
  --harness ./crucible-afl \
  --corpus ./corpus/generated \
  --output ./crashes \
  --dict corpus/gguf.dict \
  --jobs 8 \
  --engine afl

Go native

make fuzz-go FUZZ_TIMEOUT=300

Flag        Description
--harness   Path to the compiled fuzzer binary
--corpus    Directory (or directories) containing seed inputs
--output    Where to write crashing inputs
--jobs      Number of parallel fuzzing workers
--engine    Fuzzing engine: libfuzzer (default) or afl
--dict      Path to a fuzzer dictionary (auto-detected if omitted)

Corpus directory

The --corpus flag passes the directory directly to the fuzzing engine. libFuzzer reads top-level files in the given directory; it does not recurse into subdirectories. Point --corpus at a flat directory containing your seed files (e.g., corpus/minimal/).


5. Check status

While a campaign is running, open a second terminal and query progress:

crucible status

Sample output:

Campaign Status:
  Crashes found: 3
  Total size:    48KB
  Crash dir:     ./crashes

Engine telemetry

For detailed metrics (executions, exec/s, corpus size), check the fuzzing engine's own output — libFuzzer prints live stats to stderr, and AFL++ writes to <output>/main/fuzzer_stats.
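
If you want to read those AFL++ numbers from a script, the fuzzer_stats file is plain text with one "key : value" pair per line. The Go sketch below parses that format into a map. The function name parseFuzzerStats is hypothetical, and the keys you look up (for example execs_done or execs_per_sec) depend on your AFL++ version.

```go
package main

import "strings"

// parseFuzzerStats turns the contents of an AFL++ fuzzer_stats file
// into a key/value map. Lines without a colon are skipped. Only the
// first colon splits a line, so values that contain colons survive.
func parseFuzzerStats(text string) map[string]string {
	stats := make(map[string]string)
	for _, line := range strings.Split(text, "\n") {
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		stats[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return stats
}
```

A typical use is to read <output>/main/fuzzer_stats with os.ReadFile, pass string(contents) to parseFuzzerStats, and print the entries you care about.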


6. Triage crashes

Once the campaign finishes (or you stop it), analyze the crashing inputs:

make triage-libfuzzer

This replays binary crash files through the harness with sanitizers enabled, deduplicates by stack hash, classifies each root cause, and writes a report for each unique finding.
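
Deduplicating by stack hash can be pictured with the Go sketch below. It is an assumption-laden illustration, not Crucible's actual scheme: it hashes the function names of the top few frames with SHA-256 and keeps the first crash seen for each hash. The crash type and both function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// crash pairs a reproducer path with the function names from its
// stack trace, innermost frame first. Hypothetical type for this sketch.
type crash struct {
	Path   string
	Frames []string
}

// stackHash returns an 8-character hex digest of the top maxFrames
// function names. Line numbers are deliberately left out so that
// unrelated source edits do not change the hash.
func stackHash(frames []string, maxFrames int) string {
	if maxFrames >= 0 && maxFrames < len(frames) {
		frames = frames[:maxFrames]
	}
	sum := sha256.Sum256([]byte(strings.Join(frames, "\n")))
	return hex.EncodeToString(sum[:4])
}

// dedupe keeps the first crash for each distinct stack hash and
// preserves the input order.
func dedupe(crashes []crash, maxFrames int) []crash {
	seen := make(map[string]bool)
	var unique []crash
	for _, c := range crashes {
		h := stackHash(c.Frames, maxFrames)
		if seen[h] {
			continue
		}
		seen[h] = true
		unique = append(unique, c)
	}
	return unique
}
```

Hashing only the top frames is a common trade-off. Too few frames merge distinct bugs that share a crash site, and too many split one bug across different call paths.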

Text-log triage

If you are using the Go native harness (which produces sanitizer text logs rather than binary reproducers), use make triage instead — no --harness is needed.


7. Read reports

Reports are written as individual files under ./reports/:

reports/
  crash-001-heap-buffer-overflow.md
  crash-002-null-deref.md
  crash-003-assertion-failure.md

Each report has the following layout:

# Crash Report: heap-buffer-overflow

## Summary
  Type:       heap-buffer-overflow
  Stack hash: a4f9c31e
  Target:     gguf
  Harness:    ./crucible-libfuzzer
  Reproducer: crashes/crash-abc123def456

## Stack Trace
  #0 gguf_load_tensor_data (gguf.c:482)
  #1 gguf_init_from_file  (gguf.c:310)
  #2 LLVMFuzzerTestOneInput (harness.c:24)

## Reproduction
  ./crucible-libfuzzer crashes/crash-abc123def456

## Notes
  The input contains a tensor descriptor whose offset extends past the
  end of the data section, triggering an out-of-bounds read during
  tensor loading.
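
The bug class described in the sample Notes section comes down to a missing range check. A minimal Go sketch of such a check follows. It is illustrative only and is not llama.cpp's code. The function name tensorInBounds is hypothetical, and the comparison is written so that offset + size cannot overflow.

```go
package main

// tensorInBounds reports whether the byte range [offset, offset+size)
// lies entirely inside a data section of dataLen bytes. It compares
// size against the remaining space instead of computing offset+size,
// which could wrap around for attacker-controlled uint64 values.
func tensorInBounds(offset, size, dataLen uint64) bool {
	if offset > dataLen {
		return false
	}
	return size <= dataLen-offset
}
```

A loader would call this for every tensor descriptor before touching the data and reject the file when it returns false.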

What next?

  • Minimise the reproducer: crucible triage --minimize --crashes ./crashes --harness ./crucible-libfuzzer
  • File upstream bugs with the reproducer and stack trace
  • Add the minimised input to corpus/minimal/ so future runs detect regressions

Next: Configuration — Makefile targets, environment variables, and directory layout.