Configuration
Makefile targets
The project Makefile is the primary build and run interface. All targets respect the environment variables listed below.
| Target | Description |
|---|---|
| build | Compile all Go binaries (crucible, crucible-gen, crucible-triage) |
| test | Run Go unit tests (./cmd/... and ./pkg/...) |
| generate | Run crucible-gen to create a seed corpus |
| harness-libfuzzer | Build the libFuzzer harness (requires LLAMA_CPP) |
| harness-afl | Build the AFL++ harness (requires LLAMA_CPP) |
| run-libfuzzer | Build and run a libFuzzer campaign |
| run-afl | Build and run an AFL++ campaign |
| triage | Triage crashes in ./crashes and write reports |
| triage-libfuzzer | Replay binary crashes through harness and generate reports |
| fuzz-go | Run Go-native fuzz tests (go test -fuzz) |
| fuzz-mutator | Fuzz the mutation engine itself |
| fuzz-roundtrip | Fuzz the GGUF encode/decode round-trip path |
| coverage | Collect code coverage from corpus replay |
| test-short | Run Go unit tests with a shorter timeout |
| lint | Run go vet and go fmt |
| vet | Run go vet static analysis |
| fmt | Run go fmt code formatting |
| clean | Remove built binaries, crash artifacts, and generated corpus |
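
As a rough sketch of how these targets chain together in one session, the function below wraps a single build-to-triage pass. The function name `crucible_campaign` is hypothetical and not part of the project, and the ordering of steps is an assumption based on the target descriptions above.

```shell
# Hypothetical helper: one build-to-triage pass using the documented targets.
# Call it from the project root, where the Makefile lives.
crucible_campaign() {
    make build    || return 1   # compile crucible, crucible-gen, crucible-triage
    make generate || return 1   # create a seed corpus with crucible-gen
    make run-libfuzzer          # build and run a libFuzzer campaign
    make triage                 # triage whatever landed in ./crashes
}
```

Defining the function runs nothing; call `crucible_campaign` once LLAMA_CPP points at a llama.cpp checkout.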
Environment variables
Set these before invoking make or the CLI tools directly.
| Variable | Default | Description |
|---|---|---|
| LLAMA_CPP | ~/src/llama.cpp | Path to a local llama.cpp checkout (required for harness targets) |
| FUZZ_JOBS | nproc / sysctl -n hw.ncpu | Number of parallel fuzzing workers |
| FUZZ_TIMEOUT | 0 (unlimited) | Campaign timeout in seconds (0 = run until stopped) |
| FUZZ_MAX_LEN | 10485760 (10 MiB) | Maximum input size in bytes fed to the harness |
| GO | go | Path to the Go binary (useful for non-standard installs) |
Version pinning
Target versions are pinned in VERSIONS.env at the project root for CI reproducibility.
Persistent configuration
Export variables in your shell profile to avoid repeating them:
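
A minimal sketch for a Bash or Zsh profile, using the variables from the table above. The values are illustrative rather than recommendations; only the LLAMA_CPP path matches the documented default.

```shell
# Example additions to ~/.bashrc or ~/.zshrc (values are illustrative).
export LLAMA_CPP="$HOME/src/llama.cpp"   # same location as the documented default
export FUZZ_JOBS=8                       # parallel fuzzing workers
export FUZZ_TIMEOUT=3600                 # stop each campaign after one hour
export FUZZ_MAX_LEN=1048576              # cap inputs at 1 MiB instead of 10 MiB
```

For a one-off override, prefix the command instead of exporting, for example `FUZZ_JOBS=2 make run-libfuzzer`.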
CLI flags reference
Each binary accepts its own set of flags. The tables below are a summary; see the CLI section of the documentation for the full reference.
crucible
| Flag | Description |
|---|---|
| --harness | Path to the compiled harness binary |
| --corpus | Corpus directory (passed to engine) |
| --output | Output directory for crashes |
| --jobs | Parallel worker count |
| --engine | libfuzzer (default) or afl |
| --timeout | Per-testcase timeout |
| --max-len | Maximum input size in bytes |
| --dict | Fuzzer dictionary path (auto-detected from corpus dir if empty) |
| --dry-run | Print configuration and exit |
| --help | Show help |
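
A hedged sketch of a crucible invocation that combines several of the flags above. The harness path and directory names are placeholders, and the binary is assumed to be on PATH. Because `--dry-run` prints the configuration and exits, nothing is fuzzed.

```shell
# Placeholder paths: adjust to your build output and workspace.
HARNESS="${HARNESS:-./harness-libfuzzer}"   # hypothetical harness location
CORPUS="${CORPUS:-./seeds}"                 # flat directory of seed files

if command -v crucible >/dev/null 2>&1; then
    crucible --harness "$HARNESS" --corpus "$CORPUS" --output ./crashes \
             --engine libfuzzer --jobs 4 --max-len 10485760 --dry-run
else
    echo "crucible not found on PATH; build it first with 'make build'" >&2
fi
```

Dropping `--dry-run` from the same command line starts the campaign with the printed settings.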
crucible-gen
| Flag | Description |
|---|---|
| --output | Directory to write generated seeds |
| --count | Number of mutated files to generate |
| --seed | Random seed (0 = time-based) |
| --mutate | Also generate mutated variants |
| --talos | Generate Talos CVE-targeted seeds |
| --arch | Generate architecture-targeted seeds |
| --clip | Generate clip/vision-model seeds |
| --help | Show help |
crucible-triage
| Flag | Description |
|---|---|
| --crashes | Directory containing crash inputs |
| --output | Directory for report output |
| --harness | Harness binary (required for binary crash replay) |
| --minimize | Minimize crash reproducers (recurses into subdirectories) |
| --replay-timeout | Timeout per replay execution (default 30s) |
| --replay-env | Extra env vars for replay (KEY=VALUE, repeatable) |
| --target | Target surface for reports (auto-detected from harness) |
| --sarif | Write SARIF 2.1.0 output to this file path |
| --help | Show help |
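
The sketch below shows one way the triage flags might be combined. The paths and the SARIF file name are placeholders, the `30s` value assumes the flag accepts the same duration form as its documented default, and `ASAN_OPTIONS=detect_leaks=0` is only an example of a variable you might forward to an AddressSanitizer-built harness.

```shell
# Placeholder locations: adjust to your workspace.
CRASHES="${CRASHES:-./crashes}"
REPORTS="${REPORTS:-./reports}"
HARNESS="${HARNESS:-./harness-libfuzzer}"   # hypothetical harness location

if command -v crucible-triage >/dev/null 2>&1; then
    crucible-triage --crashes "$CRASHES" --output "$REPORTS" \
        --harness "$HARNESS" --minimize --replay-timeout 30s \
        --replay-env ASAN_OPTIONS=detect_leaks=0 \
        --sarif "$REPORTS/crucible.sarif"
else
    echo "crucible-triage not found on PATH; build it first with 'make build'" >&2
fi
```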
Directory structure
A typical Crucible workspace looks like this:
crucible/
    corpus/
        minimal/ # Hand-crafted minimal valid GGUF files
        real/ # Real models downloaded from HuggingFace
        generated/ # Output of crucible-gen
        reconstructed/ # CVE-targeted seeds (crucible-gen --talos)
        targeted/ # Targeted seeds (crucible-gen --arch / --clip)
        arch-seeds/ # Per-architecture dispatch seeds (100+ architectures)
        grammar/ # GBNF grammar test cases
        jinja/ # Jinja template test cases
        json-schema/ # JSON Schema test cases
        rpc/ # RPC protocol message seeds
        server/ # HTTP endpoint test cases
        lora/ # LoRA adapter format seeds
        whisper/ # whisper.cpp model seeds
        whisper-audio/ # PCM audio seeds for whisper inference
        tflite/ # TensorFlow Lite model seeds
        torchscript/ # PyTorch TorchScript model seeds
        safetensors/ # SafeTensors format seeds
        onnx/ # ONNX model format seeds
        gguf.dict # GGUF-specific fuzzer dictionary
        grammar.dict # GBNF grammar dictionary
        jinja.dict # Jinja template dictionary
        json-schema.dict # JSON Schema dictionary
        pytorch.dict # PyTorch-specific dictionary
        rpc.dict # RPC protocol dictionary
        server.dict # HTTP/JSON endpoint dictionary
        tflite.dict # TFLite operator dictionary
    crashes/ # Crashing inputs discovered during campaigns
    reports/ # Triage reports (one per unique crash)
Campaign directories
Active fuzzing campaigns may create additional subdirectories under corpus/ (e.g., campaign-deep/, libfuzzer-campaign/). These are runtime artifacts managed by the fuzzing engine and do not need to be committed.
Note
The --corpus flag passes the directory directly to the fuzzing engine. libFuzzer reads only the top-level files in the corpus directory; it does not recurse into subdirectories. Point --corpus at a flat directory containing your seed files, or use crucible-gen --output ./seeds to create one.
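
Since the note above calls for a flat directory, one possible approach is to copy seeds out of selected corpus subdirectories into a single folder. The helper name `flatten_corpus` and the `./seeds` destination are hypothetical; the subdirectory names come from the layout above.

```shell
# flatten_corpus ROOT DEST SUBDIR...
# Copy every file under ROOT/SUBDIR into the single flat directory DEST.
flatten_corpus() {
    root=$1; dest=$2; shift 2
    mkdir -p "$dest"
    for sub in "$@"; do
        [ -d "$root/$sub" ] || continue   # skip subdirectories that do not exist
        find "$root/$sub" -type f | while IFS= read -r f; do
            # Prefix each copy with the name of its source subdirectory.
            cp "$f" "$dest/${sub}-$(basename "$f")"
        done
    done
}

flatten_corpus ./corpus ./seeds minimal generated reconstructed
```

The resulting directory can then be passed as `--corpus ./seeds`.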
Fuzzer dictionary
The file corpus/gguf.dict contains tokens and byte sequences that are significant to the GGUF binary format. Both libFuzzer and AFL++ can consume this dictionary to guide mutation toward structurally meaningful changes.
The dictionary includes:
- Magic bytes: the GGUF file signature
- Version constants: known GGUF version numbers
- Metadata key prefixes: common key strings such as general.architecture and tokenizer.ggml.model
- Tensor dtype identifiers: enum values for F16, Q4_0, Q8_0, etc.
- Alignment values: common padding boundaries (32, 64, 128)
Using the dictionary
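The run-libfuzzer and run-afl Makefile targets automatically pass corpus/gguf.dict if the file exists, so no extra flags are needed. When invoking crucible directly, --dict sets the dictionary path; if it is left empty, the dictionary is auto-detected from the corpus directory.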
Extending the dictionary
If you are fuzzing a custom GGUF extension, add its specific magic values and key strings to gguf.dict. One token per line, using the libFuzzer dictionary format:
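
As an illustration, the snippet below appends three entries to the dictionary. The `myext` names and values are invented for a hypothetical extension; the syntax (an optional name, a double-quoted value, `\xNN` byte escapes, and `#` comments) follows the libFuzzer dictionary format.

```shell
# Append entries for a hypothetical "myext" extension to the GGUF dictionary.
DICT="${DICT:-corpus/gguf.dict}"
mkdir -p "$(dirname "$DICT")"

# The quoted 'EOF' keeps backslashes literal, so \x01 reaches the file as typed.
cat >> "$DICT" <<'EOF'
# Entries for the hypothetical "myext" extension
myext_magic="MYEX"
myext_key_version="myext.version"
myext_flag_bytes="\x01\x00\x00\x00"
EOF
```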
Next: See the Architecture section for details on how Crucible's mutation engine targets GGUF structure.