Architecture

This page describes the high-level design of Crucible, the relationships between its packages, and the data flow from seed files through mutation and fuzzing to triaged vulnerability reports.

High-Level Data Flow

The following diagram shows how data moves through the system during a fuzzing campaign:

flowchart LR
    Seeds["Seeds\n(minimal GGUF files)"]
    Corpus["Corpus\n(generated + downloaded)"]
    Mutator["Mutator\n(structure-aware)"]
    Harness["Harness\n(libFuzzer / AFL++ / Go)"]
    Target["Target Parser\n(llama.cpp / Ollama)"]
    Crashes["Crashes\n(ASAN/UBSAN output)"]
    Triager["Triager\n(dedup + classify)"]
    Reports["Reports\n(CVE-ready)"]

    Seeds --> Corpus
    Corpus --> Mutator
    Mutator --> Harness
    Harness --> Target
    Target -->|crash| Crashes
    Target -->|no crash| Harness
    Crashes --> Triager
    Triager --> Reports

Package Relationships

Crucible is organized into five core packages under pkg/ (plus the pkg/mutator/rpc subpackage) and three command-line entry points under cmd/ (plus two C-archive builds for the custom mutators):

graph TD
    subgraph "cmd/"
        crucible["cmd/crucible\n(main orchestrator)"]
        gen["cmd/crucible-gen\n(corpus generator)"]
        triage_cmd["cmd/crucible-triage\n(crash triager)"]
        mutator["cmd/crucible-mutator\n(C-archive for libFuzzer)"]
        rpcmutator["cmd/crucible-rpc-mutator\n(RPC C-archive for libFuzzer)"]
    end

    subgraph "pkg/"
        gguf["pkg/gguf\n(format types + reader/writer)"]
        mut["pkg/mutator\n(GGUF mutation engine)"]
        rpcmut["pkg/mutator/rpc\n(RPC mutation engine)"]
        corpus["pkg/corpus\n(seed generation + management)"]
        triage["pkg/triage\n(crash dedup + reporting)"]
        coverage["pkg/coverage\n(LLVM coverage collection + reporting)"]
    end

    subgraph "harness/"
        libfuzzer["harness/libfuzzer\n(C++ harness)"]
        aflpp["harness/aflpp\n(C harness)"]
        gofuzz["harness/go\n(Go native fuzz tests)"]
    end

    crucible --> gguf
    crucible --> mut
    crucible --> corpus
    gen --> gguf
    gen --> corpus
    triage_cmd --> triage

    mut --> gguf
    rpcmut --> gguf
    corpus --> gguf
    corpus --> mut

    gofuzz --> gguf
    gofuzz --> mut

    libfuzzer -.->|"links against"| gguf
    aflpp -.->|"links against"| gguf

Dependency direction

Dependencies flow strictly downward. pkg/gguf is the foundation with zero internal dependencies. pkg/mutator depends only on pkg/gguf. pkg/mutator/rpc depends on pkg/gguf. pkg/corpus depends on both pkg/gguf and pkg/mutator. pkg/triage and pkg/coverage are fully standalone.

Directory Structure

crucible/
├── cmd/
│   ├── crucible/              # Main orchestrator binary
│   │   └── main.go
│   ├── crucible-gen/          # Corpus generation tool
│   │   └── main.go
│   ├── crucible-mutator/      # Custom mutator (C-archive for libFuzzer)
│   │   └── main.go
│   ├── crucible-rpc-mutator/  # RPC graph-compute mutator (C-archive)
│   │   └── main.go
│   └── crucible-triage/       # Crash triage and reporting
│       ├── main.go
│       └── watch.go           # Watch-mode with checkpoint persistence
├── pkg/
│   ├── gguf/                  # GGUF format implementation
│   │   ├── format.go          # Types: Header, MetadataKV, TensorInfo, File
│   │   ├── format_test.go     # Format unit tests
│   │   ├── reader.go          # Binary deserialization (Unmarshal)
│   │   ├── reader_test.go     # Reader unit tests
│   │   └── writer.go          # Binary serialization (Marshal)
│   ├── mutator/               # Structure-aware mutation engine
│   │   ├── mutator.go         # Orchestrator: weighted category selection
│   │   ├── mutator_test.go    # Mutation engine tests
│   │   ├── header.go          # 5 header mutation strategies
│   │   ├── metadata.go        # 13 metadata mutation strategies
│   │   ├── tensorinfo.go      # 8 tensor info mutation strategies
│   │   ├── alignment.go       # 3 alignment mutation strategies
│   │   ├── data.go            # 6 tensor data mutation strategies
│   │   ├── consistency.go     # 6 cross-field consistency strategies
│   │   ├── model_loader.go    # 5 model-loader strategies (weighted under metadata)
│   │   └── rpc/               # RPC graph-compute mutation engine
│   │       ├── op.go          # 2 op-enum strategies
│   │       ├── opparams.go    # 2 function-pointer/parameter strategies
│   │       ├── dimensions.go  # 3 tensor dimension strategies
│   │       ├── graph.go       # 3 graph-structure strategies
│   │       ├── strides.go     # 2 stride strategies
│   │       └── flags.go       # 2 tensor-flag strategies
│   ├── corpus/                # Corpus generation and management
│   │   ├── corpus.go          # Corpus loading and enumeration
│   │   ├── corpus_test.go     # Corpus unit tests
│   │   ├── generate.go        # Seed file generation
│   │   ├── minimize.go        # Corpus minimization
│   │   └── minimize_test.go   # Minimization tests
│   ├── triage/                # Crash analysis and reporting
│   │   ├── triage.go          # Crash classification, dedup by stack hash
│   │   ├── triage_test.go     # Triage unit tests
│   │   ├── cwe.go             # CWE identifier mapping for each crash type
│   │   ├── stackhash.go       # Stable stack frame hashing
│   │   ├── stackhash_test.go  # Stack hash tests
│   │   ├── minimize.go        # Crash reproducer minimization (recursive WalkDir)
│   │   ├── minimize_test.go   # Minimize tests
│   │   ├── replay.go          # Crash replay against harness binaries
│   │   ├── replay_test.go     # Replay unit tests
│   │   ├── report.go          # CVE-ready report generation with CVSS
│   │   ├── report_target_test.go  # Report target detection tests
│   │   ├── sarif.go           # SARIF 2.1.0 output with target tags
│   │   ├── sarif_test.go      # SARIF output tests
│   │   └── fixture_test.go    # Shared test fixtures
│   └── coverage/              # LLVM coverage collection
│       └── coverage.go        # Replay corpus, merge profraw, HTML report
├── harness/
│   ├── libfuzzer/             # libFuzzer C++ harnesses (37 variants)
│   │   ├── harness.cpp        # LLVMFuzzerTestOneInput targeting gguf_init_from_file
│   │   └── Makefile
│   ├── aflpp/                 # AFL++ C harness
│   │   ├── harness.c
│   │   └── Makefile
│   └── go/                    # Go native fuzz tests
│       └── fuzz_test.go       # FuzzGGUFReader, FuzzMutator, FuzzRoundTrip
├── corpus/                    # Seed corpus directory
│   └── gguf.dict              # GGUF-specific dictionary for fuzzer guidance
├── crashes/                   # Crash artifacts output directory
├── targets/                   # Build configs for fuzz targets
│   ├── llamacpp/Makefile
│   └── ollama/Makefile
├── .github/
│   └── workflows/
│       ├── ci.yml             # CI pipeline: lint, test, build
│       └── docs.yml           # Documentation build and deploy
├── scripts/
│   ├── build-linux.sh         # Cross-compile harnesses for Linux
│   ├── build-targets.sh       # Build all target parsers
│   ├── download-corpus.sh     # Download seed corpus from model repos
│   ├── gen-arch-seeds.py      # Generate per-architecture GGUF seeds
│   ├── gen-lora-seeds.py      # Generate LoRA-specific GGUF seeds
│   ├── gen-server-seeds.py    # Generate HTTP endpoint test seeds
│   ├── gen-whisper-audio-seeds.py  # Generate PCM audio test seeds
│   ├── generate-targeted-seeds.py  # Generate TALOS CVE-targeted seeds
│   ├── launch-phase2.sh       # Launch Phase 2 campaign (multi-harness)
│   ├── run-all-campaigns.sh   # Launch all campaigns in parallel
│   ├── run-campaign.sh        # Single campaign launcher
│   └── rotate-logs.sh         # Rotate campaign logs
├── docs/                      # mkdocs-material documentation source
├── Makefile                   # Top-level build, test, fuzz, triage targets
├── mkdocs.yml                 # Documentation site configuration
├── FUZZING-ROADMAP.md         # Living checklist of targets, campaigns, and findings
├── VERSIONS.env               # Pinned target versions for CI reproducibility
├── go.mod
└── go.sum

Package Details

pkg/gguf -- Format Implementation

The foundation package. It defines the Go types that mirror the GGUF binary specification:

  • Header -- 4-byte magic (GGUF), version (uint32), tensor count (uint64), metadata KV count (uint64)
  • MetadataKV -- key string + typed value (the 12 non-array GGUF value types, including bool and string, plus arrays)
  • TensorInfo -- tensor name, dimensions, ggml_type enum, byte offset into the data section
  • File -- the complete in-memory representation of a GGUF file

The package provides Unmarshal([]byte) (*File, error) for parsing and Marshal(*File) ([]byte, error) for serialization. These form the round-trip pipeline that the mutation engine depends on.

Why not modify bytes directly?

Byte-level mutations are fast but structurally blind. A flipped byte in the metadata section often lands in a length prefix or type tag, which desynchronizes every field that follows and causes the parser to reject the file immediately. By operating on parsed structures, Crucible ensures mutations produce files that reach deep parsing code paths where the real vulnerabilities live.

pkg/mutator -- Mutation Engine

The mutation engine implements 46 GGUF strategies organized into 6 weighted categories. The strategies live in 7 files because the model-loader strategies are weighted under the Metadata category. A separate RPC mutation engine (pkg/mutator/rpc/) implements 14 strategies across 6 categories targeting the RPC_CMD_GRAPH_COMPUTE wire format. On each call to Mutate(), the GGUF engine:

  1. Randomly selects 1 to 3 mutations to apply
  2. For each mutation, picks a category using weighted random selection
  3. Picks a strategy uniformly at random within the chosen category
  4. Applies the strategy to the in-memory *gguf.File
  5. Serializes the result back to bytes via gguf.Marshal

Each strategy file (header.go, metadata.go, tensorinfo.go, alignment.go, data.go, consistency.go, model_loader.go) exports a *Strategies() []Strategy function. The Mutator registers all of them at construction time.

The Strategy interface is intentionally minimal:

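type Strategy interface {
    // Name identifies the strategy.
    Name() string
    // Mutate modifies f in place, using rng as its source of randomness.
    Mutate(f *gguf.File, rng *rand.Rand)
}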

This makes it straightforward to add new strategies -- implement the interface, add it to the appropriate *Strategies() function, and it is automatically picked up by the engine.

pkg/corpus -- Seed Management

Handles three responsibilities:

  1. Generation -- crucible-gen produces structurally varied seed GGUF files covering different metadata types, tensor configurations, and alignment values
  2. Loading -- reads seed files from disk for the fuzzing harnesses
  3. Minimization -- reduces the corpus to a minimal set that maximizes code coverage (useful after extended campaigns)
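
This page does not say which minimization algorithm pkg/corpus uses. One common approach is greedy set cover over per-seed coverage, sketched below as an assumption rather than a description of minimize.go. The minimize function and the edge-ID representation are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// minimize keeps picking the seed that adds the most not-yet-covered
// edges until no seed adds anything new. coverage maps a seed name to
// the edge IDs it reaches.
func minimize(coverage map[string][]int) []string {
	// Sort names so ties are broken deterministically.
	names := make([]string, 0, len(coverage))
	for name := range coverage {
		names = append(names, name)
	}
	sort.Strings(names)

	covered := map[int]bool{}
	var kept []string
	for {
		best, bestGain := "", 0
		for _, name := range names {
			seen := map[int]bool{} // ignore duplicate edge IDs within one seed
			gain := 0
			for _, e := range coverage[name] {
				if !covered[e] && !seen[e] {
					seen[e] = true
					gain++
				}
			}
			if gain > bestGain {
				best, bestGain = name, gain
			}
		}
		if bestGain == 0 {
			return kept
		}
		kept = append(kept, best)
		for _, e := range coverage[best] {
			covered[e] = true
		}
	}
}

func main() {
	kept := minimize(map[string][]int{
		"a.gguf": {1, 2, 3},
		"b.gguf": {2, 3},
		"c.gguf": {4},
		"d.gguf": {1, 4},
	})
	fmt.Println(kept) // [a.gguf c.gguf]
}
```

The greedy choice preserves total coverage but is not guaranteed to find the smallest possible set.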

pkg/triage -- Crash Analysis

Fully decoupled from the GGUF format -- it operates on ASAN/UBSAN text output. Its pipeline:

  1. Classification -- pattern-matches ASAN/UBSAN output to identify the crash type (heap overflow, use-after-free, integer overflow, null deref, stack overflow, assertion failure)
  2. Stack hashing -- extracts stack frames and computes a stable hash for deduplication; the same bug triggered by different inputs produces the same hash
  3. Minimization -- walks the crash directory recursively (WalkDir) and minimizes each reproducer, preserving subdirectory structure and returning aggregate MinimizeSummary counts
  4. Report generation -- produces CVE-ready reports with CVSS 3.1 scoring, target/harness metadata, and minimized reproducer paths
  5. SARIF export -- writes SARIF 2.1.0 output with target tags and crucible/stackHash/v1 + crucible/target fingerprints
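
The first two stages can be sketched as follows. The regular expression, the crash-type labels, the choice of SHA-256, and the number of frames hashed are all assumptions for this illustration; the actual rules in triage.go and stackhash.go may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
	"strings"
)

// frameRe matches sanitizer stack lines such as
//   "    #0 0x4f5b3c in read_tensor /src/gguf.cpp:10:3"
// and captures the function name. Names containing spaces (some C++
// templates) are cut at the first space in this simplified version.
var frameRe = regexp.MustCompile(`(?m)^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)`)

// stackHash hashes the function names of the top maxFrames frames.
// Addresses are ignored, so ASLR and different inputs do not change it.
func stackHash(report string, maxFrames int) string {
	matches := frameRe.FindAllStringSubmatch(report, maxFrames)
	names := make([]string, 0, len(matches))
	for _, m := range matches {
		names = append(names, m[1])
	}
	sum := sha256.Sum256([]byte(strings.Join(names, "\n")))
	return hex.EncodeToString(sum[:8])
}

// classify maps well-known sanitizer phrases to a crash-type label.
func classify(report string) string {
	switch {
	case strings.Contains(report, "heap-buffer-overflow"):
		return "heap-overflow"
	case strings.Contains(report, "heap-use-after-free"):
		return "use-after-free"
	case strings.Contains(report, "stack-overflow"):
		return "stack-overflow"
	case strings.Contains(report, "signed integer overflow"):
		return "integer-overflow"
	case strings.Contains(report, "SEGV on unknown address 0x000000000000"):
		return "null-deref"
	case strings.Contains(report, "Assertion"):
		return "assertion-failure"
	default:
		return "unknown"
	}
}

func main() {
	report := "==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011\n" +
		"    #0 0x4f5b3c in read_tensor /src/gguf.cpp:10:3\n" +
		"    #1 0x4f6000 in gguf_init_from_file /src/gguf.cpp:99:1\n"
	fmt.Println(classify(report), stackHash(report, 5))
}
```

Hashing only the top few frames is a trade-off: too few frames merge distinct bugs, while too many split one bug across call paths.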

Mutation Pipeline

The following diagram shows the detailed flow when a single mutated test case is produced:

flowchart TD
    A["Parse seed file\n<code>gguf.Unmarshal(bytes)</code>"] --> B{"Select category\n(weighted random)"}

    B -->|"35%"| C1["Metadata + Model-Loader\n(13 + 5 = 18 strategies)"]
    B -->|"35%"| C2["Tensor Info\n(8 strategies)"]
    B -->|"10%"| C3["Header\n(5 strategies)"]
    B -->|"10%"| C4["Consistency\n(6 strategies)"]
    B -->|"5%"| C5["Alignment\n(3 strategies)"]
    B -->|"5%"| C6["Data\n(6 strategies)"]

    C1 --> D["Select strategy\n(uniform within category)"]
    C2 --> D
    C3 --> D
    C4 --> D
    C5 --> D
    C6 --> D

    D --> E["Apply mutation\nto *gguf.File"]
    E --> F{"More mutations?\n(1-3 total)"}
    F -->|yes| B
    F -->|no| G["Serialize\n<code>gguf.Marshal(file)</code>"]
    G --> H["Output mutated bytes\nto harness"]

Why 1 to 3 mutations per test case?

Applying multiple mutations per input increases the chance of triggering bugs that require two or more preconditions -- for example, a mismatched tensor_count header combined with a dimension product overflow. Single mutations find shallow bugs; stacked mutations find the deep ones.

Harness Integration

Crucible uses a split architecture where the mutation engine is written in Go but the fuzz harnesses target C/C++ parsers:

flowchart LR
    subgraph "Go side"
        Gen["crucible-gen\n(Go)"]
        Mut["pkg/mutator\n(Go)"]
    end

    subgraph "Corpus"
        Seeds["corpus/\n(GGUF files on disk)"]
    end

    subgraph "C side"
        LF["libFuzzer harness\n(C + ASAN)"]
        AFL["AFL++ harness\n(C + ASAN)"]
        Target["llama.cpp\ngguf_init_from_file()"]
    end

    subgraph "Go side (native)"
        GoFuzz["Go fuzz tests\n(FuzzGGUFReader)"]
        GoTarget["pkg/gguf\nUnmarshal()"]
    end

    Gen --> Seeds
    Mut --> Seeds
    Seeds --> LF
    Seeds --> AFL
    LF --> Target
    AFL --> Target
    Seeds --> GoFuzz
    GoFuzz --> GoTarget

How it works

  1. Seed generation: crucible-gen uses pkg/corpus (which builds on pkg/gguf and pkg/mutator) to produce an initial corpus of structurally varied GGUF files, written to corpus/

  2. C harnesses (libFuzzer and AFL++): The fuzzer engine reads corpus files, applies its own byte-level mutations, and feeds the result to the harness. The harness writes the input to a temp file and calls gguf_init_from_file() from llama.cpp, compiled with -fsanitize=fuzzer,address,undefined. Any memory safety violation is caught by ASAN and reported as a crash.

  3. Go native harness: Go's built-in fuzzer drives three targets. FuzzGGUFReader exercises gguf.Unmarshal directly, FuzzMutator verifies that the mutation engine itself does not panic on arbitrary inputs, and FuzzRoundTrip checks parse-serialize-parse consistency.
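
The parse-serialize-parse property can be sketched with a header-only stand-in codec. The header type and the marshal, unmarshal, and checkRoundTrip functions below are hypothetical and much simpler than pkg/gguf; how FuzzRoundTrip is actually written is not shown on this page.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// header is a fixed-size stand-in for a parsed file. Fields are exported
// because encoding/binary cannot fill unexported struct fields.
type header struct {
	Magic       [4]byte
	Version     uint32
	TensorCount uint64
	KVCount     uint64
}

// unmarshal decodes a little-endian header and checks the magic.
func unmarshal(data []byte) (*header, error) {
	var h header
	if err := binary.Read(bytes.NewReader(data), binary.LittleEndian, &h); err != nil {
		return nil, err
	}
	if string(h.Magic[:]) != "GGUF" {
		return nil, errors.New("bad magic")
	}
	return &h, nil
}

// marshal is the inverse of unmarshal.
func marshal(h *header) []byte {
	var buf bytes.Buffer
	// Writing a fixed-size struct to a bytes.Buffer does not fail.
	_ = binary.Write(&buf, binary.LittleEndian, h)
	return buf.Bytes()
}

// checkRoundTrip reports an error if parsing, serializing, and parsing
// again yields a different value. Input that does not parse is not a
// failure; the fuzzer simply moves on.
func checkRoundTrip(data []byte) error {
	first, err := unmarshal(data)
	if err != nil {
		return nil
	}
	second, err := unmarshal(marshal(first))
	if err != nil {
		return fmt.Errorf("re-parse failed: %w", err)
	}
	if *first != *second {
		return errors.New("round trip changed the parsed value")
	}
	return nil
}

func main() {
	seed := marshal(&header{Magic: [4]byte{'G', 'G', 'U', 'F'}, Version: 3, TensorCount: 1, KVCount: 2})
	fmt.Println(checkRoundTrip(seed), checkRoundTrip([]byte("junk")))
}
```

In a fuzz_test.go file, a check like this would typically run inside f.Fuzz(func(t *testing.T, data []byte) { ... }) and call t.Fatal on a non-nil error.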

C harnesses require llama.cpp source

The libFuzzer and AFL++ harnesses link against llama.cpp's static libraries (libggml.a, libllama.a, and associated backend libraries). Set LLAMA_CPP to your local clone path:

make harness-libfuzzer LLAMA_CPP=~/src/llama.cpp

The Go native harness has no external dependencies and works out of the box.

Target parsers

Full workflow (targets/ + triage)

These targets have dedicated Makefiles in targets/<name>/ that handle cloning, building with sanitizer instrumentation, and running campaigns:

| Target | Parser Function | Notes |
|---|---|---|
| llama.cpp | gguf_init_from_file(), grammar engines, RPC | Primary target |
| Ollama | gguf_init_from_file() (vendored) | Bundles its own fork of llama.cpp with custom patches |

See the Fuzzing llama.cpp and Fuzzing Ollama guides for end-to-end workflows.

Harness-only (libFuzzer binaries, no targets/ workflow yet)

These targets have libFuzzer harness source in harness/libfuzzer/ but no automated clone/build workflow in targets/:

| Target | Parser Function | Notes |
|---|---|---|
| whisper.cpp | whisper_init_from_buffer_with_params() | Audio model loader; shares gguf.cpp with llama.cpp |
| stable-diffusion.cpp | ModelLoader::init_from_file() | Multi-format model loader (GGUF, SafeTensors, ckpt) |

Build these manually by cloning the upstream repo and pointing the harness Makefile at the source tree.