Mutation Engine

The mutation engine is the core innovation of Crucible. Rather than treating GGUF files as opaque byte streams, it parses the binary structure and applies targeted mutations to specific fields, reaching deep code paths that generic fuzzers rarely reach.

Why Structure-Aware Beats Generic

Generic fuzzers like AFL and libFuzzer treat inputs as flat byte arrays. For a format like GGUF, this means the vast majority of mutations produce files rejected at the very first check — the 4-byte magic number.

Generic Fuzzer Problem

A random bit-flip has a 99.99% chance of corrupting the magic bytes, version field, or count fields in the header. The target rejects the file immediately, and no deeper code is ever exercised.

Structure-Aware Advantage

Crucible parses the seed into a typed Go struct, mutates specific fields within structural constraints, then re-serializes. Unless a strategy deliberately targets the header, the generated file passes initial parsing and reaches the code paths where real bugs live.

Consider the difference:

Generic Fuzzer

Iteration 1: corrupt magic       → rejected at byte 0
Iteration 2: corrupt magic       → rejected at byte 0
Iteration 3: corrupt version     → rejected at byte 4
Iteration 4: corrupt magic       → rejected at byte 0
...
Iteration 10000: finally valid   → reaches metadata parser

Crucible

Iteration 1: mutate metadata key  → reaches metadata string handler
Iteration 2: corrupt tensor dims  → reaches dimension validation
Iteration 3: poison alignment     → reaches padding calculator
Iteration 4: overflow dim product → reaches allocation logic
...
Every iteration reaches deep code paths
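
The difference can be illustrated with a minimal sketch. It assumes a simplified `Header` type (hypothetical, not Crucible's actual `gguf.File`) that covers only the fixed GGUF header: a 4-byte magic, a `uint32` version, and two `uint64` counts. Mutating a field on the struct and re-serializing leaves the magic bytes untouched, which is the property a byte-level mutator cannot guarantee.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Header is a hypothetical stand-in for the fixed part of a GGUF header.
// It is an illustration only and not Crucible's real type.
type Header struct {
	Magic           [4]byte
	Version         uint32
	TensorCount     uint64
	MetadataKVCount uint64
}

// Serialize encodes the header as little-endian bytes, field by field.
func (h Header) Serialize() []byte {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, h); err != nil {
		panic(err) // cannot happen for a fixed-size struct
	}
	return buf.Bytes()
}

func main() {
	h := Header{Magic: [4]byte{'G', 'G', 'U', 'F'}, Version: 3, TensorCount: 2, MetadataKVCount: 5}

	// Field-level mutation: only the tensor count changes.
	h.TensorCount = ^uint64(0)

	out := h.Serialize()
	fmt.Printf("%q %d\n", out[:4], len(out)) // prints "GGUF" 24
}
```

Because the mutation happens on the typed value rather than on raw bytes, the serializer decides where every byte lands, so a strategy can break one field without disturbing its neighbours.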

Mutation Pipeline

Each fuzzing iteration follows this pipeline:

flowchart LR
    A[Parse Seed] --> B[Select 1-3\nMutations]
    B --> C[Weighted Category\nSelection]
    C --> D[Strategy\nSelection]
    D --> E[Apply to\nGGUF Struct]
    E --> F[Serialize\nto Bytes]
    F --> G[Write to\nTarget]

Parse Seed — Load and deserialize a .gguf file from the corpus into a *gguf.File struct
Select Mutation Count — Randomly choose 1 to 3 mutations to apply per iteration
Weighted Category Selection — Pick a mutation category using the distribution below
Strategy Selection — Uniformly select a specific strategy within that category
Apply Mutation — Call Strategy.Mutate(*gguf.File, *rand.Rand) to modify the struct in place
Serialize — Re-encode the mutated struct back to valid GGUF binary
Write — Pass the bytes to the target binary via stdin or temp file
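
The middle steps of this pipeline (choosing a mutation count, choosing a strategy, applying it in place) can be sketched as follows. Everything here is an assumption made for illustration: `File` is a two-field stand-in for `*gguf.File`, `mutation` is a plain function wrapper rather than Crucible's strategy interface, and category weighting is left out.

```go
package main

import (
	"fmt"
	"math/rand"
)

// File is a hypothetical, heavily simplified stand-in for *gguf.File.
type File struct {
	Version     uint32
	TensorCount uint64
}

// mutation is a hypothetical named mutation used only in this sketch.
type mutation struct {
	name  string
	apply func(f *File, rng *rand.Rand)
}

// applyRandom applies between 1 and 3 mutations, each drawn uniformly
// from pool, and returns their names in application order.
func applyRandom(f *File, pool []mutation, rng *rand.Rand) []string {
	count := 1 + rng.Intn(3) // Intn(3) yields 0..2, so count is 1..3
	names := make([]string, 0, count)
	for i := 0; i < count; i++ {
		m := pool[rng.Intn(len(pool))]
		m.apply(f, rng)
		names = append(names, m.name)
	}
	return names
}

// demoPool returns two illustrative mutations named after header strategies.
func demoPool() []mutation {
	return []mutation{
		{"header.version", func(f *File, _ *rand.Rand) { f.Version = 999 }},
		{"header.tensor_count", func(f *File, _ *rand.Rand) { f.TensorCount = 0 }},
	}
}

// runOnce performs one iteration on a fresh File with a seeded RNG.
func runOnce(seed int64) ([]string, File) {
	rng := rand.New(rand.NewSource(seed))
	f := File{Version: 3, TensorCount: 2}
	names := applyRandom(&f, demoPool(), rng)
	return names, f
}

func main() {
	names, f := runOnce(42)
	fmt.Println(names, f)
}
```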

Category Weights

Mutation categories are weighted based on historical CVE density and code path coverage:

pie title Mutation Category Weights
    "Metadata + Model-Loader (35%)" : 35
    "TensorInfo (35%)" : 35
    "Header (10%)" : 10
    "Consistency (10%)" : 10
    "Alignment (5%)" : 5
    "Data (5%)" : 5

Why 70% Metadata + TensorInfo

The metadata and tensor info sections receive 70% of the mutation budget. Analysis of historical CVEs in llama.cpp and related parsers shows these sections contain the most bug-dense code:

Metadata parsing involves variable-length strings, nested arrays, and type dispatch — classic sources of buffer overflows and type confusion
Tensor info parsing involves dimension arithmetic (multiplication of multiple uint64 values), offset calculations, and memory allocation sizing — classic sources of integer overflows
Model-loader targeting (5 strategies in model_loader.go, weighted under the Metadata category) fuzzes architecture dispatch, hyperparameter handling, and tensor name schemas — exercising the llama_model_load() path that runs after GGUF parsing

CVE Evidence

The majority of GGUF-related vulnerabilities discovered by Cisco Talos, Trail of Bits, and independent researchers have been in metadata string handling, tensor dimension validation, and alignment calculation code paths.
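
One common way to implement a weighted pick like the distribution in this section is a cumulative-weight walk. The sketch below uses the percentages from the chart; the category names as strings and the function names `pick` and `pickCategory` are illustrative assumptions, not Crucible's API.

```go
package main

import (
	"fmt"
	"math/rand"
)

// weighted pairs a category name with its share of the mutation budget.
type weighted struct {
	name   string
	weight int
}

// categories mirrors the percentages in the category weight chart.
var categories = []weighted{
	{"metadata", 35}, {"tensorinfo", 35}, {"header", 10},
	{"consistency", 10}, {"alignment", 5}, {"data", 5},
}

// totalWeight sums all category weights (100 for the table above).
func totalWeight() int {
	total := 0
	for _, c := range categories {
		total += c.weight
	}
	return total
}

// pick maps a roll in [0, totalWeight()) to a category by walking the
// cumulative weights.
func pick(roll int) string {
	for _, c := range categories {
		if roll < c.weight {
			return c.name
		}
		roll -= c.weight
	}
	panic("roll out of range")
}

// pickCategory draws a category using the supplied RNG.
func pickCategory(rng *rand.Rand) string {
	return pick(rng.Intn(totalWeight()))
}

func main() {
	rng := rand.New(rand.NewSource(1))
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickCategory(rng)]++
	}
	fmt.Println(counts) // counts should land near the configured shares
}
```

Separating `pick` from the random draw keeps the mapping itself deterministic and easy to check at the boundaries between categories.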

Strategy Categories

Crucible implements 46 strategies across 6 weighted categories (7 strategy files). Every strategy implements the same interface:

Strategy Interface

type Strategy interface {
    // Name returns a human-readable name for logging and crash reports.
    Name() string
    // Mutate modifies the GGUF file struct in place, using the provided RNG for determinism.
    Mutate(f *gguf.File, rng *rand.Rand)
}
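
As an illustration of what an implementation might look like, here is a sketch of a strategy in the spirit of `header.version`. The `File` type is a one-field stand-in for `gguf.File`, and `VersionStrategy` and `mutateWithSeed` are hypothetical names; only the interface shape and the candidate values (0, 1, 999, `UINT32_MAX`) come from this page.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// File is a hypothetical stand-in for gguf.File with a single field.
type File struct {
	Version uint32
}

// Strategy restates the interface against the stand-in File type.
type Strategy interface {
	Name() string
	Mutate(f *File, rng *rand.Rand)
}

// VersionStrategy is an illustrative strategy that replaces the version
// with one of a fixed set of suspicious values.
type VersionStrategy struct{}

// Name returns the identifier used in logs and crash reports.
func (VersionStrategy) Name() string { return "header.version" }

// Mutate picks one candidate using only the supplied RNG, so the same
// seed always yields the same choice.
func (VersionStrategy) Mutate(f *File, rng *rand.Rand) {
	candidates := []uint32{0, 1, 999, math.MaxUint32}
	f.Version = candidates[rng.Intn(len(candidates))]
}

// mutateWithSeed applies s to a fresh File using an RNG built from seed.
func mutateWithSeed(s Strategy, seed int64) File {
	f := File{Version: 3}
	s.Mutate(&f, rand.New(rand.NewSource(seed)))
	return f
}

func main() {
	var s Strategy = VersionStrategy{}
	fmt.Println(s.Name(), mutateWithSeed(s, 7).Version)
}
```

Taking the RNG as a parameter, rather than reading a global source, is what lets a run be replayed from its seed.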

Header Strategies

Strategy	What It Does
`header.magic_corrupt`	Partial magic corruption (keep 1-2 valid bytes)
`header.version`	Set version to 0, 1, 999, or `UINT32_MAX`
`header.tensor_count`	Set `tensor_count` to 0 while tensors remain
`header.metadata_kv_count`	Set `metadata_kv_count` to `UINT64_MAX`
`header.version_mismatch`	Set version that disagrees with field sizes used

Metadata Strategies

Strategy	What It Does
`metadata.key_length`	Empty keys, 1MB+ keys, embedded null bytes
`metadata.key_content`	Non-UTF8 sequences, null sleds, path traversal strings, surrogate pairs
`metadata.key_shadow`	Duplicate keys like `general.architecture` with conflicting value types
`metadata.value_type`	Invalid enum values (14+, `UINT32_MAX`), type confusion between similar types
`metadata.deep_array`	Create arrays nested to extreme depth
`metadata.array`	Empty arrays, nested arrays, element type mismatch, large arrays (100K+ elements)
`metadata.string_value`	Empty strings, 10MB strings, embedded nulls, non-UTF8
`metadata.invalid_utf8`	Inject non-UTF-8 byte sequences in string values
`metadata.alignment_poison`	Set `general.alignment` to 0, 1, 3, 7, `UINT32_MAX`
`metadata.reorder`	Randomize the order of metadata key-value pairs
`metadata.add_extra`	Inject 50-250 extra KV pairs with random types and large values
`metadata.int_overflow`	`UINT32_MAX`, `UINT64_MAX`, `INT64_MIN` in integer fields
`metadata.string_truncated`	Declared string length exceeds actual bytes available
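
To make the last row concrete: a GGUF string is a little-endian `uint64` length followed by that many bytes. The sketch below writes a length prefix that can disagree with the payload, alongside a bounds-checked reader that rejects it. The function names `encodeString` and `decodeString` are assumptions for illustration, not Crucible or llama.cpp functions.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeString writes a GGUF-style string: a little-endian uint64 length
// followed by the bytes. declaredLen is written as given, so it may
// disagree with len(s).
func encodeString(s string, declaredLen uint64) []byte {
	out := make([]byte, 8, 8+len(s))
	binary.LittleEndian.PutUint64(out, declaredLen)
	return append(out, s...)
}

// decodeString reads one string and refuses a declared length that
// exceeds the bytes actually available.
func decodeString(b []byte) (string, error) {
	if len(b) < 8 {
		return "", errors.New("short length prefix")
	}
	n := binary.LittleEndian.Uint64(b)
	if n > uint64(len(b)-8) {
		return "", errors.New("declared length exceeds available bytes")
	}
	return string(b[8 : 8+n]), nil
}

func main() {
	ok := encodeString("llama", 5)
	bad := encodeString("llama", 1<<20) // claims 1 MiB, carries 5 bytes

	fmt.Println(decodeString(ok))
	fmt.Println(decodeString(bad))
}
```

A reader that skips the comparison against the remaining input is the kind of target this strategy is meant to expose.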

TensorInfo Strategies

Strategy	What It Does
`tensorinfo.n_dims`	Set `n_dims` to 0, 5+, `UINT32_MAX` (spec allows 1-4)
`tensorinfo.dim_overflow`	Set individual dimension values to 0 or `UINT64_MAX`
`tensorinfo.type`	Invalid `ggml_type` enum values (5, 15, 255, `UINT32_MAX`)
`tensorinfo.offset`	Set offset beyond file size, `UINT64_MAX`, overlapping
`tensorinfo.name`	Empty names, 1MB names, embedded nulls, non-UTF8, duplicates
`tensorinfo.dim_product_overflow`	Dimension values whose product overflows `uint64`
`tensorinfo.name_collision`	Give two tensors the same name
`tensorinfo.offset_wraparound`	Offset + size wraps `uint64`, bypassing bounds checks
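
The wraparound in the last row can be shown with a small sketch. It contrasts a naive bounds check with one that detects the carry using `bits.Add64` from Go's standard library. The function names and the 4096-byte file size are illustrative assumptions; real parsers are written in C/C++, and Go is used here only to demonstrate the arithmetic.

```go
package main

import (
	"fmt"
	"math/bits"
)

// naiveInBounds is the flawed check: offset+size can wrap past 2^64 and
// come back as a small number that passes the comparison.
func naiveInBounds(offset, size, fileSize uint64) bool {
	return offset+size <= fileSize
}

// safeInBounds rejects the input if the addition carried out of 64 bits.
func safeInBounds(offset, size, fileSize uint64) bool {
	end, carry := bits.Add64(offset, size, 0)
	return carry == 0 && end <= fileSize
}

func main() {
	const fileSize = 4096
	offset := ^uint64(0) - 15 // 2^64 - 16
	size := uint64(32)        // offset + size wraps to 16

	fmt.Println(naiveInBounds(offset, size, fileSize)) // true
	fmt.Println(safeInBounds(offset, size, fileSize))  // false
}
```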

Model-Loader Strategies

These target the model-loading path (llama_model_load) rather than raw GGUF parsing. They are registered under the Metadata category for weighting purposes.

Strategy	What It Does
`model.architecture`	Set `general.architecture` to bogus/unknown/empty values
`model.hyperparam`	Overflow hyperparameter keys (embedding_length, head_count, etc.) with `UINT32_MAX`
`model.vocab`	Mutate tokenizer keys (model, bos/eos/pad token IDs) with invalid values
`model.layer_count`	Set `block_count` to extreme values (0, `UINT32_MAX`)
`model.tensor_name_schema`	Corrupt tensor names to break the name → layer mapping

Alignment Strategies

Strategy	What It Does
`alignment.padding`	Set alignment to 0, prime numbers, `UINT32_MAX`, OS page size
`alignment.extra_padding`	Insert random non-zero bytes before tensor data section
`alignment.missing_padding`	Metadata claims alignment but padding bytes are absent

Data Strategies

Strategy	What It Does
`data.truncate`	Truncate data section mid-tensor
`data.overlap`	Multiple tensors pointing to the same offset
`data.zero_length`	Empty data section with non-zero tensor count
`data.shorter`	Data section shorter than the sum of all tensor sizes
`data.garbage_fill`	Fill data section with random bytes
`data.nan_inf`	Inject NaN and Infinity values into tensor data

Consistency Strategies

Strategy	What It Does
`consistency.tensor_count`	Make `tensor_count` disagree with actual tensor entries
`consistency.metadata_count`	Make `metadata_kv_count` disagree with actual pairs
`consistency.offset_beyond`	Tensor offset + tensor data size > total file size
`consistency.tensor_size`	Dimensions claim X bytes but actual data region is Y bytes
`consistency.duplicate_offset`	Multiple tensors claim the same offset range
`consistency.alignment_disagree`	Metadata alignment value != actual file padding alignment

Critical Bug Patterns

The strategies above are designed to trigger specific classes of vulnerabilities:

Integer Overflow in Dimension Products

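When tensor dimensions are multiplied to compute n_elements, large values cause silent overflow in C/C++. A tensor with dimensions [2^32, 2^32] has a true product of 2^64, which wraps to 0 in a uint64, so the allocation size is tiny, but the parser then writes far more data than allocated. Not every large value has this effect: [UINT64_MAX, 2] wraps to UINT64_MAX - 1, which is still huge, so the dimensions must be chosen so that the wrapped product is small.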

Strategies: HugeDimension, ExcessiveDims
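
The wraparound can be checked with a short sketch. It multiplies dimensions with wrapping 64-bit arithmetic and uses `bits.Mul64` from Go's standard library to detect when the true product no longer fits. The function name `product` is an assumption for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// product multiplies dims with wrapping uint64 arithmetic, as an unchecked
// C/C++ parser would, and also reports whether any step overflowed.
func product(dims []uint64) (n uint64, overflowed bool) {
	n = 1
	for _, d := range dims {
		hi, lo := bits.Mul64(n, d) // full 128-bit product as (hi, lo)
		if hi != 0 {
			overflowed = true
		}
		n = lo // keep only the low 64 bits, i.e. the wrapped value
	}
	return n, overflowed
}

func main() {
	fmt.Println(product([]uint64{4096, 4096}))       // 16777216 false
	fmt.Println(product([]uint64{1 << 32, 1 << 32})) // 0 true
}
```

A parser that sizes its allocation from the wrapped value while trusting the individual dimensions elsewhere is the target of this bug class.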

Type Confusion

Writing a value as one type but tagging it as another causes the parser to interpret raw bytes incorrectly. GGUF strings start with a 64-bit length, so a 4-byte float tagged as a string causes the parser to read the float bits, together with the four bytes that follow, as a string length, leading to out-of-bounds reads.

Strategies: WrongValueType, TypeConfusion
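
A small sketch shows how large the misread length can be. It assumes the GGUF convention of a little-endian `uint64` string length; the function name `confusedLength` and the choice of trailing bytes are illustrative.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// confusedLength returns the string length a reader would decode if a
// float32 payload, followed by the four bytes in next, were tagged as a
// string and its first eight bytes were read as a uint64 length.
func confusedLength(v float32, next [4]byte) uint64 {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[:4], math.Float32bits(v))
	copy(buf[4:], next[:])
	return binary.LittleEndian.Uint64(buf)
}

func main() {
	// 1.0 has the bit pattern 0x3F800000, so even with zero bytes after
	// it the reader sees a length of roughly 1 GB.
	fmt.Println(confusedLength(1.0, [4]byte{})) // 1065353216
}
```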

Alignment Poisoning

The general.alignment metadata key controls padding calculations. Setting it to 0 causes division-by-zero in offset % alignment. Setting it to a massive value causes the serializer to attempt allocating gigabytes of padding.

Strategies: ZeroAlignment, HugeAlignment
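
The padding arithmetic at stake looks roughly like the following sketch, which rounds an offset up to the next multiple of the alignment and guards against zero. The function name `padding` is an assumption; the point is the `offset % alignment` term, which faults when the alignment is 0 and is not validated first.

```go
package main

import (
	"errors"
	"fmt"
)

// padding returns the number of bytes needed to round offset up to the
// next multiple of alignment. A zero alignment is rejected up front
// because offset % 0 would panic in Go (and is undefined in C/C++).
func padding(offset, alignment uint64) (uint64, error) {
	if alignment == 0 {
		return 0, errors.New("alignment must be non-zero")
	}
	return (alignment - offset%alignment) % alignment, nil
}

func main() {
	fmt.Println(padding(100, 32)) // 28 <nil>
	fmt.Println(padding(64, 32))  // 0 <nil>
	fmt.Println(padding(100, 0))  // 0 alignment must be non-zero
}
```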

Deterministic Reproduction

Every fuzzing run is seeded with a 64-bit integer from crypto/rand. This seed is:

Logged at the start of each run
Recorded in every crash report
Sufficient to reproduce the exact sequence of mutations

# Reproduce a crash with a known seed
crucible generate --seed 8827361950234 --count 1 --corpus ./corpus

Reproducibility Guarantee

Given the same RNG seed, corpus, and Crucible version, the fuzzer produces byte-identical output files. This makes every crash trivially reproducible for debugging and CVE reporting.
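
The mechanism behind this guarantee can be sketched with Go's standard library: draw a seed once from `crypto/rand`, then drive every choice from a `math/rand` generator built from that seed. The names `newSeed` and `mutationPlan` are hypothetical and the plan is just a list of strategy indices, not Crucible's actual output.

```go
package main

import (
	crand "crypto/rand"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// newSeed draws a 64-bit seed from the operating system's CSPRNG.
func newSeed() int64 {
	var b [8]byte
	if _, err := crand.Read(b[:]); err != nil {
		panic(err)
	}
	return int64(binary.LittleEndian.Uint64(b[:]))
}

// mutationPlan derives a sequence of strategy indices from seed alone.
// The same seed, step count, and strategy count give the same plan.
func mutationPlan(seed int64, steps, numStrategies int) []int {
	rng := rand.New(rand.NewSource(seed))
	plan := make([]int, steps)
	for i := range plan {
		plan[i] = rng.Intn(numStrategies)
	}
	return plan
}

func main() {
	seed := newSeed()
	fmt.Println("seed:", seed) // log this so the run can be replayed

	first := mutationPlan(seed, 5, 46)
	replay := mutationPlan(seed, 5, 46)
	fmt.Println(first, replay) // the two plans are identical
}
```

Determinism holds only if every random decision goes through the seeded generator; a single call to a global or time-based source would break the replay.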