Crash Triage

When the fuzzer finds a crash, the work is only half done. Crucible automates the triage pipeline — classifying crash types, deduplicating by stack trace, scoring severity, and generating actionable reports.

Triage Pipeline

flowchart LR
    A[Crash Files] --> B[Classify]
    B --> C[Deduplicate]
    C --> D[Map CWE]
    D --> E[Score]
    E --> F[Generate Reports]
    F --> G[Export SARIF]

Classify — Parse the sanitizer output to determine the crash type
Deduplicate — Hash the stack trace to group identical crashes
Map CWE — Assign a CWE identifier based on the vulnerability class
Score — Assign a CVSS score based on the vulnerability class
Generate Reports — Produce markdown reports with all details needed for CVE filing
Export SARIF — Optionally write SARIF 2.1.0 output for CI/security tooling integration

Crash Classification

Crucible parses AddressSanitizer (ASAN), MemorySanitizer (MSAN), and UndefinedBehaviorSanitizer (UBSAN) output to classify each crash into a known vulnerability type:

Sanitizer Signal	Classification	CVSS Score	Severity
`heap-buffer-overflow`	HeapOverflow	9.8	Critical
`global-buffer-overflow`	GlobalOverflow	9.8	Critical
`heap-use-after-free`	UseAfterFree	9.8	Critical
`double-free`	DoubleFree	9.8	Critical
Integer overflow (UBSAN)	IntegerOverflow	8.8	High
`stack-buffer-overflow`	StackOverflow	7.5	High
`FPE` (divide by zero)	DivByZero	7.5	High
`allocator is returning null`	AllocTooBig	7.5	High
`SEGV` (null pointer)	NullDeref	5.3	Medium
Assertion failure	AssertionFailure	5.3	Medium
`use-after-poison`	UseAfterPoison	5.3	Medium
Timeout	Timeout	5.3	Medium
`out-of-memory`	OOM	5.3	Medium
UBSAN enum load	EnumLoad	3.3	Low
`misaligned-access`	MisalignedAccess	3.3	Low
Unrecognized signal	Unknown	0.0	Unknown

CVSS Scores Are Estimates

The automatic CVSS scores reflect the typical severity of each vulnerability class in a file-parsing context. Actual CVSS scoring depends on the specific attack scenario, deployment context, and exploitability. Always perform manual analysis before including a score in a CVE report.
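The signal-to-classification mapping in the table can be sketched as an ordered substring match over the sanitizer's stderr output. This is a minimal illustration, not Crucible's parser: the `CrashType` string type, the `rules` slice, and the `classify` function are hypothetical names, and only the backticked signals from the table are covered. Treating every `SEGV` as a null dereference is a simplification made for the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// CrashType is a hypothetical stand-in for Crucible's crash type.
type CrashType string

// rule pairs a sanitizer marker with a classification from the table.
type rule struct {
	marker string
	kind   CrashType
}

// rules is checked in order; only markers quoted in the table are listed.
var rules = []rule{
	{"heap-buffer-overflow", "HeapOverflow"},
	{"global-buffer-overflow", "GlobalOverflow"},
	{"stack-buffer-overflow", "StackOverflow"},
	{"heap-use-after-free", "UseAfterFree"},
	{"double-free", "DoubleFree"},
	{"use-after-poison", "UseAfterPoison"},
	{"allocator is returning null", "AllocTooBig"},
	{"out-of-memory", "OOM"},
	{"misaligned-access", "MisalignedAccess"},
	{"FPE", "DivByZero"},
	{"SEGV", "NullDeref"}, // simplification: not every SEGV is a null dereference
}

// classify returns the first matching crash type, or "Unknown".
func classify(stderr string) CrashType {
	for _, r := range rules {
		if strings.Contains(stderr, r.marker) {
			return r.kind
		}
	}
	return "Unknown"
}

func main() {
	out := "==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020"
	fmt.Println(classify(out)) // HeapOverflow
}
```

An ordered slice is used instead of a map so that more specific markers can be placed ahead of generic ones such as `SEGV`.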

Severity Rationale

Critical (9.8)

HeapOverflow, GlobalOverflow, UseAfterFree, and DoubleFree receive the highest scores because they are frequently exploitable for arbitrary code execution. In the context of GGUF parsing, a malicious model file could achieve remote code execution when a user loads it in any tool that uses llama.cpp.

High (7.5-8.8)

StackOverflow, IntegerOverflow, DivByZero, and AllocTooBig are serious but may be harder to exploit depending on stack layout, overflow magnitude, and allocator behavior. Integer overflows in allocation size calculations frequently lead to heap overflows, which is why they score 8.8.

Medium (5.3)

NullDeref, AssertionFailure, UseAfterPoison, Timeout, and OOM typically cause denial of service only. They crash the process but rarely provide a path to code execution. They are still worth reporting: a malicious model that crashes every inference server is a real threat.

Low (3.3)

EnumLoad and MisalignedAccess are undefined behavior per the C++ standard but rarely exploitable. EnumLoad occurs when untrusted values are loaded into C++ enums before range validation. MisalignedAccess causes crashes on strict-alignment architectures (e.g., ARM) but is typically harmless on x86.

Stack Hash Deduplication

A single bug often produces thousands of crash files as the fuzzer continues running. Crucible deduplicates by hashing the crash's stack trace.

How It Works

Parse the ASAN/MSAN/UBSAN output from stderr
Extract the top 5 stack frames (function name, source file, line number)
Normalize by removing memory addresses, build paths, frame numbers, and column numbers
Hash the normalized frames with SHA-256
Group crashes with identical hashes as the same unique bug

Raw ASAN frame:
    #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21

Normalized:
    gguf_init_from_file|gguf.c|387

Final hash input (top 5 frames joined):
    gguf_init_from_file|gguf.c|387
    ggml_tensor_overhead|ggml.c|2104
    ggml_new_tensor_impl|ggml.c|2285
    ggml_new_tensor|ggml.c|2342
    llama_model_load|llama.cpp|4517

Why Top 5 Frames

Using the full stack trace is too specific — minor code changes shift deep frames. Using only the crash point is too broad — different bugs can crash at the same function. The top 5 frames strike the right balance for grouping related crashes while separating distinct root causes.
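The normalize-and-hash steps can be sketched as follows. This is an illustration under stated assumptions rather than Crucible's implementation: `frameRE`, `normalizeFrame`, and `stackHash` are hypothetical names, the regular expression only handles symbolized frames whose function name contains no spaces, and joining frames with a newline before hashing is assumed from the example above. The second frame's address in `main` is made up for the demonstration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"regexp"
	"strings"
)

// frameRE matches symbolized frames: "#N 0xADDR in FUNC PATH:LINE[:COL]".
var frameRE = regexp.MustCompile(`^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+(\S+?):(\d+)(?::\d+)?\s*$`)

// normalizeFrame drops the frame number, address, directory, and column,
// leaving "function|file|line".
func normalizeFrame(line string) (string, bool) {
	m := frameRE.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[1] + "|" + path.Base(m[2]) + "|" + m[3], true
}

// stackHash hashes the first maxFrames normalized frames with SHA-256.
func stackHash(trace string, maxFrames int) string {
	var frames []string
	for _, line := range strings.Split(trace, "\n") {
		if f, ok := normalizeFrame(line); ok {
			frames = append(frames, f)
			if len(frames) == maxFrames {
				break
			}
		}
	}
	sum := sha256.Sum256([]byte(strings.Join(frames, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	trace := `    #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21
    #1 0x55a3c2f20000 in ggml_tensor_overhead /src/llama.cpp/ggml/src/ggml.c:2104:5`
	f, _ := normalizeFrame(strings.Split(trace, "\n")[0])
	fmt.Println(f) // gguf_init_from_file|gguf.c|387
	fmt.Println(stackHash(trace, 5))
}
```

Because addresses and directory prefixes are discarded before hashing, the same crash reproduced under ASLR or from a different build directory yields the same hash.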

Dedup in Practice

crashes/
├── heap_overflow_a1b2c3d4/
│   ├── report.md           ← generated report
│   ├── first.gguf          ← first reproducer found
│   ├── 002.gguf            ← additional reproducers (same hash)
│   └── 003.gguf
├── integer_overflow_e5f6a7b8/
│   ├── report.md
│   └── first.gguf
└── null_deref_c9d0e1f2/
    ├── report.md
    └── first.gguf

Each unique stack hash gets its own directory. The first crash file is preserved as the primary reproducer. Subsequent duplicates are stored but not re-triaged.

Bulk Crash Deduplication

After a long campaign, crash directories can accumulate thousands of duplicate files. The crucible triage dedup command deduplicates an entire directory in one pass:

Full Mode (stack hash)

crucible triage dedup \
  --crashes ./crashes \
  --harness ./crucible-libfuzzer \
  --delete

Replays every crash through the harness, groups by stack hash, and keeps the smallest reproducer per unique bug. Accurate but requires a harness binary and takes longer.

Fast Mode (content fingerprint)

crucible triage dedup \
  --crashes ./crashes \
  --fast \
  --delete

Groups by (file_size, SHA-256 of first 64 bytes) without replaying. No harness is needed, which makes this mode useful for quick cleanup when thousands of near-identical files accumulate. Because only the size and the first 64 bytes are compared, files that differ later in their content fall into the same group.

Both modes default to dry-run (report only). Pass --delete to actually remove duplicates.

See the CLI reference for all flags.

Automatic CVSS Scoring

Each crash type maps to a base CVSS v3.1 vector string:

CVSS Mapping

var cvssVectors = map[CrashType]string{
    HeapOverflow:      "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // (1)!
    GlobalOverflow:    "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // (2)!
    UseAfterFree:      "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // (3)!
    DoubleFree:        "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // (4)!
    IntegerOverflow:   "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // (5)!
    StackOverflow:     "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H",  // (6)!
    DivByZero:         "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H",  // (7)!
    AllocTooBig:       "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H",  // (8)!
    NullDeref:         "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (9)!
    AssertionFailure:  "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (10)!
    UseAfterPoison:    "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (11)!
    Timeout:           "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (12)!
    OOM:               "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (13)!
    EnumLoad:          "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (14)!
    MisalignedAccess:  "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:N/I:N/A:L",  // (15)!
    Unknown:           "",                                                 // (16)!
}

(1) 9.8 Critical: heap corruption enables arbitrary code execution
(2) 9.8 Critical: global buffer overflow enables arbitrary code execution
(3) 9.8 Critical: use-after-free enables arbitrary code execution
(4) 9.8 Critical: double-free enables arbitrary code execution
(5) 8.8 High: integer overflow often leads to heap corruption
(6) 7.5 High: stack smashing can enable code execution
(7) 7.5 High: hardware divide-by-zero causes unconditional crash (x86_64)
(8) 7.5 High: oversized allocation causes denial of service
(9) 5.3 Medium: null dereference causes denial of service
(10) 5.3 Medium: assertion failure causes denial of service
(11) 5.3 Medium: use-after-poison indicates memory safety violation
(12) 5.3 Medium: execution timeout indicates algorithmic complexity issue
(13) 5.3 Medium: out-of-memory indicates unbounded allocation
(14) 3.3 Low: undefined behavior from invalid enum value load
(15) 3.3 Low: misaligned memory access (crashes on strict-alignment architectures)
(16) 0.0 Unknown: unrecognized crash signal

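The attack vector is Network (AV:N) with User Interaction Required (UI:R) because the typical attack scenario is that an attacker publishes a malicious GGUF model and the victim downloads and loads it. Note that UI:R lowers the result of the CVSS v3.1 base-score equations: the five C:H/I:H/A:H vectors above each evaluate to 8.8, and 9.8 corresponds to the same metrics with UI:N. Recompute the score from the vector that is finally reported.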

CWE Classification

Every crash type maps to a CWE (Common Weakness Enumeration) identifier. The CWE is included in generated reports and SARIF output.

Crash Type	CWE	Name
HeapOverflow	CWE-122	Heap-based Buffer Overflow
GlobalOverflow	CWE-120	Buffer Copy without Checking Size of Input
StackOverflow	CWE-121	Stack-based Buffer Overflow
UseAfterFree	CWE-416	Use After Free
DoubleFree	CWE-415	Double Free
NullDeref	CWE-476	NULL Pointer Dereference
IntegerOverflow	CWE-190	Integer Overflow or Wraparound
DivByZero	CWE-369	Divide By Zero
AssertionFailure	CWE-617	Reachable Assertion
OOM	CWE-400	Uncontrolled Resource Consumption
AllocTooBig	CWE-789	Memory Allocation with Excessive Size Value
Timeout	CWE-400	Uncontrolled Resource Consumption
UseAfterPoison	CWE-416	Use After Free
EnumLoad	CWE-843	Access of Resource Using Incompatible Type
MisalignedAccess	CWE-188	Reliance on Data/Memory Layout

The CWEForType() function in pkg/triage performs this mapping. See the API reference for details.

Report Generation

For each unique crash, Crucible generates a structured markdown report containing everything needed to file a CVE or security advisory:

Example: report.md

# Crash Report: HeapOverflow in gguf_init_from_file

| Field | Value |
|-------|-------|
| **Type** | Heap Buffer Overflow |
| **CWE** | CWE-122 — Heap-based Buffer Overflow |
| **Location** | gguf.c:387 |
| **Function** | gguf_init_from_file |
| **CVSS Score** | 9.8 Critical |
| **CVSS Vector** | CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H |
| **Stack Hash** | a1b2c3d4e5f6a7b8 |
| **Target** | gguf |
| **Harness** | ./harness/libfuzzer/crucible-libfuzzer |
| **Minimized** | reports/minimized/crash-a1b2c3d4 |
| **Found** | 2026-04-01T14:23:07Z |
| **Seed** | 8827361950234 |
| **Crucible Version** | 0.1.0 |

## Reproducer

    ./crucible-libfuzzer crashes/heap_overflow_a1b2c3d4/first.gguf

## ASAN Output

    ==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
    READ of size 4 at 0x... thread T0
        #0 gguf_init_from_file gguf.c:387
        #1 ggml_tensor_overhead ggml.c:2104
        ...

## CVE Template

**Title:** Heap buffer overflow in llama.cpp GGUF metadata parsing

**Description:** A heap buffer overflow exists in the GGUF file parser
of llama.cpp. A specially crafted .gguf file can trigger an out-of-bounds
read/write in gguf_init_from_file when parsing metadata values. An attacker
could exploit this by distributing a malicious model file.

**Affected:** llama.cpp (version), Ollama, LM Studio, and any application
using the ggml GGUF parser.

CVE Template

The CVE template section provides a starting point for responsible disclosure. Always verify the details, confirm affected versions, and coordinate with the maintainers before filing.

SARIF Export

Crucible can export triage results in SARIF 2.1.0 (Static Analysis Results Interchange Format), a standard consumed by GitHub Code Scanning, VS Code SARIF Viewer, and other security tools.

crucible-triage --crashes ./crashes --output ./reports --sarif ./results.sarif

The SARIF file contains:

Tool metadata — Crucible name and version as the analysis tool
Rules — One rule per crash type (e.g., heap-buffer-overflow), with full description and CWE reference
Results — One result per unique crash, with:
- Source location (file path, line number)
- Severity level (error, warning, note) mapped from CVSS score
- Fingerprints: crucible/stackHash/v1 (stack hash) and crucible/target (target surface) for cross-run deduplication
- CWE reference for the crash type
Rule tags — Each rule includes security, CWE ID, and target/<surface> tags

flowchart LR
    A[Crash Reports] --> B[WriteSARIF]
    B --> C[results.sarif]
    C --> D[GitHub Code Scanning]
    C --> E[VS Code SARIF Viewer]
    C --> F[Other SARIF Consumers]

Integration with the Fuzzer

Crash triage runs automatically during fuzzing. When the target process exits with a non-zero status, the fuzzer checks for sanitizer output and handles the crash as follows:

flowchart TD
    A[Target Process Exits] --> B{Exit Code != 0?}
    B -->|No| C[Continue Fuzzing]
    B -->|Yes| D{Sanitizer Output?}
    D -->|No| E[Log as Timeout/Unknown]
    D -->|Yes| F[Parse Crash Type]
    F --> G[Compute Stack Hash]
    G --> H{Hash Seen Before?}
    H -->|Yes| I[Store as Duplicate]
    H -->|No| J[Create Report Directory]
    J --> K[Generate Report]
    K --> L[Log New Unique Crash]
    L --> C
    I --> C

The fuzzer tracks unique crash count in real time. The summary printed at the end of a run includes:

Total crashes found
Unique crashes (by stack hash)
Breakdown by crash type and severity
Paths to all generated reports