Crash Triage¶
When the fuzzer finds a crash, the work is only half done. Crucible automates the triage pipeline — classifying crash types, deduplicating by stack trace, scoring severity, and generating actionable reports.
Triage Pipeline¶
flowchart LR
A[Crash Files] --> B[Classify]
B --> C[Deduplicate]
C --> D[Map CWE]
D --> E[Score]
E --> F[Generate Reports]
F --> G[Export SARIF]
- Classify — Parse the sanitizer output to determine the crash type
- Deduplicate — Hash the stack trace to group identical crashes
- Map CWE — Assign a CWE identifier based on the vulnerability class
- Score — Assign a CVSS score based on the vulnerability class
- Generate Reports — Produce markdown reports with all details needed for CVE filing
- Export SARIF — Optionally write SARIF 2.1.0 output for CI/security tooling integration
Crash Classification¶
Crucible parses AddressSanitizer (ASAN), MemorySanitizer (MSAN), and UndefinedBehaviorSanitizer (UBSAN) output to classify each crash into a known vulnerability type:
| Sanitizer Signal | Classification | CVSS Score | Severity |
|---|---|---|---|
| heap-buffer-overflow | HeapOverflow | 9.8 | Critical |
| global-buffer-overflow | GlobalOverflow | 9.8 | Critical |
| heap-use-after-free | UseAfterFree | 9.8 | Critical |
| double-free | DoubleFree | 9.8 | Critical |
| Integer overflow (UBSAN) | IntegerOverflow | 8.8 | High |
| stack-buffer-overflow | StackOverflow | 7.5 | High |
| FPE (divide by zero) | DivByZero | 7.5 | High |
| allocator is returning null | AllocTooBig | 7.5 | High |
| SEGV (null pointer) | NullDeref | 5.3 | Medium |
| Assertion failure | AssertionFailure | 5.3 | Medium |
| use-after-poison | UseAfterPoison | 5.3 | Medium |
| Timeout | Timeout | 5.3 | Medium |
| out-of-memory | OOM | 5.3 | Medium |
| UBSAN enum load | EnumLoad | 3.3 | Low |
| misaligned-access | MisalignedAccess | 3.3 | Low |
| Unrecognized signal | Unknown | 0.0 | Unknown |
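The signal-to-classification lookup in the table can be pictured as an ordered substring match over the sanitizer's stderr. The Go sketch below is illustrative only: `classify`, `signalRule`, and the rule order are assumptions rather than Crucible's actual parser, and it covers only the rows whose signal text appears literally in the table.

```go
package main

import (
	"fmt"
	"strings"
)

// CrashType mirrors the classification names used in the table.
type CrashType string

// signalRule pairs a substring of sanitizer output with a classification.
// Hypothetical helper type, not taken from Crucible's source.
type signalRule struct {
	needle string
	kind   CrashType
}

// rules are checked in order; the first matching substring wins.
var rules = []signalRule{
	{"heap-buffer-overflow", "HeapOverflow"},
	{"global-buffer-overflow", "GlobalOverflow"},
	{"stack-buffer-overflow", "StackOverflow"},
	{"heap-use-after-free", "UseAfterFree"},
	{"double-free", "DoubleFree"},
	{"use-after-poison", "UseAfterPoison"},
	{"allocator is returning null", "AllocTooBig"},
	{"out-of-memory", "OOM"},
	{"misaligned-access", "MisalignedAccess"},
}

// classify returns the first matching classification, or "Unknown"
// when no known signal text is present.
func classify(stderr string) CrashType {
	for _, r := range rules {
		if strings.Contains(stderr, r.needle) {
			return r.kind
		}
	}
	return "Unknown"
}

func main() {
	out := "==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address ..."
	fmt.Println(classify(out)) // prints "HeapOverflow"
}
```

Rows such as integer overflow, null dereference, and timeout need more than a fixed substring (for example, inspecting the faulting address or the exit reason), so they are left out of this sketch.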
CVSS Scores Are Estimates
The automatic CVSS scores reflect the typical severity of each vulnerability class in a file-parsing context. Actual CVSS scoring depends on the specific attack scenario, deployment context, and exploitability. Always perform manual analysis before including a score in a CVE report.
Severity Rationale¶
HeapOverflow, GlobalOverflow, UseAfterFree, and DoubleFree receive the highest scores because they are reliably exploitable for arbitrary code execution. In the context of GGUF parsing, a malicious model file could achieve remote code execution when a user loads it in any tool that uses llama.cpp.
StackOverflow, IntegerOverflow, DivByZero, and AllocTooBig are serious but may be harder to exploit depending on stack layout, overflow magnitude, and allocator behavior. Integer overflows in allocation size calculations frequently lead to heap overflows, which is why they score 8.8.
NullDeref, AssertionFailure, UseAfterPoison, Timeout, and OOM typically cause denial of service only. They crash the process but rarely provide a path to code execution. They are still worth reporting, because a malicious model that crashes every inference server is a real threat.
EnumLoad and MisalignedAccess are undefined behavior per the C++ standard but rarely exploitable. EnumLoad occurs when untrusted values are loaded into C++ enums before range validation. MisalignedAccess causes crashes on strict-alignment architectures (e.g., ARM) but is typically harmless on x86.
Stack Hash Deduplication¶
A single bug often produces thousands of crash files as the fuzzer continues running. Crucible deduplicates by hashing the crash's stack trace.
How It Works¶
- Parse the ASAN/MSAN/UBSAN output from stderr
- Extract the top 5 stack frames (function name, source file, line number)
- Normalize by removing memory addresses, build paths, and frame numbers
- Hash the normalized frames with SHA-256
- Group crashes with identical hashes as the same unique bug
Raw ASAN frame:
#0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21
Normalized:
gguf_init_from_file|gguf.c|387
Final hash input (top 5 frames joined):
gguf_init_from_file|gguf.c|387
ggml_tensor_overhead|ggml.c|2104
ggml_new_tensor_impl|ggml.c|2285
ggml_new_tensor|ggml.c|2342
llama_model_load|llama.cpp|4517
Why Top 5 Frames
Using the full stack trace is too specific — minor code changes shift deep frames. Using only the crash point is too broad — different bugs can crash at the same function. The top 5 frames strike the right balance for grouping related crashes while separating distinct root causes.
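The normalize-then-hash steps above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not Crucible's implementation: the regular expression, the helper names `normalizeFrames` and `stackHash`, and the newline used to join frames are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"regexp"
	"strings"
)

// frameRE matches sanitizer frames of the form
//   #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/.../gguf.c:387:21
// and captures the function name, source path, and line number.
// Function names containing spaces (some C++ signatures) are not handled.
var frameRE = regexp.MustCompile(`#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+([^\s:]+):(\d+)`)

// normalizeFrames returns up to max frames as "function|file|line",
// dropping the frame number, the address, the build path, and the column.
func normalizeFrames(trace string, max int) []string {
	var frames []string
	for _, m := range frameRE.FindAllStringSubmatch(trace, max) {
		frames = append(frames, m[1]+"|"+path.Base(m[2])+"|"+m[3])
	}
	return frames
}

// stackHash hashes the top 5 normalized frames with SHA-256.
// Joining with "\n" is an assumption made for this sketch.
func stackHash(trace string) string {
	joined := strings.Join(normalizeFrames(trace, 5), "\n")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:])
}

func main() {
	raw := "    #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21"
	fmt.Println(normalizeFrames(raw, 5)) // prints [gguf_init_from_file|gguf.c|387]
	fmt.Println(stackHash(raw))
}
```

Because addresses and build paths are stripped before hashing, the same crash reproduced on another machine or under ASLR yields the same hash.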
Dedup in Practice¶
crashes/
├── heap_overflow_a1b2c3d4/
│ ├── report.md ← generated report
│ ├── first.gguf ← first reproducer found
│ ├── 002.gguf ← additional reproducers (same hash)
│ └── 003.gguf
├── integer_overflow_e5f6a7b8/
│ ├── report.md
│ └── first.gguf
└── null_deref_c9d0e1f2/
    ├── report.md
    └── first.gguf
Each unique stack hash gets its own directory. The first crash file is preserved as the primary reproducer. Subsequent duplicates are stored but not re-triaged.
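One way to picture this layout is an in-memory index keyed by stack hash that decides where each incoming crash file goes. The sketch below only assumes the naming visible in the tree above (a type prefix plus the first 8 hex characters of the hash, `first.gguf`, then `002.gguf`, `003.gguf`); `dedupIndex` and `place` are hypothetical names, and the real naming logic may differ.

```go
package main

import "fmt"

// dedupIndex counts how many reproducers have been stored per stack hash.
type dedupIndex struct {
	count map[string]int
}

func newDedupIndex() *dedupIndex {
	return &dedupIndex{count: map[string]int{}}
}

// place returns the relative path for a crash file and reports whether
// the hash is new (and therefore needs a report to be generated).
func (d *dedupIndex) place(prefix, hash string) (string, bool) {
	short := hash
	if len(short) > 8 {
		short = short[:8] // directory names in the tree use 8 hex characters
	}
	d.count[hash]++
	n := d.count[hash]
	dir := prefix + "_" + short
	if n == 1 {
		return dir + "/first.gguf", true
	}
	return fmt.Sprintf("%s/%03d.gguf", dir, n), false
}

func main() {
	idx := newDedupIndex()
	for i := 0; i < 3; i++ {
		p, isNew := idx.place("heap_overflow", "a1b2c3d4e5f6a7b8")
		fmt.Println(p, isNew)
	}
}
```

Only the call that returns `true` would trigger triage and report generation; later calls simply store another reproducer next to the first.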
Bulk Crash Deduplication¶
After a long campaign, crash directories can accumulate thousands of duplicate files. The crucible triage dedup command deduplicates an entire directory in one pass:
Replays every crash through the harness, groups by stack hash, and keeps the smallest reproducer per unique bug. Accurate but requires a harness binary and takes longer.
Both modes default to dry-run (report only). Pass --delete to actually remove duplicates.
See the CLI reference for all flags.
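The "keep the smallest reproducer per unique bug" rule, combined with a dry-run default, can be sketched as below. This is an assumption-laden illustration: `crashFile`, `planDedup`, and `apply` are hypothetical names, and the stack hash of each file is taken as already computed.

```go
package main

import (
	"fmt"
	"os"
)

// crashFile describes one crash input with a precomputed stack hash.
type crashFile struct {
	Path string
	Hash string
	Size int64
}

// planDedup keeps the smallest file per hash and marks the rest for removal.
// On a size tie, the file seen first is kept. Paths are assumed unique.
func planDedup(files []crashFile) (keep, remove []crashFile) {
	best := map[string]crashFile{}
	for _, f := range files {
		if cur, ok := best[f.Hash]; !ok || f.Size < cur.Size {
			best[f.Hash] = f
		}
	}
	for _, f := range files {
		if best[f.Hash].Path == f.Path {
			keep = append(keep, f)
		} else {
			remove = append(remove, f)
		}
	}
	return keep, remove
}

// apply only reports what would be removed unless deleteFiles is true,
// mirroring the dry-run default described above.
func apply(remove []crashFile, deleteFiles bool) error {
	for _, f := range remove {
		if !deleteFiles {
			fmt.Println("would remove", f.Path)
			continue
		}
		if err := os.Remove(f.Path); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	files := []crashFile{
		{"crashes/a.gguf", "h1", 900},
		{"crashes/b.gguf", "h1", 120},
		{"crashes/c.gguf", "h2", 300},
	}
	keep, remove := planDedup(files)
	fmt.Println(len(keep), "kept,", len(remove), "duplicates")
	_ = apply(remove, false) // dry run: prints "would remove crashes/a.gguf"
}
```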
Automatic CVSS Scoring¶
Each crash type maps to a base CVSS v3.1 vector string:
var cvssVectors = map[CrashType]string{
    HeapOverflow:     "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", // 9.8 Critical: heap corruption enables arbitrary code execution
    GlobalOverflow:   "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", // 9.8 Critical: global buffer overflow enables arbitrary code execution
    UseAfterFree:     "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", // 9.8 Critical: use-after-free enables arbitrary code execution
    DoubleFree:       "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", // 9.8 Critical: double-free enables arbitrary code execution
    IntegerOverflow:  "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H", // 8.8 High: integer overflow often leads to heap corruption
    StackOverflow:    "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H", // 7.5 High: stack smashing can enable code execution
    DivByZero:        "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H", // 7.5 High: hardware divide-by-zero causes unconditional crash (x86_64)
    AllocTooBig:      "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H", // 7.5 High: oversized allocation causes denial of service
    NullDeref:        "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L", // 5.3 Medium: null dereference causes denial of service
    AssertionFailure: "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L", // 5.3 Medium: assertion failure causes denial of service
    UseAfterPoison:   "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L", // 5.3 Medium: use-after-poison indicates memory safety violation
    Timeout:          "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L", // 5.3 Medium: execution timeout indicates algorithmic complexity issue
    OOM:              "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L", // 5.3 Medium: out-of-memory indicates unbounded allocation
    EnumLoad:         "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:N/I:N/A:L", // 3.3 Low: undefined behavior from invalid enum value load
    MisalignedAccess: "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:N/I:N/A:L", // 3.3 Low: misaligned memory access (crashes on strict-alignment architectures)
    Unknown:          "",                                             // 0.0 Unknown: unrecognized crash signal
}
The attack vector is Network with User Interaction Required because the typical attack scenario is: attacker publishes a malicious GGUF model, victim downloads and loads it.
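To check programmatically that a vector string encodes this scenario, the string can be split into its metric abbreviations. The helper below is a minimal sketch with a hypothetical name (`parseVector`); it only parses the `CVSS:3.1/...` text format and does not compute a score.

```go
package main

import (
	"fmt"
	"strings"
)

// parseVector splits a CVSS v3.1 vector string into metric/value pairs,
// e.g. "AV" -> "N". It validates the prefix and the "K:V" shape only.
func parseVector(v string) (map[string]string, error) {
	parts := strings.Split(v, "/")
	if len(parts) < 2 || parts[0] != "CVSS:3.1" {
		return nil, fmt.Errorf("not a CVSS v3.1 vector: %q", v)
	}
	metrics := make(map[string]string, len(parts)-1)
	for _, p := range parts[1:] {
		key, val, ok := strings.Cut(p, ":")
		if !ok || key == "" || val == "" {
			return nil, fmt.Errorf("malformed metric %q", p)
		}
		metrics[key] = val
	}
	return metrics, nil
}

func main() {
	m, err := parseVector("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")
	if err != nil {
		panic(err)
	}
	// AV:N is Attack Vector Network; UI:R is User Interaction Required.
	fmt.Println(m["AV"], m["UI"]) // prints "N R"
}
```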
CWE Classification¶
Every crash type maps to a CWE (Common Weakness Enumeration) identifier. The CWE is included in generated reports and SARIF output.
| Crash Type | CWE | Name |
|---|---|---|
| HeapOverflow | CWE-122 | Heap-based Buffer Overflow |
| GlobalOverflow | CWE-120 | Buffer Copy without Checking Size of Input |
| StackOverflow | CWE-121 | Stack-based Buffer Overflow |
| UseAfterFree | CWE-416 | Use After Free |
| DoubleFree | CWE-415 | Double Free |
| NullDeref | CWE-476 | NULL Pointer Dereference |
| IntegerOverflow | CWE-190 | Integer Overflow or Wraparound |
| DivByZero | CWE-369 | Divide By Zero |
| AssertionFailure | CWE-617 | Reachable Assertion |
| OOM | CWE-400 | Uncontrolled Resource Consumption |
| AllocTooBig | CWE-789 | Memory Allocation with Excessive Size Value |
| Timeout | CWE-400 | Uncontrolled Resource Consumption |
| UseAfterPoison | CWE-416 | Use After Free |
| EnumLoad | CWE-843 | Access of Resource Using Incompatible Type |
| MisalignedAccess | CWE-188 | Reliance on Data/Memory Layout |
The CWEForType() function in pkg/triage performs this mapping. See the API reference for details.
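A lookup of this kind is typically a plain map. The sketch below is not the real `CWEForType()` (its signature is not shown here and may differ); `cweForType`, the `CWE` struct, and the abbreviated table are assumptions chosen to show that several crash types can share one CWE.

```go
package main

import "fmt"

// CrashType mirrors the classification names used in the table.
type CrashType string

// CWE holds an identifier and its name. Hypothetical type for this sketch.
type CWE struct {
	ID   int
	Name string
}

// cweByType is an abbreviated copy of the table above.
var cweByType = map[CrashType]CWE{
	"HeapOverflow":   {122, "Heap-based Buffer Overflow"},
	"UseAfterFree":   {416, "Use After Free"},
	"UseAfterPoison": {416, "Use After Free"},
	"NullDeref":      {476, "NULL Pointer Dereference"},
	"OOM":            {400, "Uncontrolled Resource Consumption"},
	"Timeout":        {400, "Uncontrolled Resource Consumption"},
}

// cweForType returns the mapped CWE and false when the type has no entry.
func cweForType(t CrashType) (CWE, bool) {
	c, ok := cweByType[t]
	return c, ok
}

func main() {
	if c, ok := cweForType("UseAfterPoison"); ok {
		fmt.Printf("CWE-%d: %s\n", c.ID, c.Name) // prints "CWE-416: Use After Free"
	}
}
```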
Report Generation¶
For each unique crash, Crucible generates a structured markdown report containing everything needed to file a CVE or security advisory:
# Crash Report: HeapOverflow in gguf_init_from_file
| Field | Value |
|-------|-------|
| **Type** | Heap Buffer Overflow |
| **CWE** | CWE-122 — Heap-based Buffer Overflow |
| **Location** | gguf.c:387 |
| **Function** | gguf_init_from_file |
| **CVSS Score** | 9.8 Critical |
| **CVSS Vector** | CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H |
| **Stack Hash** | a1b2c3d4e5f6a7b8 |
| **Target** | gguf |
| **Harness** | ./harness/libfuzzer/crucible-libfuzzer |
| **Minimized** | reports/minimized/crash-a1b2c3d4 |
| **Found** | 2026-04-01T14:23:07Z |
| **Seed** | 8827361950234 |
| **Crucible Version** | 0.1.0 |
## Reproducer
./crucible-libfuzzer crashes/heap_overflow_a1b2c3d4/first.gguf
## ASAN Output
==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
READ of size 4 at 0x... thread T0
#0 gguf_init_from_file gguf.c:387
#1 ggml_tensor_overhead ggml.c:2104
...
## CVE Template
**Title:** Heap buffer overflow in llama.cpp GGUF metadata parsing
**Description:** A heap buffer overflow exists in the GGUF file parser
of llama.cpp. A specially crafted .gguf file can trigger an out-of-bounds
read/write in gguf_init_from_file when parsing metadata values. An attacker
could exploit this by distributing a malicious model file.
**Affected:** llama.cpp (version), Ollama, LM Studio, and any application
using the ggml GGUF parser.
CVE Template
The CVE template section provides a starting point for responsible disclosure. Always verify the details, confirm affected versions, and coordinate with the maintainers before filing.
SARIF Export¶
Crucible can export triage results in SARIF 2.1.0 (Static Analysis Results Interchange Format), a standard consumed by GitHub Code Scanning, VS Code SARIF Viewer, and other security tools.
The SARIF file contains:
- Tool metadata — Crucible name and version as the analysis tool
- Rules — One rule per crash type (e.g., heap-buffer-overflow), with full description and CWE reference
- Results — One result per unique crash, with:
    - Source location (file path, line number)
    - Severity level (error, warning, note) mapped from CVSS score
    - Fingerprints: crucible/stackHash/v1 (stack hash) and crucible/target (target surface) for cross-run deduplication
    - CWE reference for the crash type
- Rule tags — Each rule includes security, CWE ID, and target/<surface> tags
flowchart LR
A[Crash Reports] --> B[WriteSARIF]
B --> C[results.sarif]
C --> D[GitHub Code Scanning]
C --> E[VS Code SARIF Viewer]
C --> F[Other SARIF Consumers]
Integration with the Fuzzer¶
Crash triage runs automatically during fuzzing. When the target process exits with a non-zero status and produces sanitizer output:
flowchart TD
A[Target Process Exits] --> B{Exit Code != 0?}
B -->|No| C[Continue Fuzzing]
B -->|Yes| D{Sanitizer Output?}
D -->|No| E[Log as Timeout/Unknown]
D -->|Yes| F[Parse Crash Type]
F --> G[Compute Stack Hash]
G --> H{Hash Seen Before?}
H -->|Yes| I[Store as Duplicate]
H -->|No| J[Create Report Directory]
J --> K[Generate Report]
K --> L[Log New Unique Crash]
L --> C
I --> C
The fuzzer tracks unique crash count in real time. The summary printed at the end of a run includes:
- Total crashes found
- Unique crashes (by stack hash)
- Breakdown by crash type and severity
- Paths to all generated reports