Crash Triage

When the fuzzer finds a crash, the work is only half done. Crucible automates the triage pipeline — classifying crash types, deduplicating by stack trace, scoring severity, and generating actionable reports.

Triage Pipeline

flowchart LR
    A[Crash Files] --> B[Classify]
    B --> C[Deduplicate]
    C --> D[Map CWE]
    D --> E[Score]
    E --> F[Generate Reports]
    F --> G[Export SARIF]
  1. Classify — Parse the sanitizer output to determine the crash type
  2. Deduplicate — Hash the stack trace to group identical crashes
  3. Map CWE — Assign a CWE identifier based on the vulnerability class
  4. Score — Assign a CVSS score based on the vulnerability class
  5. Generate Reports — Produce markdown reports with all details needed for CVE filing
  6. Export SARIF — Optionally write SARIF 2.1.0 output for CI/security tooling integration

Crash Classification

Crucible parses AddressSanitizer (ASAN), MemorySanitizer (MSAN), and UndefinedBehaviorSanitizer (UBSAN) output to classify each crash into a known vulnerability type:

| Sanitizer Signal | Classification | CVSS Score | Severity |
|------------------|----------------|------------|----------|
| heap-buffer-overflow | HeapOverflow | 9.8 | Critical |
| global-buffer-overflow | GlobalOverflow | 9.8 | Critical |
| heap-use-after-free | UseAfterFree | 9.8 | Critical |
| double-free | DoubleFree | 9.8 | Critical |
| Integer overflow (UBSAN) | IntegerOverflow | 8.8 | High |
| stack-buffer-overflow | StackOverflow | 7.5 | High |
| FPE (divide by zero) | DivByZero | 7.5 | High |
| allocator is returning null | AllocTooBig | 7.5 | High |
| SEGV (null pointer) | NullDeref | 5.3 | Medium |
| Assertion failure | AssertionFailure | 5.3 | Medium |
| use-after-poison | UseAfterPoison | 5.3 | Medium |
| Timeout | Timeout | 5.3 | Medium |
| out-of-memory | OOM | 5.3 | Medium |
| UBSAN enum load | EnumLoad | 3.3 | Low |
| misaligned-access | MisalignedAccess | 3.3 | Low |
| Unrecognized signal | Unknown | 0.0 | Unknown |
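
The lookup described by the table can be sketched as a first-match substring scan over the sanitizer output. This is a minimal illustration, not Crucible's actual parser: the names `signalRules` and `classify` are hypothetical, and only the signals that the table spells out verbatim are included.

```go
package main

import (
	"fmt"
	"strings"
)

// CrashType holds one of the classification names from the table above.
type CrashType string

// signalRules pairs a literal sanitizer signal with its classification.
// Only signals the table spells out verbatim are listed; the remaining
// rows (UBSAN, FPE, SEGV, assertions, timeouts) would need their own
// patterns, which this sketch does not guess at.
var signalRules = []struct {
	signal string
	kind   CrashType
}{
	{"heap-buffer-overflow", "HeapOverflow"},
	{"global-buffer-overflow", "GlobalOverflow"},
	{"stack-buffer-overflow", "StackOverflow"},
	{"heap-use-after-free", "UseAfterFree"},
	{"double-free", "DoubleFree"},
	{"use-after-poison", "UseAfterPoison"},
	{"allocator is returning null", "AllocTooBig"},
	{"out-of-memory", "OOM"},
	{"misaligned-access", "MisalignedAccess"},
}

// classify returns the first classification whose signal appears in the
// sanitizer output, or "Unknown" when nothing matches.
func classify(stderr string) CrashType {
	for _, r := range signalRules {
		if strings.Contains(stderr, r.signal) {
			return r.kind
		}
	}
	return "Unknown"
}

func main() {
	out := "==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000010"
	fmt.Println(classify(out))             // HeapOverflow
	fmt.Println(classify("exit status 1")) // Unknown
}
```

A table-driven rule list keeps the mapping easy to audit against the documentation, at the cost of being sensitive to rule order if two signals ever overlap.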

CVSS Scores Are Estimates

The automatic CVSS scores reflect the typical severity of each vulnerability class in a file-parsing context. Actual CVSS scoring depends on the specific attack scenario, deployment context, and exploitability. Always perform manual analysis before including a score in a CVE report.

Severity Rationale

HeapOverflow, GlobalOverflow, UseAfterFree, and DoubleFree receive the highest scores because they corrupt heap or global memory in ways that are frequently exploitable for arbitrary code execution. In the context of GGUF parsing, a malicious model file could achieve remote code execution when a user loads it in any tool that uses llama.cpp.

StackOverflow, IntegerOverflow, DivByZero, and AllocTooBig are serious but may be harder to exploit depending on stack layout, overflow magnitude, and allocator behavior. Integer overflows in allocation size calculations frequently lead to heap overflows, which is why they score 8.8.

NullDeref, AssertionFailure, UseAfterPoison, Timeout, and OOM typically cause denial of service only. They crash the process but rarely provide a path to code execution. They are still worth reporting, because a malicious model that crashes every inference server is a real threat.

EnumLoad and MisalignedAccess are undefined behavior per the C++ standard but rarely exploitable. EnumLoad occurs when untrusted values are loaded into C++ enums before range validation. MisalignedAccess causes crashes on strict-alignment architectures (e.g., ARM) but is typically harmless on x86.
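
The severity labels in the classification table line up with the CVSS v3.1 qualitative rating bands. A minimal sketch of that mapping follows; the function name `severityFor` is hypothetical, and labeling a 0.0 score "Unknown" (CVSS itself calls it "None") simply follows the table.

```go
package main

import "fmt"

// severityFor maps a CVSS score to the label used in the classification
// table. Band edges follow the CVSS v3.1 qualitative rating scale:
// 0.1-3.9 Low, 4.0-6.9 Medium, 7.0-8.9 High, 9.0-10.0 Critical.
func severityFor(score float64) string {
	switch {
	case score >= 9.0:
		return "Critical"
	case score >= 7.0:
		return "High"
	case score >= 4.0:
		return "Medium"
	case score > 0.0:
		return "Low"
	default:
		return "Unknown"
	}
}

func main() {
	for _, s := range []float64{9.8, 8.8, 7.5, 5.3, 3.3, 0.0} {
		fmt.Printf("%.1f %s\n", s, severityFor(s))
	}
}
```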

Stack Hash Deduplication

A single bug often produces thousands of crash files as the fuzzer continues running. Crucible deduplicates by hashing the crash's stack trace.

How It Works

  1. Parse the ASAN/MSAN/UBSAN output from stderr
  2. Extract the top 5 stack frames (function name, source file, line number)
  3. Normalize by removing memory addresses, build paths, column numbers, and frame numbers
  4. Hash the normalized frames with SHA-256
  5. Group crashes with identical hashes as the same unique bug
Raw ASAN frame:
    #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21

Normalized:
    gguf_init_from_file|gguf.c|387

Final hash input (top 5 frames joined):
    gguf_init_from_file|gguf.c|387
    ggml_tensor_overhead|ggml.c|2104
    ggml_new_tensor_impl|ggml.c|2285
    ggml_new_tensor|ggml.c|2342
    llama_model_load|llama.cpp|4517
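
The normalize-and-hash steps above can be sketched in Go as follows. This is an illustration under stated assumptions rather than Crucible's implementation: the regular expression, the helper names `normalizeFrame` and `stackHash`, and the newline used to join frames are all hypothetical, and the pattern assumes symbolized frames whose function names contain no spaces.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"regexp"
	"strings"
)

// maxFrames is the number of leading frames that feed the hash.
const maxFrames = 5

// frameRE matches a symbolized frame such as
//   #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21
// and captures function, file path, and line. The optional column is dropped.
var frameRE = regexp.MustCompile(`^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+(\S+?):(\d+)(?::\d+)?\s*$`)

// normalizeFrame reduces a raw frame to "function|file|line", discarding
// the frame number, address, directory part of the path, and column.
func normalizeFrame(raw string) (string, bool) {
	m := frameRE.FindStringSubmatch(raw)
	if m == nil {
		return "", false
	}
	return m[1] + "|" + path.Base(m[2]) + "|" + m[3], true
}

// stackHash hashes the first maxFrames normalized frames found in the
// sanitizer output and returns the SHA-256 digest as hex.
func stackHash(stderr string) string {
	var frames []string
	for _, line := range strings.Split(stderr, "\n") {
		if f, ok := normalizeFrame(line); ok {
			frames = append(frames, f)
			if len(frames) == maxFrames {
				break
			}
		}
	}
	sum := sha256.Sum256([]byte(strings.Join(frames, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	raw := "    #0 0x55a3c2f1e4b7 in gguf_init_from_file /src/llama.cpp/ggml/src/gguf.c:387:21"
	n, _ := normalizeFrame(raw)
	fmt.Println(n) // gguf_init_from_file|gguf.c|387
	fmt.Println(stackHash(raw))
}
```

Because addresses and build paths are stripped before hashing, the same bug reproduced on a different machine or with ASLR enabled yields the same hash.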

Why Top 5 Frames

Using the full stack trace is too specific — minor code changes shift deep frames. Using only the crash point is too broad — different bugs can crash at the same function. The top 5 frames strike the right balance for grouping related crashes while separating distinct root causes.

Dedup in Practice

crashes/
├── heap_overflow_a1b2c3d4/
│   ├── report.md           ← generated report
│   ├── first.gguf          ← first reproducer found
│   ├── 002.gguf            ← additional reproducers (same hash)
│   └── 003.gguf
├── integer_overflow_e5f6a7b8/
│   ├── report.md
│   └── first.gguf
└── null_deref_c9d0e1f2/
    ├── report.md
    └── first.gguf

Each unique stack hash gets its own directory. The first crash file is preserved as the primary reproducer. Subsequent duplicates are stored but not re-triaged.
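
The directory names in the listing look like the crash type in snake case followed by a hash prefix. A minimal sketch of that naming scheme is shown here; the helper names are hypothetical, and the 8-character prefix length is an assumption read off the listing rather than a documented rule.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// snakeCase converts a CamelCase crash type such as "HeapOverflow" to
// "heap_overflow". An underscore is inserted only where a lowercase
// letter is followed by an uppercase one, so "OOM" becomes "oom".
func snakeCase(s string) string {
	var b strings.Builder
	var prev rune
	for i, r := range s {
		if i > 0 && unicode.IsUpper(r) && unicode.IsLower(prev) {
			b.WriteByte('_')
		}
		b.WriteRune(unicode.ToLower(r))
		prev = r
	}
	return b.String()
}

// crashDir builds "<type>_<hash prefix>". Truncating the hash to its
// first 8 hex characters is an assumption based on the listing.
func crashDir(crashType, stackHash string) string {
	if len(stackHash) > 8 {
		stackHash = stackHash[:8]
	}
	return snakeCase(crashType) + "_" + stackHash
}

func main() {
	fmt.Println(crashDir("HeapOverflow", "a1b2c3d4e5f6a7b8")) // heap_overflow_a1b2c3d4
}
```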

Bulk Crash Deduplication

After a long campaign, crash directories can accumulate thousands of duplicate files. The crucible triage dedup command deduplicates an entire directory in one pass:

crucible triage dedup \
  --crashes ./crashes \
  --harness ./crucible-libfuzzer \
  --delete

With --harness, the command replays every crash through the harness, groups the results by stack hash, and keeps the smallest reproducer per unique bug. This mode is accurate but requires a harness binary and takes longer.

crucible triage dedup \
  --crashes ./crashes \
  --fast \
  --delete

With --fast, the command groups files by (file_size, SHA-256 of first 64 bytes) without replaying them. No harness is needed, which makes this mode useful for quick cleanup when thousands of near-identical files accumulate.
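
The fast grouping key can be sketched as below. The type and function names (`fastKey`, `keyFor`, `groupByKey`) are hypothetical, and the example works on in-memory byte slices instead of files so that it runs on its own.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// prefixLen is how many leading bytes feed the hash.
const prefixLen = 64

// fastKey is the (file_size, SHA-256 of first 64 bytes) pair. It is a
// comparable struct, so it can be used directly as a map key.
type fastKey struct {
	size   int
	prefix [sha256.Size]byte
}

// keyFor computes the fast key; inputs shorter than prefixLen are
// hashed in full.
func keyFor(data []byte) fastKey {
	n := len(data)
	if n > prefixLen {
		n = prefixLen
	}
	return fastKey{size: len(data), prefix: sha256.Sum256(data[:n])}
}

// groupByKey buckets file names by fast key and sorts each bucket.
func groupByKey(files map[string][]byte) map[fastKey][]string {
	groups := make(map[fastKey][]string)
	for name, data := range files {
		k := keyFor(data)
		groups[k] = append(groups[k], name)
	}
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups
}

func main() {
	a := make([]byte, 100)
	b := make([]byte, 100)
	b[99] = 0xff // differs from a only after the 64-byte prefix
	c := make([]byte, 101)
	groups := groupByKey(map[string][]byte{"a.gguf": a, "b.gguf": b, "c.gguf": c})
	fmt.Println(len(groups))       // 2
	fmt.Println(groups[keyFor(a)]) // [a.gguf b.gguf]
}
```

The example also shows the trade-off of this key: two files of equal size that differ only after the first 64 bytes land in the same group, so the fast mode is a heuristic rather than an exact stack-level comparison.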

Both modes default to dry-run (report only). Pass --delete to actually remove duplicates.

See the CLI reference for all flags.

Automatic CVSS Scoring

Each crash type maps to a base CVSS v3.1 vector string:

CVSS Mapping
var cvssVectors = map[CrashType]string{
    HeapOverflow:      "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // 9.8 Critical: heap corruption enables arbitrary code execution
    GlobalOverflow:    "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // 9.8 Critical: global buffer overflow enables arbitrary code execution
    UseAfterFree:      "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // 9.8 Critical: use-after-free enables arbitrary code execution
    DoubleFree:        "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // 9.8 Critical: double-free enables arbitrary code execution
    IntegerOverflow:   "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H",  // 8.8 High: integer overflow often leads to heap corruption
    StackOverflow:     "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H",  // 7.5 High: stack smashing can enable code execution
    DivByZero:         "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H",  // 7.5 High: hardware divide-by-zero causes unconditional crash (x86_64)
    AllocTooBig:       "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H",  // 7.5 High: oversized allocation causes denial of service
    NullDeref:         "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 5.3 Medium: null dereference causes denial of service
    AssertionFailure:  "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 5.3 Medium: assertion failure causes denial of service
    UseAfterPoison:    "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 5.3 Medium: use-after-poison indicates memory safety violation
    Timeout:           "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 5.3 Medium: execution timeout indicates algorithmic complexity issue
    OOM:               "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 5.3 Medium: out-of-memory indicates unbounded allocation
    EnumLoad:          "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 3.3 Low: undefined behavior from invalid enum value load
    MisalignedAccess:  "CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:N/I:N/A:L",  // 3.3 Low: misaligned memory access (crashes on strict-alignment architectures)
    Unknown:           "",                                              // 0.0 Unknown: unrecognized crash signal
}

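The vectors use Attack Vector: Network (AV:N) with User Interaction: Required (UI:R) because the typical attack scenario is that an attacker publishes a malicious GGUF model and the victim downloads and loads it. The per-class scores listed above are estimates, so recompute the base score from the final vector with a CVSS v3.1 calculator before publishing; for example, AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H evaluates to 8.8.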
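
To inspect individual metrics such as AV or UI, a vector string can be split into metric/value pairs. The sketch below is a hypothetical helper (`parseVector` is not part of Crucible as far as this page states); it checks only the `CVSS:3.1` prefix and the `metric:value` shape, not whether each value is legal for its metric.

```go
package main

import (
	"fmt"
	"strings"
)

// parseVector splits a CVSS v3.1 vector string into its metric/value
// pairs, for example "AV" -> "N".
func parseVector(v string) (map[string]string, error) {
	parts := strings.Split(v, "/")
	if parts[0] != "CVSS:3.1" {
		return nil, fmt.Errorf("unsupported vector prefix %q", parts[0])
	}
	metrics := make(map[string]string, len(parts)-1)
	for _, p := range parts[1:] {
		name, value, ok := strings.Cut(p, ":")
		if !ok || name == "" || value == "" {
			return nil, fmt.Errorf("malformed metric %q", p)
		}
		metrics[name] = value
	}
	return metrics, nil
}

func main() {
	m, err := parseVector("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("attack vector:", m["AV"], "user interaction:", m["UI"])
}
```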

CWE Classification

Every crash type maps to a CWE (Common Weakness Enumeration) identifier. The CWE is included in generated reports and SARIF output.

| Crash Type | CWE | Name |
|------------|-----|------|
| HeapOverflow | CWE-122 | Heap-based Buffer Overflow |
| GlobalOverflow | CWE-120 | Buffer Copy without Checking Size of Input |
| StackOverflow | CWE-121 | Stack-based Buffer Overflow |
| UseAfterFree | CWE-416 | Use After Free |
| DoubleFree | CWE-415 | Double Free |
| NullDeref | CWE-476 | NULL Pointer Dereference |
| IntegerOverflow | CWE-190 | Integer Overflow or Wraparound |
| DivByZero | CWE-369 | Divide By Zero |
| AssertionFailure | CWE-617 | Reachable Assertion |
| OOM | CWE-400 | Uncontrolled Resource Consumption |
| AllocTooBig | CWE-789 | Memory Allocation with Excessive Size Value |
| Timeout | CWE-400 | Uncontrolled Resource Consumption |
| UseAfterPoison | CWE-416 | Use After Free |
| EnumLoad | CWE-843 | Access of Resource Using Incompatible Type |
| MisalignedAccess | CWE-188 | Reliance on Data/Memory Layout |

The CWEForType() function in pkg/triage performs this mapping. See the API reference for details.
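
For readers who want to see the shape of such a lookup, here is a minimal sketch built from the table. It is not the real `CWEForType()`: its signature is not documented on this page, so the lowercase `cweForType`, the `cwe` struct, and the string-keyed map are assumptions made for illustration.

```go
package main

import "fmt"

// cwe is a hypothetical pairing of identifier and name.
type cwe struct{ ID, Name string }

// cweByType mirrors the table above, keyed by crash type name.
var cweByType = map[string]cwe{
	"HeapOverflow":     {"CWE-122", "Heap-based Buffer Overflow"},
	"GlobalOverflow":   {"CWE-120", "Buffer Copy without Checking Size of Input"},
	"StackOverflow":    {"CWE-121", "Stack-based Buffer Overflow"},
	"UseAfterFree":     {"CWE-416", "Use After Free"},
	"DoubleFree":       {"CWE-415", "Double Free"},
	"NullDeref":        {"CWE-476", "NULL Pointer Dereference"},
	"IntegerOverflow":  {"CWE-190", "Integer Overflow or Wraparound"},
	"DivByZero":        {"CWE-369", "Divide By Zero"},
	"AssertionFailure": {"CWE-617", "Reachable Assertion"},
	"OOM":              {"CWE-400", "Uncontrolled Resource Consumption"},
	"AllocTooBig":      {"CWE-789", "Memory Allocation with Excessive Size Value"},
	"Timeout":          {"CWE-400", "Uncontrolled Resource Consumption"},
	"UseAfterPoison":   {"CWE-416", "Use After Free"},
	"EnumLoad":         {"CWE-843", "Access of Resource Using Incompatible Type"},
	"MisalignedAccess": {"CWE-188", "Reliance on Data/Memory Layout"},
}

// cweForType returns the CWE for a crash type; ok is false for types
// without a mapping, such as Unknown.
func cweForType(crashType string) (c cwe, ok bool) {
	c, ok = cweByType[crashType]
	return c, ok
}

func main() {
	if c, ok := cweForType("HeapOverflow"); ok {
		fmt.Println(c.ID, c.Name)
	}
	_, ok := cweForType("Unknown")
	fmt.Println(ok)
}
```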

Report Generation

For each unique crash, Crucible generates a structured markdown report containing everything needed to file a CVE or security advisory:

Example: report.md
# Crash Report: HeapOverflow in gguf_init_from_file

| Field | Value |
|-------|-------|
| **Type** | Heap Buffer Overflow |
| **CWE** | CWE-122 — Heap-based Buffer Overflow |
| **Location** | gguf.c:387 |
| **Function** | gguf_init_from_file |
| **CVSS Score** | 9.8 Critical |
| **CVSS Vector** | CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H |
| **Stack Hash** | a1b2c3d4e5f6a7b8 |
| **Target** | gguf |
| **Harness** | ./harness/libfuzzer/crucible-libfuzzer |
| **Minimized** | reports/minimized/crash-a1b2c3d4 |
| **Found** | 2026-04-01T14:23:07Z |
| **Seed** | 8827361950234 |
| **Crucible Version** | 0.1.0 |

## Reproducer

    ./crucible-libfuzzer crashes/heap_overflow_a1b2c3d4/first.gguf

## ASAN Output

    ==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
    READ of size 4 at 0x... thread T0
        #0 gguf_init_from_file gguf.c:387
        #1 ggml_tensor_overhead ggml.c:2104
        ...

## CVE Template

**Title:** Heap buffer overflow in llama.cpp GGUF metadata parsing

**Description:** A heap buffer overflow exists in the GGUF file parser
of llama.cpp. A specially crafted .gguf file can trigger an out-of-bounds
read/write in gguf_init_from_file when parsing metadata values. An attacker
could exploit this by distributing a malicious model file.

**Affected:** llama.cpp (version), Ollama, LM Studio, and any application
using the ggml GGUF parser.

CVE Template

The CVE template section provides a starting point for responsible disclosure. Always verify the details, confirm affected versions, and coordinate with the maintainers before filing.

SARIF Export

Crucible can export triage results in SARIF 2.1.0 (Static Analysis Results Interchange Format), a standard consumed by GitHub Code Scanning, VS Code SARIF Viewer, and other security tools.

crucible-triage --crashes ./crashes --output ./reports --sarif ./results.sarif

The SARIF file contains:

  • Tool metadata — Crucible name and version as the analysis tool
  • Rules — One rule per crash type (e.g., heap-buffer-overflow), with full description and CWE reference
  • Results — One result per unique crash, with:
    • Source location (file path, line number)
    • Severity level (error, warning, note) mapped from CVSS score
    • Fingerprints: crucible/stackHash/v1 (stack hash) and crucible/target (target surface) for cross-run deduplication
    • CWE reference for the crash type
  • Rule tags — Each rule includes security, CWE ID, and target/<surface> tags
flowchart LR
    A[Crash Reports] --> B[WriteSARIF]
    B --> C[results.sarif]
    C --> D[GitHub Code Scanning]
    C --> E[VS Code SARIF Viewer]
    C --> F[Other SARIF Consumers]
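
A stripped-down version of such a file can be produced with Go's encoding/json. The sketch below models only a small subset of SARIF 2.1.0 and makes several assumptions that this page does not confirm: the struct and function names are hypothetical, the fingerprints are placed under `partialFingerprints` (SARIF also defines `fingerprints`), and the score thresholds for error, warning, and note are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// A minimal subset of the SARIF 2.1.0 object model; real output carries
// more fields (rule descriptions, tags, CWE references).
type (
	sarifLog struct {
		Version string `json:"version"`
		Runs    []run  `json:"runs"`
	}
	run struct {
		Tool    tool     `json:"tool"`
		Results []result `json:"results"`
	}
	tool   struct{ Driver driver `json:"driver"` }
	driver struct {
		Name    string `json:"name"`
		Version string `json:"version"`
		Rules   []rule `json:"rules"`
	}
	rule    struct{ ID string `json:"id"` }
	message struct{ Text string `json:"text"` }
	result  struct {
		RuleID              string            `json:"ruleId"`
		Level               string            `json:"level"`
		Message             message           `json:"message"`
		Locations           []location        `json:"locations"`
		PartialFingerprints map[string]string `json:"partialFingerprints"`
	}
	location         struct{ PhysicalLocation physicalLocation `json:"physicalLocation"` }
	physicalLocation struct {
		ArtifactLocation artifactLocation `json:"artifactLocation"`
		Region           region           `json:"region"`
	}
	artifactLocation struct{ URI string `json:"uri"` }
	region           struct{ StartLine int `json:"startLine"` }
)

// crash holds the fields of one unique crash that this sketch exports.
type crash struct {
	RuleID, File, StackHash, Target string
	Line                            int
	Score                           float64
}

// levelFor maps a CVSS score to a SARIF level. Thresholds are assumed.
func levelFor(score float64) string {
	switch {
	case score >= 7.0:
		return "error"
	case score >= 4.0:
		return "warning"
	default:
		return "note"
	}
}

// buildLog emits one rule per distinct crash type and one result per crash.
func buildLog(crashes []crash) sarifLog {
	seen := map[string]bool{}
	rules := []rule{}
	results := []result{}
	for _, c := range crashes {
		if !seen[c.RuleID] {
			seen[c.RuleID] = true
			rules = append(rules, rule{ID: c.RuleID})
		}
		results = append(results, result{
			RuleID:  c.RuleID,
			Level:   levelFor(c.Score),
			Message: message{Text: fmt.Sprintf("%s at %s:%d", c.RuleID, c.File, c.Line)},
			Locations: []location{{PhysicalLocation: physicalLocation{
				ArtifactLocation: artifactLocation{URI: c.File},
				Region:           region{StartLine: c.Line},
			}}},
			PartialFingerprints: map[string]string{
				"crucible/stackHash/v1": c.StackHash,
				"crucible/target":       c.Target,
			},
		})
	}
	d := driver{Name: "Crucible", Version: "0.1.0", Rules: rules}
	return sarifLog{Version: "2.1.0", Runs: []run{{Tool: tool{Driver: d}, Results: results}}}
}

func main() {
	doc := buildLog([]crash{{RuleID: "heap-buffer-overflow", File: "gguf.c", Line: 387,
		Score: 9.8, StackHash: "a1b2c3d4e5f6a7b8", Target: "gguf"}})
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Keeping the stack hash in a fingerprint field is what lets a SARIF consumer recognize the same crash across runs even when line numbers shift.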

Integration with the Fuzzer

Crash triage runs automatically during fuzzing. Each time the target process exits, the fuzzer checks the exit status and any sanitizer output, then follows this decision flow:

flowchart TD
    A[Target Process Exits] --> B{Exit Code != 0?}
    B -->|No| C[Continue Fuzzing]
    B -->|Yes| D{Sanitizer Output?}
    D -->|No| E[Log as Timeout/Unknown]
    D -->|Yes| F[Parse Crash Type]
    F --> G[Compute Stack Hash]
    G --> H{Hash Seen Before?}
    H -->|Yes| I[Store as Duplicate]
    H -->|No| J[Create Report Directory]
    J --> K[Generate Report]
    K --> L[Log New Unique Crash]
    L --> C
    I --> C

The fuzzer tracks unique crash count in real time. The summary printed at the end of a run includes:

  • Total crashes found
  • Unique crashes (by stack hash)
  • Breakdown by crash type and severity
  • Paths to all generated reports