Writing Fuzz Harnesses

A harness is a thin wrapper that feeds Crucible-generated inputs to a target parser. Crucible provides ready-made harnesses for libFuzzer, AFL++, and Go native fuzzing, but you may need custom harnesses for new targets.

Harness Architecture

graph LR
    A[Crucible Corpus] --> B[Fuzzer Engine]
    B --> C[Harness]
    C --> D[Target Parser]
    D --> E{Crash?}
    E -->|Yes| F[ASAN Output]
    E -->|No| B

The harness is responsible for:

  1. Receiving fuzzer input (byte array)
  2. Writing it to a temp file (most GGUF parsers take file paths)
  3. Calling the target parser function
  4. Exercising additional code paths if parsing succeeds
  5. Cleaning up

libFuzzer Harness Template

harness.c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>  // mkstemp
#include <unistd.h>  // write, close, unlink
#include "ggml.h"  // Target's header

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 24) return 0;  // Min GGUF header size

    // Write the input to a temp file
    char path[] = "/tmp/fuzz-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return 0;
    ssize_t written = write(fd, data, size);
    close(fd);
    if (written != (ssize_t)size) {  // Short or failed write: skip this input
        unlink(path);
        return 0;
    }

    // Call target function
    // TODO: Replace with your target's GGUF loading function
    struct gguf_init_params params = {.no_alloc = true, .ctx = NULL};
    struct gguf_context *ctx = gguf_init_from_file(path, params);

    if (ctx) {
        // Exercise additional paths
        // TODO: Call getters, iterate metadata, etc.
        gguf_free(ctx);
    }

    // Always remove the temp file, whether or not parsing succeeded
    unlink(path);
    return 0;
}

Build with:

clang -g -O1 -fsanitize=fuzzer,address,undefined \
  target_sources.c harness.c \
  -o my-harness -lstdc++ -lm
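
The 24-byte minimum checked in the template corresponds to the fixed GGUF header: a 4-byte magic, a 4-byte version, an 8-byte tensor count, and an 8-byte metadata key/value count. The sketch below builds such a header so it can serve as a seed input. It assumes the GGUF version 3 layout with little-endian fields; the names `put_le` and `make_minimal_gguf` are hypothetical, and whether a given target accepts a file with no tensors and no metadata is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

#define GGUF_HEADER_SIZE 24  /* 4 (magic) + 4 (version) + 8 + 8 (counts) */

/* Store `value` in little-endian byte order using `nbytes` bytes. */
static void put_le(uint8_t *dst, uint64_t value, size_t nbytes) {
    for (size_t i = 0; i < nbytes; i++) {
        dst[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Fill `out` with an empty GGUF header: no tensors, no metadata pairs. */
static void make_minimal_gguf(uint8_t out[GGUF_HEADER_SIZE], uint32_t version) {
    out[0] = 'G'; out[1] = 'G'; out[2] = 'U'; out[3] = 'F';  /* magic */
    put_le(out + 4, version, 4);  /* format version */
    put_le(out + 8, 0, 8);        /* tensor count */
    put_le(out + 16, 0, 8);       /* metadata key/value count */
}
```

Writing the resulting 24 bytes to a file in the corpus directory gives the fuzzer a starting point that passes the size check.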

AFL++ Harness Template

harness.c
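#include <stdbool.h>
#include <stdlib.h>  // mkstemp
#include <unistd.h>  // write, close, unlink
#include "ggml.h"

__AFL_FUZZ_INIT();

int main(void) {
#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
#endif
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;  // Must come after __AFL_INIT

    while (__AFL_LOOP(10000)) {
        int len = __AFL_FUZZ_TESTCASE_LEN;
        if (len < 24) continue;

        char path[] = "/tmp/afl-XXXXXX";
        int fd = mkstemp(path);
        if (fd < 0) continue;
        ssize_t written = write(fd, buf, len);
        close(fd);
        if (written != len) {  // Short or failed write: skip this input
            unlink(path);
            continue;
        }

        // TODO: Call your target parser
        struct gguf_init_params p = {.no_alloc = true, .ctx = NULL};
        struct gguf_context *ctx = gguf_init_from_file(path, p);
        if (ctx) gguf_free(ctx);
        unlink(path);
    }
    return 0;
}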

Go Native Fuzz Template

fuzz_test.go
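package myparser_test

import (
    "testing"

    "example.com/myparser" // TODO: Replace with your parser's import path
)

// Seed: empty GGUF v3 header (magic, version 3, zero tensors, zero metadata pairs)
var minimalGGUFBytes = append([]byte("GGUF\x03\x00\x00\x00"), make([]byte, 16)...)

func FuzzMyParser(f *testing.F) {
    // Add seeds
    f.Add(minimalGGUFBytes)

    f.Fuzz(func(t *testing.T, data []byte) {
        // TODO: Call your Go GGUF parser
        result, err := myparser.Parse(data)
        if err != nil {
            return // Invalid input, expected
        }
        // Exercise parsed result
        _ = result
    })
}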

Key Considerations

Sanitizers

Always compile with sanitizers enabled:

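| Sanitizer | Flag | Detects |
|---|---|---|
| AddressSanitizer | -fsanitize=address | Heap/stack overflows, use-after-free |
| UndefinedBehaviorSanitizer | -fsanitize=undefined | Signed integer overflow, null pointer dereference |
| MemorySanitizer | -fsanitize=memory | Uninitialized reads (Clang only; cannot be combined with AddressSanitizer, so it needs a separate build) |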

ASAN + AFL++

For AFL++, set the AFL_USE_ASAN=1 environment variable when building with the AFL++ compiler wrappers (e.g., afl-clang-fast) instead of passing -fsanitize=address directly.

Performance

  • Use no_alloc = true to skip tensor data allocation; parsing of the header, metadata, and tensor info still runs, so parsing bugs remain reachable
  • Cap iteration counts (e.g., i < 100) to avoid timeouts on malformed files with huge counts
  • Clean up temp files to avoid filling /tmp/
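
The iteration cap can be written once and reused for every loop in the harness. This is a minimal sketch; `MAX_FUZZ_ITEMS` and `capped_count` are hypothetical names, and the limit of 100 simply mirrors the example above.

```c
#include <stdint.h>

#define MAX_FUZZ_ITEMS 100  /* hypothetical cap; tune per target */

/* Clamp a count reported by a parsed (possibly hostile) file to a
 * safe loop bound. Negative values are treated as zero. */
static int64_t capped_count(int64_t reported) {
    if (reported < 0) return 0;
    return reported < MAX_FUZZ_ITEMS ? reported : MAX_FUZZ_ITEMS;
}
```

A harness loop would then take the form `for (int64_t i = 0; i < capped_count(n); i++)`, where `n` comes from the target's count getter (for ggml, this is assumed to be `gguf_get_n_kv` or `gguf_get_n_tensors`).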

Coverage

Exercise as many code paths as possible after successful parsing:

  • Iterate metadata keys and call type-specific getters
  • Iterate tensor info and read names/offsets
  • Call any validation or size-calculation functions