Writing Fuzz Harnesses
A harness is a thin wrapper that feeds Crucible-generated inputs to a target parser. Crucible provides ready-made harnesses for libFuzzer, AFL++, and Go native fuzzing, but you may need custom harnesses for new targets.
Harness Architecture
graph LR
A[Crucible Corpus] --> B[Fuzzer Engine]
B --> C[Harness]
C --> D[Target Parser]
D --> E{Crash?}
E -->|Yes| F[ASAN Output]
E -->|No| B
The harness is responsible for:
- Receiving fuzzer input (byte array)
- Writing it to a temp file (most GGUF parsers take file paths)
- Calling the target parser function
- Exercising additional code paths if parsing succeeds
- Cleaning up
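The temp-file steps in the list above can be sketched as a pair of small helpers. This is a minimal sketch, not part of Crucible: `write_all` and `write_temp_input` are hypothetical names, and the code assumes a POSIX system. It retries short writes so that the parser never sees a truncated input by accident.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>   // mkstemp
#include <unistd.h>   // write, close, unlink

// Hypothetical helper: write all `size` bytes of `data` to `fd`,
// retrying on short writes and EINTR. Returns 0 on success, -1 on error.
static int write_all(int fd, const uint8_t *data, size_t size) {
    size_t done = 0;
    while (done < size) {
        ssize_t n = write(fd, data + done, size - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // Interrupted, try again
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}

// Hypothetical helper: create a temp file from the template in `path`
// (which must end in "XXXXXX"), fill it with the fuzzer input, and close it.
// On failure the partial file is removed and -1 is returned.
static int write_temp_input(char *path, const uint8_t *data, size_t size) {
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    int rc = write_all(fd, data, size);
    if (close(fd) != 0) rc = -1;
    if (rc != 0) unlink(path);
    return rc;
}
```

A harness would call `write_temp_input(path, data, size)`, pass `path` to the target parser on success, and `unlink(path)` afterwards.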
libFuzzer Harness Template
harness.c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>   // mkstemp
#include <unistd.h>   // write, close, unlink
#include "ggml.h"     // Target's header (newer ggml trees may declare the GGUF API in gguf.h)

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 24) return 0;  // Min GGUF header size

    // Write the input to a temp file
    char path[] = "/tmp/fuzz-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return 0;
    ssize_t written = write(fd, data, size);
    close(fd);
    if (written != (ssize_t)size) {  // Skip inputs that were not fully written
        unlink(path);
        return 0;
    }

    // Call target function
    // TODO: Replace with your target's GGUF loading function
    struct gguf_init_params params = {.no_alloc = true, .ctx = NULL};
    struct gguf_context *ctx = gguf_init_from_file(path, params);
    if (ctx) {
        // Exercise additional paths
        // TODO: Call getters, iterate metadata, etc.
        gguf_free(ctx);
    }

    unlink(path);
    return 0;
}
Build with:
clang -g -O1 -fsanitize=fuzzer,address,undefined \
target_sources.c harness.c \
-o my-harness -lstdc++ -lm
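The 24-byte minimum in the libFuzzer template corresponds to the fixed GGUF header used by format versions 2 and 3: a 4-byte magic, a 4-byte version, an 8-byte tensor count, and an 8-byte metadata key/value count, all little-endian. The sketch below builds that smallest header so it can be saved as a seed file in the corpus directory. `make_minimal_gguf` and `GGUF_MIN_HEADER_SIZE` are hypothetical names, and whether a given target accepts a file with zero tensors and zero key/value pairs is an assumption to verify.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical constant: magic(4) + version(4) + tensor_count(8) + kv_count(8)
#define GGUF_MIN_HEADER_SIZE 24

// Store a 32-bit value in little-endian byte order.
static void put_le32(uint8_t *p, uint32_t v) {
    for (int i = 0; i < 4; i++) p[i] = (uint8_t)(v >> (8 * i));
}

// Store a 64-bit value in little-endian byte order.
static void put_le64(uint8_t *p, uint64_t v) {
    for (int i = 0; i < 8; i++) p[i] = (uint8_t)(v >> (8 * i));
}

// Hypothetical helper: fill `out` with a header-only GGUF file
// (version 3, zero tensors, zero key/value pairs).
// Returns the number of bytes written, or 0 if `cap` is too small.
static size_t make_minimal_gguf(uint8_t *out, size_t cap) {
    if (cap < GGUF_MIN_HEADER_SIZE) return 0;
    memcpy(out, "GGUF", 4);   // Magic
    put_le32(out + 4, 3);     // Format version
    put_le64(out + 8, 0);     // Tensor count
    put_le64(out + 16, 0);    // Metadata key/value count
    return GGUF_MIN_HEADER_SIZE;
}
```

Writing the returned bytes to a file in the seed corpus gives the fuzzer a starting point that already passes the size check in the harness.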
AFL++ Harness Template
harness.c
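#include <stdbool.h>
#include <stdlib.h>   // mkstemp
#include <unistd.h>   // write, close, unlink
#include "ggml.h"

__AFL_FUZZ_INIT();

int main(void) {
#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
#endif
    // Must be read after __AFL_INIT and before __AFL_LOOP
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;

    while (__AFL_LOOP(10000)) {
        int len = __AFL_FUZZ_TESTCASE_LEN;
        if (len < 24) continue;  // Min GGUF header size

        char path[] = "/tmp/afl-XXXXXX";
        int fd = mkstemp(path);
        if (fd < 0) continue;
        ssize_t written = write(fd, buf, (size_t)len);
        close(fd);

        // TODO: Call your target parser
        if (written == (ssize_t)len) {
            struct gguf_init_params p = {.no_alloc = true, .ctx = NULL};
            struct gguf_context *ctx = gguf_init_from_file(path, p);
            if (ctx) gguf_free(ctx);
        }
        unlink(path);
    }
    return 0;
}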
Go Native Fuzz Template
fuzz_test.go
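package myparser_test // Placeholder package name

import "testing" // Also import your parser package (myparser is a placeholder)

func FuzzMyParser(f *testing.F) {
	// Add seeds
	f.Add(minimalGGUFBytes) // minimalGGUFBytes: a []byte seed you define
	f.Fuzz(func(t *testing.T, data []byte) {
		// TODO: Call your Go GGUF parser
		result, err := myparser.Parse(data)
		if err != nil {
			return // Invalid input, expected
		}
		// Exercise parsed result
		_ = result
	})
}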
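Key Considerations
Sanitizers
Always compile with sanitizers enabled. AddressSanitizer and UndefinedBehaviorSanitizer can share one build; MemorySanitizer cannot be combined with AddressSanitizer and needs a separate build: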
| Sanitizer | Flag | Detects |
|---|---|---|
| AddressSanitizer | -fsanitize=address | Heap/stack overflows, use-after-free |
| UndefinedBehaviorSanitizer | -fsanitize=undefined | Signed integer overflow, null dereference, misaligned access |
| MemorySanitizer | -fsanitize=memory | Uninitialized reads |
ASAN + AFL++
For AFL++, set the AFL_USE_ASAN=1 environment variable when building with the AFL++ compiler wrapper (for example afl-clang-fast) instead of passing -fsanitize=address directly.
Performance
- Use no_alloc = true to skip tensor data allocation; the parsing code still runs, so parsing bugs remain reachable
- Cap iteration counts (e.g., i < 100) to avoid timeouts on malformed files with huge counts
- Clean up temp files to avoid filling /tmp/
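The iteration cap can be factored into a small helper so that every loop in the harness uses the same bound. This is a minimal sketch: `capped_count` and `MAX_FUZZ_ITERS` are hypothetical names, and the value 100 simply mirrors the example above.

```c
#include <stdint.h>

// Hypothetical cap; tune it to your per-input timeout budget.
#define MAX_FUZZ_ITERS 100

// Clamp a count reported by the parser to the range [0, MAX_FUZZ_ITERS],
// so a malformed file claiming billions of entries cannot stall the harness.
static int64_t capped_count(int64_t reported) {
    if (reported < 0) return 0;
    return reported < MAX_FUZZ_ITERS ? reported : MAX_FUZZ_ITERS;
}
```

In a harness, the loop bound would be `capped_count(n)`, where `n` comes from your target's count getter (for example `gguf_get_n_kv(ctx)` in ggml, if your target exposes it).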
Coverage
Exercise as many code paths as possible after successful parsing:
- Iterate metadata keys and call type-specific getters
- Iterate tensor info and read names/offsets
- Call any validation or size-calculation functions