CLI Reference¶
hemlock ships as a single static binary organized into three engagement modes plus a top-level set of primitives. All commands write to stdout or to an output directory and return zero on success.
The mode-prefixed subcommands are aliases of the corresponding top-level commands plus engagement-scoping warnings. attack-mode subcommands emit a "running unscoped" warning unless ./.hemlock-engagement.toml is present, HEMLOCK_UNSCOPED_OK=1 is set, or --unscoped hemlock attack is passed. defend and research modes are unscoped by default.
Top-level primitives¶
The original generation + listing surface, callable directly without a mode prefix.
| Command | Description |
|---|---|
craft |
Generate poisoned documents for a single format and technique |
batch |
Generate poisoned documents across all formats and all techniques |
validate |
Simulate RAG extraction and check whether a payload survives |
run |
Fire crafted documents at a target RAG pipe and capture per-document outcomes |
score |
Compute composite stealth/survival rating for a technique |
list-techniques |
Display available hiding techniques with stealth scores |
list-payloads |
Display available payload templates and descriptions |
stack |
Bring up / tear down the canonical four-framework RAG battery (Docker-compose) |
Mode: attack¶
Offensive operations. Engagement-scoped output suitable for engagement-grade deliverables.
| Subcommand | Description |
|---|---|
attack craft |
Same as craft, with engagement-scoping warning |
attack batch |
Same as batch, with engagement-scoping warning |
attack run |
Same as run, with engagement-scoping warning |
attack score |
Same as score, with engagement-scoping warning |
attack optimize |
Run CEM / genetic / whitebox bound to a threat-model declaration |
attack report |
Render run-results JSONL as markdown / HTML / SARIF |
attack engagement init <id> |
Create a directory-local engagement context (./.hemlock-engagement.toml) |
attack engagement show |
Print and verify the current engagement context |
Mode: defend¶
Defensive operations. Operator-actionable output.
| Subcommand | Description |
|---|---|
defend detect |
Run strict-canary detection on collected response samples |
defend monitor |
Real-time detection middleware (HTTP reverse proxy) |
defend report |
Render run-results JSONL as a defensive operator brief |
Mode: research¶
Research operations. Reproducibility- and rigor-oriented output.
| Subcommand | Description |
|---|---|
research prereg freeze |
SHA-256 lock a pre-registration document |
research prereg verify |
Re-hash and confirm a frozen document is unchanged |
research prereg list |
Print the per-user prereg-history file |
research deposit build |
Build an artifact-evaluation deposit manifest (JSON + Markdown + verifier) |
research deposit verify |
Re-hash a deposit and report any divergence |
research reproduce |
Replay a deposit's run-results through the current detector |
Common workflows¶
Local end-to-end attack loop¶
Bring up the four-framework stack locally, generate poisoned documents, fire them at one of the pipes, render the result:
hemlock stack up
hemlock batch --count 1 --output ./engagement
hemlock run \
--manifest ./engagement/.hemlock-manifest.json \
--target langchain \
--query "What is the answer?" \
--output ./engagement/run-results.jsonl
hemlock attack report \
--input ./engagement/run-results.jsonl \
--format markdown \
--output ./engagement/deliverable.md
Defensive monitoring¶
Front a deployment with the detector and watch for canary hits:
hemlock defend monitor \
--upstream http://your-rag-endpoint:8100 \
--listen :8200 \
--audit-log ./monitor-audit.jsonl
Engagement-grade campaign¶
Initialize an engagement, declare a threat model, run the optimizer with budget cap, fire results, render a SARIF deliverable:
hemlock attack engagement init EX-2026-001
hemlock attack optimize init
# edit ./.hemlock-threatmodel.toml
hemlock attack optimize validate
hemlock attack optimize run \
--target-query "What is the security policy?" \
--embed-provider ollama \
--output ./out
hemlock attack run \
--manifest ./out/.hemlock-manifest.json \
--target langchain \
--query "What is the security policy?" \
--output ./out/run-results.jsonl
hemlock attack report \
--input ./out/run-results.jsonl \
--format sarif \
--output ./out/results.sarif
Output format¶
Commands that produce documents print a summary banner to stderr:
[hemlock] Generated 20 documents in ./output
poisoned-comment-001.html (stealth: 30)
poisoned-comment-002.html (stealth: 30)
...
poisoned-css-hide-005.html (stealth: 75)
Each line shows the filename and the stealth score (0–100) for the technique used. Higher scores indicate techniques that are harder to detect through visual inspection or basic content filtering.
Commands that emit JSONL traces (run) prepend a header row with provenance (hemlock_sha, mode, operator identity, engagement chain SHA-256) so the artifact alone is sufficient to reproduce the run context.
Global behavior¶
- No network access by default. hemlock never phones home, fetches templates, or resolves external resources. All generation is local. The exceptions:
--embed-provider openai|ollamamakes outbound API calls for embedding-based similarity scoring;runmakes calls to the configured--targetpipe;monitorreverse-proxies to the configured--upstream. - Deterministic output. Given the same flags and seeds, hemlock produces identical documents. Variant numbering is sequential.
- Exit codes.
0on success,1on usage or execution errors,2on validation non-survival (validatereports payload not found).
Shell completion
hemlock supports shell completion for Bash, Zsh, Fish, and PowerShell via the underlying Cobra framework:
JSON output
Most commands accept --json to emit structured output instead of the human-readable summary, suitable for pipeline integration.