CLI Reference

hemlock ships as a single static binary organized into three engagement modes plus a top-level set of primitives. All commands write to stdout or to an output directory and return zero on success.

hemlock <command> [flags]
hemlock <mode> <subcommand> [flags]

Where a mode-prefixed subcommand shares its name with a top-level command (for example, attack craft and craft), it is an alias of that command plus an engagement-scoping warning. Attack-mode subcommands emit a "running unscoped" warning unless ./.hemlock-engagement.toml is present, HEMLOCK_UNSCOPED_OK=1 is set, or --unscoped is passed to hemlock attack. The defend and research modes are unscoped by default.
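The three conditions that suppress the warning can be modeled as a small predicate. This is a minimal sketch of the documented rule only, not hemlock's own implementation; the function name `should_warn_unscoped` and its parameters are hypothetical, and it assumes the engagement file is looked up in the current working directory.

```python
import os
from pathlib import Path

# File name taken from the reference text above.
ENGAGEMENT_FILE = ".hemlock-engagement.toml"


def should_warn_unscoped(cwd, env, unscoped_flag=False):
    """Return True when the "running unscoped" warning would be expected.

    Hypothetical helper: models the documented rule, nothing more.
    """
    has_context = (Path(cwd) / ENGAGEMENT_FILE).is_file()
    env_ok = env.get("HEMLOCK_UNSCOPED_OK") == "1"
    # Any one of the three conditions is enough to suppress the warning.
    return not (has_context or env_ok or unscoped_flag)


if __name__ == "__main__":
    print(should_warn_unscoped(os.getcwd(), os.environ))
```

A wrapper script could call such a check before invoking an attack-mode subcommand, so that an unscoped run is caught before any output is written.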

Top-level primitives

These commands form the original generation and listing surface and are callable directly, without a mode prefix.

Command	Description
`craft`	Generate poisoned documents for a single format and technique
`batch`	Generate poisoned documents across all formats and all techniques
`validate`	Simulate RAG extraction and check whether a payload survives
`run`	Fire crafted documents at a target RAG pipe and capture per-document outcomes
`score`	Compute composite stealth/survival rating for a technique
`list-techniques`	Display available hiding techniques with stealth scores
`list-payloads`	Display available payload templates and descriptions
`stack`	Bring up / tear down the canonical four-framework RAG battery (Docker-compose)

Mode: `attack`

Offensive operations. Engagement-scoped output suitable for engagement-grade deliverables.

Subcommand	Description
`attack craft`	Same as `craft`, with engagement-scoping warning
`attack batch`	Same as `batch`, with engagement-scoping warning
`attack run`	Same as `run`, with engagement-scoping warning
`attack score`	Same as `score`, with engagement-scoping warning
`attack optimize`	Run the CEM, genetic, or whitebox optimizer bound to a threat-model declaration
`attack report`	Render run-results JSONL as markdown / HTML / SARIF
`attack engagement init <id>`	Create a directory-local engagement context (`./.hemlock-engagement.toml`)
`attack engagement show`	Print and verify the current engagement context

Mode: `defend`

Defensive operations. Operator-actionable output.

Subcommand	Description
`defend detect`	Run strict-canary detection on collected response samples
`defend monitor`	Real-time detection middleware (HTTP reverse proxy)
`defend report`	Render run-results JSONL as a defensive operator brief

Mode: `research`

Research operations. Reproducibility- and rigor-oriented output.

Subcommand	Description
`research prereg freeze`	SHA-256 lock a pre-registration document
`research prereg verify`	Re-hash and confirm a frozen document is unchanged
`research prereg list`	Print the per-user prereg-history file
`research deposit build`	Build an artifact-evaluation deposit manifest (JSON + Markdown + verifier)
`research deposit verify`	Re-hash a deposit and report any divergence
`research reproduce`	Replay a deposit's run-results through the current detector
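The freeze and verify operations in the table rest on the same idea: record a SHA-256 digest once, then re-hash later and compare. The sketch below illustrates that idea with Python's standard `hashlib`. The record layout (a dict with `path` and `sha256` keys) and the function names are assumptions for illustration and do not reflect hemlock's actual on-disk format.

```python
import hashlib


def sha256_file(path, chunk_size=65536):
    """Hash a file in chunks so large documents need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def freeze(path):
    """Record the current digest of a document (hypothetical record layout)."""
    return {"path": str(path), "sha256": sha256_file(path)}


def verify(record):
    """Re-hash the document and report whether it still matches the record."""
    return sha256_file(record["path"]) == record["sha256"]
```

Any change to the file after `freeze`, even a single byte, makes `verify` return `False`, which is the property a pre-registration lock or deposit check relies on.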

Common workflows

Local end-to-end attack loop

Bring up the four-framework stack locally, generate poisoned documents, fire them at one of the pipes, and render the result:

hemlock stack up
hemlock batch --count 1 --output ./engagement
hemlock run \
  --manifest ./engagement/.hemlock-manifest.json \
  --target langchain \
  --query "What is the answer?" \
  --output ./engagement/run-results.jsonl
hemlock attack report \
  --input ./engagement/run-results.jsonl \
  --format markdown \
  --output ./engagement/deliverable.md

Defensive monitoring

Front a deployment with the detector and watch for canary hits:

hemlock defend monitor \
  --upstream http://your-rag-endpoint:8100 \
  --listen :8200 \
  --audit-log ./monitor-audit.jsonl

Engagement-grade campaign

Initialize an engagement, declare a threat model, run the optimizer with a budget cap, fire the results, and render a SARIF deliverable:

hemlock attack engagement init EX-2026-001
hemlock attack optimize init
# edit ./.hemlock-threatmodel.toml
hemlock attack optimize validate
hemlock attack optimize run \
  --target-query "What is the security policy?" \
  --embed-provider ollama \
  --output ./out
hemlock attack run \
  --manifest ./out/.hemlock-manifest.json \
  --target langchain \
  --query "What is the security policy?" \
  --output ./out/run-results.jsonl
hemlock attack report \
  --input ./out/run-results.jsonl \
  --format sarif \
  --output ./out/results.sarif

Output format

Commands that produce documents print a summary banner to stderr:

[hemlock] Generated 20 documents in ./output
  poisoned-comment-001.html          (stealth: 30)
  poisoned-comment-002.html          (stealth: 30)
  ...
  poisoned-css-hide-005.html         (stealth: 75)

Each line shows the filename and the stealth score (0–100) for the technique used. Higher scores indicate techniques that are harder to detect through visual inspection or basic content filtering.
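When the banner has been captured to a file or a variable, the per-document lines can be pulled apart with a regular expression. This is a minimal sketch that assumes the layout matches the sample above exactly (leading indentation, filename, then `(stealth: N)`); `parse_summary` is a hypothetical helper name, not part of hemlock.

```python
import re

# Matches indented lines such as "  poisoned-comment-001.html   (stealth: 30)".
LINE_RE = re.compile(r"^\s+(?P<name>\S+)\s+\(stealth:\s*(?P<score>\d+)\)\s*$")


def parse_summary(text):
    """Return (filename, stealth_score) pairs found in a summary banner."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # the header line and the "..." elision are skipped
            results.append((match.group("name"), int(match.group("score"))))
    return results
```

Parsing the human-readable banner is brittle by nature; where a command accepts `--json`, structured output is the more robust input for scripts.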

Commands that emit JSONL traces (run) prepend a header row with provenance (hemlock_sha, mode, operator identity, engagement chain SHA-256) so the artifact alone is sufficient to reproduce the run context.
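A consumer of such a trace can separate the provenance header from the result rows before processing. The sketch below assumes the header is the first non-empty line and that every line is a standalone JSON object. Of the keys, only `hemlock_sha` and `mode` are named in this reference; any other field name would be a guess, so the helper treats the header as an opaque dict.

```python
import json


def split_trace(lines):
    """Split a JSONL trace into (header, records).

    Assumes the first non-empty line is the provenance header.
    """
    rows = [json.loads(line) for line in lines if line.strip()]
    if not rows:
        raise ValueError("empty trace: no header row found")
    return rows[0], rows[1:]


if __name__ == "__main__":
    # Path taken from the workflow examples; adjust to your own run.
    with open("./engagement/run-results.jsonl", encoding="utf-8") as fh:
        header, records = split_trace(fh)
    print(header.get("hemlock_sha"), header.get("mode"), len(records))
```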

Global behavior

No network access by default. hemlock never phones home, fetches templates, or resolves external resources. All generation is local. The exceptions: --embed-provider openai|ollama makes outbound API calls for embedding-based similarity scoring; run makes calls to the configured --target pipe; monitor reverse-proxies to the configured --upstream.
Deterministic output. Given the same flags and seeds, hemlock produces identical documents. Variant numbering is sequential.
Exit codes. 0 on success, 1 on usage or execution errors, 2 on validation non-survival (validate reports payload not found).
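For scripts that wrap hemlock, the documented exit codes can be turned into readable messages. This is a small sketch: the mapping mirrors the three codes listed above, and `describe_exit` is a hypothetical helper name.

```python
# Exit codes as documented above; anything else is reported as undocumented.
EXIT_MEANINGS = {
    0: "success",
    1: "usage or execution error",
    2: "validation non-survival (payload not found)",
}


def describe_exit(code):
    """Translate a hemlock exit code into a human-readable message."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")
```

In practice the argument would be the `returncode` of a `subprocess.run([...])` call. Treating code 2 separately lets a pipeline distinguish "the payload did not survive extraction" from a genuine failure.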

Shell completion

hemlock supports shell completion for Bash, Zsh, Fish, and PowerShell via the underlying Cobra framework:

hemlock completion bash > /etc/bash_completion.d/hemlock
hemlock completion zsh > "${fpath[1]}/_hemlock"

JSON output

Most commands accept --json to emit structured output instead of the human-readable summary, suitable for pipeline integration.