
CLI Reference

hemlock ships as a single static binary organized into three engagement modes plus a top-level set of primitives. All commands write to stdout or to an output directory and return zero on success.

hemlock <command> [flags]
hemlock <mode> <subcommand> [flags]

Mode-prefixed subcommands that share a name with a top-level command are aliases of that command with an engagement-scoping warning added. attack-mode subcommands emit a "running unscoped" warning unless ./.hemlock-engagement.toml is present, HEMLOCK_UNSCOPED_OK=1 is set, or --unscoped is passed to hemlock attack. defend and research modes are unscoped by default.
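The three conditions that silence the warning can be restated as a small shell check. This is only an illustration of the rule as documented above, not hemlock's internal logic; the function name warning_suppressed is hypothetical.

```shell
# Illustrative restatement of the documented rule; not hemlock's own code.
# Returns 0 when an attack-mode command would run without the
# "running unscoped" warning, 1 when the warning would be emitted.
warning_suppressed() {
  # Condition 1: an engagement context exists in the current directory.
  [ -f ./.hemlock-engagement.toml ] && return 0
  # Condition 2: the operator opted out through the environment.
  [ "${HEMLOCK_UNSCOPED_OK:-}" = "1" ] && return 0
  # Condition 3: --unscoped appears among the arguments.
  for arg in "$@"; do
    [ "$arg" = "--unscoped" ] && return 0
  done
  return 1
}
```

In a CI job that intentionally runs outside an engagement, exporting HEMLOCK_UNSCOPED_OK=1 once is usually less noisy than repeating the flag on every invocation.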


Top-level primitives

The original generation and listing commands, callable directly without a mode prefix.

Command          Description
craft            Generate poisoned documents for a single format and technique
batch            Generate poisoned documents across all formats and all techniques
validate         Simulate RAG extraction and check whether a payload survives
run              Fire crafted documents at a target RAG pipe and capture per-document outcomes
score            Compute composite stealth/survival rating for a technique
list-techniques  Display available hiding techniques with stealth scores
list-payloads    Display available payload templates and descriptions
stack            Bring up / tear down the canonical four-framework RAG battery (Docker-compose)

Mode: attack

Offensive operations. Engagement-scoped output suitable for engagement-grade deliverables.

Subcommand                   Description
attack craft                 Same as craft, with engagement-scoping warning
attack batch                 Same as batch, with engagement-scoping warning
attack run                   Same as run, with engagement-scoping warning
attack score                 Same as score, with engagement-scoping warning
attack optimize              Run CEM / genetic / whitebox bound to a threat-model declaration
attack report                Render run-results JSONL as markdown / HTML / SARIF
attack engagement init <id>  Create a directory-local engagement context (./.hemlock-engagement.toml)
attack engagement show       Print and verify the current engagement context

Mode: defend

Defensive operations. Operator-actionable output.

Subcommand      Description
defend detect   Run strict-canary detection on collected response samples
defend monitor  Real-time detection middleware (HTTP reverse proxy)
defend report   Render run-results JSONL as a defensive operator brief

Mode: research

Research operations. Reproducibility- and rigor-oriented output.

Subcommand               Description
research prereg freeze   SHA-256 lock a pre-registration document
research prereg verify   Re-hash and confirm a frozen document is unchanged
research prereg list     Print the per-user prereg-history file
research deposit build   Build an artifact-evaluation deposit manifest (JSON + Markdown + verifier)
research deposit verify  Re-hash a deposit and report any divergence
research reproduce       Replay a deposit's run-results through the current detector
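
The freeze/verify pair follows the usual hash-lock pattern: record a SHA-256 digest once, then re-hash later and compare. The sketch below shows that pattern with coreutils sha256sum. It is not hemlock's lock format, which this page does not describe; the function names and the .sha256 sidecar file are assumptions made for illustration.

```shell
# Hash-lock pattern with coreutils sha256sum (assumed to be installed).
# Function names and the <path>.sha256 sidecar layout are hypothetical.

# Record the document's digest next to it.
freeze_doc() {
  sha256sum "$1" > "$1.sha256"
}

# Re-hash the document and compare against the recorded digest.
# Returns 0 if unchanged, non-zero if the content differs.
verify_doc() {
  sha256sum -c "$1.sha256" > /dev/null 2>&1
}
```

Because sha256sum stores the path exactly as given, call verify_doc with the same path form (or from the same working directory) that was used for freeze_doc.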

Common workflows

Local end-to-end attack loop

Bring up the four-framework stack locally, generate poisoned documents, fire them at one of the pipes, and render the result:

hemlock stack up
hemlock batch --count 1 --output ./engagement
hemlock run \
  --manifest ./engagement/.hemlock-manifest.json \
  --target langchain \
  --query "What is the answer?" \
  --output ./engagement/run-results.jsonl
hemlock attack report \
  --input ./engagement/run-results.jsonl \
  --format markdown \
  --output ./engagement/deliverable.md

Defensive monitoring

Front a deployment with the detector and watch for canary hits:

hemlock defend monitor \
  --upstream http://your-rag-endpoint:8100 \
  --listen :8200 \
  --audit-log ./monitor-audit.jsonl

Engagement-grade campaign

Initialize an engagement, declare a threat model, run the optimizer with a budget cap, fire the results, and render a SARIF deliverable:

hemlock attack engagement init EX-2026-001
hemlock attack optimize init
# edit ./.hemlock-threatmodel.toml
hemlock attack optimize validate
hemlock attack optimize run \
  --target-query "What is the security policy?" \
  --embed-provider ollama \
  --output ./out
hemlock attack run \
  --manifest ./out/.hemlock-manifest.json \
  --target langchain \
  --query "What is the security policy?" \
  --output ./out/run-results.jsonl
hemlock attack report \
  --input ./out/run-results.jsonl \
  --format sarif \
  --output ./out/results.sarif

Output format

Commands that produce documents print a summary banner to stderr:

[hemlock] Generated 20 documents in ./output
  poisoned-comment-001.html          (stealth: 30)
  poisoned-comment-002.html          (stealth: 30)
  ...
  poisoned-css-hide-005.html         (stealth: 75)

Each line shows the filename and the stealth score (0–100) for the technique used. Higher scores indicate techniques that are harder to detect through visual inspection or basic content filtering.
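
When only the human-readable banner is available, the filenames above a given stealth threshold can be pulled out with awk. This is a minimal sketch that assumes the banner layout shown above stays stable; the function name stealth_at_least is hypothetical, and structured output via --json is likely the sturdier option where a command supports it.

```shell
# Print filenames from a summary banner whose stealth score is >= $1.
# Reads the banner text on stdin; the header and "..." lines are skipped
# because they carry no "(stealth: N)" field.
stealth_at_least() {
  awk -v min="$1" '
    match($0, /\(stealth: [0-9]+\)/) {
      # The digits sit between the 10-character "(stealth: " prefix
      # and the closing parenthesis.
      score = substr($0, RSTART + 10, RLENGTH - 11)
      if (score + 0 >= min + 0) print $1
    }'
}
```

Since the banner is written to stderr, a pipeline has to redirect it first, for example: hemlock batch --output ./output 2>&1 >/dev/null | stealth_at_least 50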

Commands that emit JSONL traces (run) prepend a header row with provenance (hemlock_sha, mode, operator identity, engagement chain SHA-256) so the artifact alone is sufficient to reproduce the run context.
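
Because the provenance header is the first row, it can be separated from the per-document records with standard tools. The helpers below are a sketch with hypothetical names; they assume one JSON object per line and nothing about the header's schema beyond what is stated above.

```shell
# Print only the provenance header (first line) of a run-results file.
jsonl_header() {
  head -n 1 "$1"
}

# Print the per-document records that follow the header.
jsonl_records() {
  tail -n +2 "$1"
}
```

For example, jsonl_records ./engagement/run-results.jsonl | wc -l counts documents without counting the header row.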


Global behavior

  • No network access by default. hemlock never phones home, fetches templates, or resolves external resources. All generation is local. The exceptions: --embed-provider openai|ollama makes outbound API calls for embedding-based similarity scoring; run makes calls to the configured --target pipe; monitor reverse-proxies to the configured --upstream.
  • Deterministic output. Given the same flags and seeds, hemlock produces identical documents. Variant numbering is sequential.
  • Exit codes. 0 on success, 1 on usage or execution errors, 2 on validation non-survival (validate reports payload not found).
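
In scripts, the documented exit codes can be turned into readable labels with a small wrapper. This is a sketch: the function name run_classified is hypothetical, and code 2 is documented above only for validate, so the label for it is meaningful only when wrapping that command.

```shell
# Run any command, then describe its exit status on stderr using the
# exit codes documented above. The original status is passed through.
run_classified() {
  "$@"
  status=$?
  case "$status" in
    0) echo "ok" >&2 ;;
    1) echo "usage or execution error" >&2 ;;
    2) echo "payload did not survive validation" >&2 ;;
    *) echo "unexpected exit code $status" >&2 ;;
  esac
  return "$status"
}
```

Prefix an existing invocation with run_classified to use it; the wrapped command's own stdout is left untouched so it can still be piped.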

Shell completion

hemlock supports shell completion for Bash, Zsh, Fish, and PowerShell via the underlying Cobra framework:

hemlock completion bash > /etc/bash_completion.d/hemlock
hemlock completion zsh > "${fpath[1]}/_hemlock"
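
Writing to /etc/bash_completion.d normally requires root. Where the bash-completion package is in use, a per-user install is an alternative; the sketch below relies on that package's documented user directory lookup, and the function name install_bash_completion is hypothetical.

```shell
# Install Bash completion for the current user only.
# bash-completion looks for per-command files under
# $BASH_COMPLETION_USER_DIR/completions, which defaults to
# $XDG_DATA_HOME/bash-completion/completions, which in turn defaults to
# ~/.local/share/bash-completion/completions.
install_bash_completion() {
  dir="${BASH_COMPLETION_USER_DIR:-${XDG_DATA_HOME:-$HOME/.local/share}/bash-completion}/completions"
  # The file must be named after the command it completes.
  mkdir -p "$dir" && hemlock completion bash > "$dir/hemlock"
}
```

Open a new shell afterwards so the completion file is picked up.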

JSON output

Most commands accept --json to emit structured output instead of the human-readable summary, suitable for pipeline integration.