
hemlock run

Closes the craft → fire → measure loop. Reads a manifest (.hemlock-manifest.json) produced by craft or batch, then for each document:

  1. POST /ingest (multipart) with the document file
  2. POST /query with the configured query
  3. Run the strict-canary detector against the response
  4. Write one JSONL row per document
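The four steps above can be sketched as a simple loop. This is a minimal illustration and not hemlock's actual implementation: the `ingest`, `query`, and `detect` callables are hypothetical stand-ins for the two HTTP calls and the embedded detector, and every row field other than `type` is an assumed placeholder name.

```python
import io
import json

def run_documents(docs, ingest, query, detect, query_text, out):
    """Fire each document through ingest -> query -> detect, writing one JSONL row each.

    ingest/query/detect are hypothetical callables supplied by the caller.
    """
    rows = 0
    for doc in docs:
        ingest_result = ingest(doc)       # step 1: stands in for POST /ingest (multipart)
        response = query(query_text)      # step 2: stands in for POST /query
        verdict = detect(response)        # step 3: stands in for the strict-canary detector
        row = {
            "type": "result",             # row type named in the docs
            "document": doc,              # assumed field name
            "ingest": ingest_result,      # assumed field name
            "response": response,         # assumed field name
            "detected": verdict,          # assumed field name
        }
        out.write(json.dumps(row) + "\n") # step 4: one JSONL row per document
        rows += 1
    return rows

# Usage with in-memory fakes instead of a real pipe.
buf = io.StringIO()
count = run_documents(
    docs=["a.txt", "b.html"],
    ingest=lambda doc: {"status": "ok"},
    query=lambda q: "no canary here",
    detect=lambda text: "CANARY" in text,
    query_text="What is the answer?",
    out=buf,
)
lines = buf.getvalue().splitlines()
```

Passing the three stages in as callables keeps the loop testable without a running pipe; the real command performs the HTTP requests itself.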

Output is suitable for hemlock attack report and hemlock research reproduce downstream. Aliased under hemlock attack run for engagement-mode framing.

Synopsis

hemlock run --manifest <path> --target <url> [flags]
hemlock attack run [same flags]

Flags

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --manifest | string | ./output/.hemlock-manifest.json | Path to a .hemlock-manifest.json produced by craft or batch |
| --docs-dir | string | (parent of --manifest) | Directory containing the document files referenced by the manifest |
| --target | string | (required) | Pipe URL, or one of: langchain, llamaindex, unstructured, haystack |
| --query | string | (manifest's target_query) | Query to fire after each ingest |
| --collection | string | hemlock_run_<unix-ts> | ChromaDB collection name |
| --per-doc-collection | bool | false | Use a fresh collection per document (avoids cross-doc retrieval interference) |
| --top-k | int | 5 | Retrieval k passed to /query |
| --limit | int | 0 | Fire at most N documents (0 = all) |
| --filter-format | string | | Only fire documents of this format (txt, html, ...) |
| --filter-technique | string | | Only fire documents using this hiding technique |
| --canary-version | string | v1 | Canary registry version used by the embedded detector |
| --payload-category | string | (per-doc payload) | Indicator-list category to use for detection |
| --output | string | ./run-results-<unix-ts>.jsonl | JSONL output path |
| --timeout | int | 60 | Per-HTTP-request timeout in seconds |
| --scope-check | bool | true | Abort if an engagement context is in scope and --target is not in scope.allowed_targets |
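The computed defaults in the flag table (--docs-dir, --collection, --output) can be illustrated with a short sketch. It assumes that the collection name and the output file share one Unix timestamp taken at startup, which the reference does not state; `default_settings` is a hypothetical helper, not part of hemlock.

```python
import time
from pathlib import Path

def default_settings(manifest="./output/.hemlock-manifest.json", now=None):
    """Derive the documented defaults from the manifest path and a Unix timestamp."""
    ts = int(time.time() if now is None else now)
    return {
        "docs_dir": str(Path(manifest).parent),   # parent of --manifest
        "collection": f"hemlock_run_{ts}",        # hemlock_run_<unix-ts>
        "output": f"./run-results-{ts}.jsonl",    # ./run-results-<unix-ts>.jsonl
    }

# Fixed timestamp so the result is deterministic.
settings = default_settings(now=1700000000)
```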

Target shortcuts

When --target is one of the canonical pipe names, the URL resolves to localhost on the corresponding port:

| Shortcut | Resolves to |
| --- | --- |
| langchain | http://localhost:8100 |
| llamaindex | http://localhost:8101 |
| unstructured | http://localhost:8102 |
| haystack | http://localhost:8103 |

Any other value is taken verbatim as a base URL.
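The resolution rule amounts to a lookup with a pass-through fallback. The sketch below mirrors the shortcuts listed above; `resolve_target` is a hypothetical name, and exact, case-sensitive matching is an assumption.

```python
# Shortcut names and ports as listed in this section.
SHORTCUTS = {
    "langchain": "http://localhost:8100",
    "llamaindex": "http://localhost:8101",
    "unstructured": "http://localhost:8102",
    "haystack": "http://localhost:8103",
}

def resolve_target(target: str) -> str:
    """Return the localhost URL for a known shortcut, else the value unchanged."""
    return SHORTCUTS.get(target, target)

local = resolve_target("langchain")
remote = resolve_target("http://205.196.19.135:8100")
```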

Output format

The JSONL stream begins with one header row (type=header) recording provenance: hemlock SHA, mode, operator identity, engagement ID and chain SHA-256, target, query, collection, top-k, manifest path, and canary registry version. Each subsequent row is a result row (type=result) containing the document reference, ingest status and chunks, query response and sources, latency, and the per-document strict-canary verdict.

The header row alone is sufficient to reproduce the run context; the per-row verdicts are computed by the embedded detector at the registry version pinned in the header.
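A consumer of this file can separate the header from the result rows by the `type` field. The following is a minimal sketch: only the `type` values `header` and `result` come from this page, and the other field names in the sample lines are illustrative placeholders rather than the real schema.

```python
import json

def split_run_results(lines):
    """Split JSONL lines into (header_row, list_of_result_rows) by their type field."""
    header = None
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # tolerate blank lines
        row = json.loads(line)
        if row.get("type") == "header":
            header = row
        elif row.get("type") == "result":
            results.append(row)
    return header, results

# Sample lines; "target", "top_k", and "document" are assumed field names.
sample = [
    '{"type": "header", "target": "http://localhost:8100", "top_k": 5}',
    '{"type": "result", "document": "doc-001.txt"}',
    '{"type": "result", "document": "doc-002.html"}',
]
header, results = split_run_results(sample)
```

With a real run, the same function can be fed an open file handle, since iterating a text file yields one line at a time.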

Engagement scoping

When run inside a directory containing ./.hemlock-engagement.toml, the target's host:port must appear in scope.allowed_targets. Pass --scope-check=false to override (recorded in the JSONL header for audit).
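The check described here compares the target's host:port against an allow-list. The sketch below shows one way that comparison could work; the helper names are hypothetical, the allow-list is passed in directly instead of being read from the TOML file, and the handling of URLs without an explicit port is an assumption.

```python
from urllib.parse import urlsplit

def host_port(target_url: str) -> str:
    """Extract host:port from a base URL (host only when no port is given)."""
    parts = urlsplit(target_url)
    if parts.port is None:
        return parts.hostname or ""
    return f"{parts.hostname}:{parts.port}"

def target_in_scope(target_url: str, allowed_targets) -> bool:
    """True when the target's host:port appears in the allow-list."""
    return host_port(target_url) in allowed_targets

# Stand-in for the scope.allowed_targets entries of an engagement file.
allowed = ["localhost:8100"]
in_scope = target_in_scope("http://localhost:8100", allowed)
out_of_scope = target_in_scope("http://205.196.19.135:8100", allowed)
```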

Examples

Smoke-test an entire batch against the local LangChain pipe:

hemlock batch --count 1 --output ./engagement
hemlock run \
  --manifest ./engagement/.hemlock-manifest.json \
  --target langchain \
  --query "What is the answer?" \
  --output ./engagement/run-results.jsonl

Sample output (stderr):

[hemlock run] 5 documents fired
  ingest:     5 ok, 0 failed
  query:      5 ok, 0 failed
  detected:   1 (high-confidence: 1)
  output:     /abs/path/run-results.jsonl

Fire only documents that use the chunk-boundary technique at a remote pod, isolating each in its own collection:

hemlock run \
  --manifest ./output/.hemlock-manifest.json \
  --target http://205.196.19.135:8100 \
  --query "What should the model do?" \
  --filter-technique chunk-boundary \
  --per-doc-collection \
  --output ./run-pod.jsonl