Skip to content

hemlock research deposit

Build an artifact-evaluation (AE) deposit manifest from a staged deposit tree, and verify a manifest later by re-hashing every file. The manifest format is a JSON file pinning per-file SHA-256, plus a Markdown companion and an optional portable shell verifier.

Synopsis

hemlock research deposit build <deposit-root> [flags]
hemlock research deposit verify <deposit-root> [flags]

build

Walk a deposit root, hash every file (subject to skip rules), compute per-bundle aggregate hashes for directories under <root>/data/, and emit:

  • <root>/deposit-manifest.json — JSON manifest with one entry per file
  • <root>/MANIFEST.md — Markdown companion (human-readable)
  • (optional) <root>/verify-ae.sh — portable shell script that re-hashes every file and reports divergences
Flag Type Default Description
--json string <root>/deposit-manifest.json Override JSON output path
--md string <root>/MANIFEST.md Override Markdown output path
--no-md bool false Skip Markdown emission entirely. Use when the deposit ships a hand-curated MANIFEST.md (e.g., one with a Data-inventory section, Pre-registration block, or Reproducibility statement) that the auto-generated companion does not preserve. Mutually exclusive with --md.
--verifier string Shell-verifier output path. Pass - to use the default <root>/verify-ae.sh. Omit to skip verifier emission.
--operator string Override operator identity recorded in the manifest

verify

Re-hash every file recorded in <root>/deposit-manifest.json (or --manifest <path>) and report mismatched, missing, or unexpected files. Returns non-zero on any divergence.

Flag Type Default Description
--manifest string <root>/deposit-manifest.json Path to the manifest

Manifest shape

The JSON manifest is a flat structure:

{
  "schema_version": 1,
  "generated_at": "2026-04-29T18:32:38Z",
  "root": "/abs/path/to/deposit",
  "operator_identity": "...",
  "hemlock_sha": "...",
  "hemlock_version": "...",
  "total_files": 70,
  "total_bytes": 11534336,
  "files": [
    {
      "path": "Dockerfile.repro",
      "sha256": "7f0218c1...",
      "size_bytes": 2241
    },
    ...
  ],
  "bundles": [
    {
      "name": "validation-4.6-mx3-pod-72b-fp8-fullfw-r1",
      "file_count": 8,
      "total_bytes": 504219,
      "aggregate_sha256": "..."
    },
    ...
  ]
}

Bundle aggregate SHAs match find . -type f | sort | xargs shasum -a 256 | shasum -a 256 so a reviewer can re-derive the same value with standard shell tools without running hemlock.

Skip rules

build skips:

  • The manifest files it would generate (deposit-manifest.json, deposit-manifest.md, verify-deposit.sh)
  • .DS_Store
  • .git/, __pycache__/, .pytest_cache/, .mypy_cache/ (anywhere in the tree)

Examples

hemlock research deposit build ./paper/artifact/paper-a \
  --verifier - \
  --operator "Nathan Keys <nathan.keys@pm.me>"

hemlock research deposit verify ./paper/artifact/paper-a
#   [hemlock] verified 70 files, 0 mismatches, 0 missing, 0 unexpected