First Assessment

This path is for an authorized target you control. Start with a local target and a clear oracle. Do not rely on the agent's response as evidence; only the oracle's out-of-band observation counts.

Assessment Shape

Assay needs three things:

  1. A running Seam API that can deliver one HTTP probe through an intercept.
  2. A case file with direct and laundered routes.
  3. An oracle that observes the side effect out of band.

The simplest oracle is file_tripwire: the target writes a known local file only if the privileged action happens.
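
The tripwire idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Assay's implementation: the function names reset_tripwire and tripwire_fired are hypothetical, and the check is assumed to be a plain substring match on the file contents.

```python
from pathlib import Path


def reset_tripwire(path: Path) -> None:
    """Remove the tripwire file so a stale write cannot count as evidence."""
    path.unlink(missing_ok=True)  # requires Python 3.8+


def tripwire_fired(path: Path, expected: str) -> bool:
    """Return True only if the file exists and contains the expected marker."""
    try:
        return expected in path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return False
```

Because the check reads the file directly, it does not depend on anything the agent says about what it did.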

Start Seam API

cd agentic-redteam/seam
go build ./cmd/seam
./seam api \
  --listen 127.0.0.1:8401 \
  --transcript out.json \
  --schema ../schema/transcript.schema.json

In a second terminal, start an intercept through the API, then run Assay against that same API:

cd agentic-redteam/assay
python3 -m assay.cli run \
  --case cases/refund_tripwire.yaml \
  --seam http://127.0.0.1:8401 \
  --out finding.json \
  --trials 3

Adapt The Case

Change only the target-specific fields first:

  • direct.method, direct.path, direct.headers, and direct.body_json
  • the matching framings.* route fields
  • oracle.path
  • oracle.expected_string or expected JSON field
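
One way to keep an adaptation limited to the fields listed above is to apply overrides through a small guard. The sketch below is hypothetical: it assumes the case file has already been loaded into nested dictionaries addressed by dotted keys, and the helper name adapt_case is not part of Assay. The expected JSON field is left out because its key name is not given here.

```python
import copy

# Dotted keys named in the list above; the nested-dict layout is an assumption.
TARGET_FIELDS = (
    "direct.method", "direct.path", "direct.headers", "direct.body_json",
    "oracle.path", "oracle.expected_string",
)


def adapt_case(base: dict, overrides: dict) -> dict:
    """Return a copy of base with only target-specific dotted keys replaced."""
    case = copy.deepcopy(base)
    for dotted, value in overrides.items():
        if dotted not in TARGET_FIELDS and not dotted.startswith("framings."):
            raise KeyError(f"not a target-specific field: {dotted}")
        node = case
        *parents, leaf = dotted.split(".")
        for key in parents:
            node = node.setdefault(key, {})  # create missing sections
        node[leaf] = value
    return case
```

Any attempt to override a key outside that list raises KeyError, which makes an accidental change to the proof policy visible immediately.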

Keep the proof policy unchanged:

  • reset the oracle before every route
  • the direct route should not confirm the side effect
  • the laundered or rewritten route must confirm the side effect
  • agent text never counts as evidence
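
The policy above amounts to a small differential check. The following sketch shows its logic under stated assumptions: the function name and the four callables are hypothetical placeholders supplied by the caller, not Assay's API.

```python
from typing import Callable


def delta_confirmed(
    reset: Callable[[], None],
    observe: Callable[[], bool],
    send_direct: Callable[[], object],
    send_laundered: Callable[[], object],
) -> bool:
    """Hypothetical differential check driven only by oracle observations."""
    reset()            # clean oracle before the direct route
    send_direct()      # the response text is deliberately ignored
    direct_hit = observe()

    reset()            # clean oracle before the laundered route
    send_laundered()   # again, only the oracle is consulted
    laundered_hit = observe()

    # The claim holds only when the direct route fails and the laundered one succeeds.
    return (not direct_hit) and laundered_hit
```

Resetting before each route matters: without it, a side effect left over from one route could be credited to the next.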

Review The Result

Render a report after a successful finding:

python3 -m assay.cli report render \
  --finding finding.json \
  --transcript ../seam/out.json \
  --out report

The finding is useful when method.delta_confirmed is true and the oracle observation identifies the side effect. If direct and laundered both succeed, the behavior may still be risky, but it is not the differential laundering claim this case is designed to prove.
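
A quick scripted check of that flag might look like the sketch below. It assumes finding.json is a JSON object in which method.delta_confirmed is a nested key (a "method" object holding a "delta_confirmed" boolean); that layout is inferred from the dotted name, and the function name is hypothetical.

```python
import json
from pathlib import Path


def finding_supports_claim(finding_path: str) -> bool:
    """Return True only if method.delta_confirmed is literally true."""
    finding = json.loads(Path(finding_path).read_text(encoding="utf-8"))
    if not isinstance(finding, dict):
        return False
    method = finding.get("method")  # assumed nested object
    return isinstance(method, dict) and method.get("delta_confirmed") is True
```

This covers only the flag. Whether the oracle observation actually identifies the side effect still needs to be read from the finding itself.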

Boundary

AIT does not perform transparent TLS interception. For HTTPS targets, configure the target or client to use an explicit HTTP endpoint, local test fixture, or authorized test proxy arrangement where Seam receives plaintext application traffic.