Operator Experience

AIT needs two operator surfaces:

Seam operator CLI: rich live controls for in-path offensive work.
AIT workbench CLI: cross-tool workflows that call Seam, meshmapper, and Assay.

This split keeps each tool easy to use without turning any one of them into a hidden monolith.

Responsibility Split

| Surface | Owns | Does Not Own |
| --- | --- | --- |
| Seam | capture, proxy, stdio wrapping, active rules, session status, transcript verification, rule diagnostics | Assay validation, meshmapper graph inference, multi-tool reports |
| Assay | optional validation probes, deterministic payload sweeps, direct-vs-laundered trials, oracles, findings, report rendering | transport interception, graph inference, hot-path adaptive generation |
| meshmapper | artifact ingestion, graph refs, unvalidated hypotheses, target suggestions | transport interception, validation |
| `ait` | run directories, lab workflows, tool process supervision, artifact collection, operator guidance | reimplementing Seam/Assay/meshmapper internals |

Seam Operator CLI

The existing low-level commands stay stable:

seam tap ...
seam proxy ...
seam stdio tap ...
seam stdio proxy ...
seam api ...
seam robustness run ...
seam transcript verify ...

M7.2a adds higher-level commands for offensive operations:

seam doctor
seam rules list
seam rules test --rules rules/ --fixture examples/a2a-agent-card.json
seam rules trace --transcript out.json
seam rules explain --rules rules/ --rule a2a_prompt_laundering_replace
seam transcript inspect --transcript out.json --schema schemas/transcript.schema.json
seam transcript redact --transcript out.json --out out.redacted.json --schema schemas/transcript.schema.json
seam session status --transcript out.json
seam session tail --transcript out.json --limit 5
seam profile list
seam profile run lab --mode proxy --upstream http://127.0.0.1:8500 --rules rules/

Profile commands expand visible manifests such as local-safe, authorized-range, and lab into ordinary flags. They do not create hidden restrictions. When an explicit flag conflicts with a profile value, the explicit flag wins and Seam warns.
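The precedence rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PROFILES` contents, the flag names, and the `expand_profile` helper are hypothetical and are not taken from Seam's actual manifests or code.

```python
import warnings

# Hypothetical manifest contents; the real profile manifests are not shown here.
PROFILES = {
    "lab": {"bind": "127.0.0.1", "mode": "tap"},
}


def expand_profile(name, explicit_flags):
    """Expand a profile into ordinary flags.

    Explicit flags always win; a warning is emitted when an explicit flag
    overrides a different value supplied by the profile.
    """
    merged = dict(PROFILES[name])
    for key, value in explicit_flags.items():
        if key in merged and merged[key] != value:
            warnings.warn(
                f"explicit --{key}={value!r} overrides "
                f"profile {name!r} value {merged[key]!r}"
            )
        merged[key] = value
    return merged


if __name__ == "__main__":
    # The explicit mode replaces the profile's mode and triggers one warning.
    print(expand_profile("lab", {"mode": "proxy", "rules": "rules/"}))
```

Because the result is an ordinary flag dictionary, nothing is enforced beyond what the operator could have typed by hand, which matches the "no hidden restrictions" intent.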

`seam session status`, `seam session tail`, and `seam rules explain` are now part of the operator loop. `seam transcript redact` creates report-safe derivative copies; the raw transcripts remain the source evidence.

Operational requirements:

loopback data-plane default
explicit remote bind opt-in
clear no-transparent-TLS warning
per-run API token
safe run directory defaults
visible rule match/rewrite counts
clear failure when an expected offensive rule never matched
transcript hash verification and concise decoded summaries
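Two of these requirements lend themselves to a short sketch: refusing a non-loopback bind unless the operator opts in, and failing loudly when an expected rule never matched. The function names, the `allow_remote` parameter, and the shape of `match_counts` below are assumptions made for illustration; they do not describe Seam's real interfaces.

```python
import ipaddress


def is_loopback(host):
    """Return True for 'localhost' or a literal loopback IP address."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat as non-loopback.
        return False


def check_bind(host, allow_remote=False):
    """Reject non-loopback binds unless remote binding was explicitly enabled."""
    if not is_loopback(host) and not allow_remote:
        raise ValueError(f"refusing to bind {host!r} without explicit remote opt-in")
    return host


def unmatched_rules(expected, match_counts):
    """List the expected rule names whose recorded match count is zero or missing."""
    return [rule for rule in expected if match_counts.get(rule, 0) == 0]


def require_matches(expected, match_counts):
    """Raise when any expected rule never matched during the session."""
    missing = unmatched_rules(expected, match_counts)
    if missing:
        raise RuntimeError("expected rules never matched: " + ", ".join(missing))
```

A caller would run `check_bind` before opening the listener and `require_matches` at the end of a session, so a silent no-op run surfaces as an error instead of an apparently clean result.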

AIT Workbench CLI

ait M2 calls public tool surfaces and collects their artifacts:

ait doctor
ait lab list
ait lab run langgraph-refund --trials 3
ait lab run content-decision --trials 3
ait lab run full-agent-mesh --trials 3
ait lab run full-agent-mesh --scenario content_rewrite --trials 1
ait run inspect --run .ait/runs/<run-id>
ait capture --upstream http://127.0.0.1:8500
ait assess --case cases/refund_family.yaml --seam http://127.0.0.1:8401
ait report --run .ait/runs/<run-id>

Run layout:

.ait/runs/<timestamp>-<slug>/
  run.json
  logs/
  seam/
  assay/
  meshmapper/
  transcripts/
  graphs/
  paths/
  findings/
  reports/
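
As a rough illustration of how this layout could be created, the sketch below builds the directory tree with `pathlib`. The timestamp format and the `create_run_dir` helper are assumptions; the document only specifies the `<timestamp>-<slug>` naming pattern and the subdirectory names.

```python
from datetime import datetime, timezone
from pathlib import Path

# Subdirectories taken from the run layout above; run.json is written separately.
SUBDIRS = [
    "logs", "seam", "assay", "meshmapper", "transcripts",
    "graphs", "paths", "findings", "reports",
]


def create_run_dir(root, slug, now=None):
    """Create <root>/<timestamp>-<slug>/ with the documented subdirectories."""
    now = now or datetime.now(timezone.utc)
    # Assumed timestamp format (UTC, compact ISO 8601 style).
    run_dir = Path(root) / f"{now:%Y%m%dT%H%M%SZ}-{slug}"
    # exist_ok=False so an existing run is never silently reused.
    run_dir.mkdir(parents=True, exist_ok=False)
    for name in SUBDIRS:
        (run_dir / name).mkdir()
    return run_dir
```

Calling `create_run_dir(".ait/runs", "full-agent-mesh")` would produce a fresh, empty run directory ready for the tools to write into.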

Workbench M2 guarantees:

preserve every underlying command in run.json
capture stdout/stderr logs for each underlying tool command
store artifact paths and checksums
record lab id, scenario, trial count, exit status, and important artifact paths
run the LangGraph refund, content-decision, and full-agent-mesh lab scripts through public shell entrypoints
summarize run status with ait run inspect
never treat agent self-report as proof
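
The first three guarantees (preserved commands, captured logs, and artifact checksums) can be sketched as follows. The `run.json` schema, the log file naming, and the `run_and_record` helper are hypothetical and shown only to make the idea concrete; the actual `ait` manifest format is not described in this document.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def sha256_file(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_and_record(run_dir, name, argv):
    """Run one tool command, capture its logs, and append an entry to run.json."""
    run_dir = Path(run_dir)
    logs = run_dir / "logs"
    logs.mkdir(parents=True, exist_ok=True)
    out_path = logs / f"{name}.stdout.log"
    err_path = logs / f"{name}.stderr.log"
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        proc = subprocess.run(argv, stdout=out, stderr=err)

    entry = {
        "name": name,
        "argv": list(argv),  # the exact underlying command is preserved
        "exit_status": proc.returncode,
        "artifacts": [
            {"path": p.relative_to(run_dir).as_posix(), "sha256": sha256_file(p)}
            for p in (out_path, err_path)
        ],
    }

    # Assumed manifest shape: {"commands": [entry, ...]}.
    manifest_path = run_dir / "run.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    else:
        manifest = {"commands": []}
    manifest["commands"].append(entry)
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return entry
```

Recording the exit status and checksums from the supervising process, instead of trusting what a tool or agent reports about itself, is in line with the last guarantee in the list.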

Offensive Usability Targets

The current offensive arc delivered:

Seam operator diagnostics and rule tracing.
mutate.replace with captures/templates for prompt-injection laundering and value echoing.
An Assay technique library and assay craft generator that turns intent plus variables into saved case-family artifacts.
Lab L5, a real-deciding deterministic target where success depends on rewritten content.
ait lab run orchestration that writes run manifests, logs, artifacts, and reports.