Operator Cockpit
The AIT cockpit is the browser workspace for a run directory or a single Seam transcript. It is a React app served by ait, backed by /status, /api/ui-config, and /artifact/...; it reads the same transcripts, graph, paths, findings, reports, and logs that the tools wrote.
Start Live
Start the dashboard before the lab begins:
python3 -m ait.cli workbench lab full-agent-mesh \
--scenario content_rewrite \
--trials 1 \
--serve \
--listen 127.0.0.1:8788
For the polished first demo path:
python3 -m ait.cli demo full-agent-mesh \
--scenario content_rewrite \
--trials 1 \
--serve-live \
--listen 127.0.0.1:8788
The command prints the cockpit URL immediately. Open that URL in a browser and leave the cockpit open while traffic flows.
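To confirm from a script that the cockpit is reachable, one option is to poll the /status endpoint on the listen address. The following is a minimal sketch under assumptions: this page only says that /status exists, so the JSON response shape, the helper names status_url and wait_for_cockpit, and the retry policy are illustrative rather than part of ait.

```python
import json
import time
import urllib.error
import urllib.request


def status_url(listen: str) -> str:
    """Build the /status URL from a host:port listen address."""
    return f"http://{listen}/status"


def wait_for_cockpit(listen: str, attempts: int = 10, delay: float = 0.5):
    """Poll /status until it answers; return the body, or None if it never does.

    Assumption: /status returns JSON. If the body is not JSON, the raw text
    is returned instead so the caller can still inspect it.
    """
    url = status_url(listen)
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                body = resp.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, OSError):
            time.sleep(delay)  # server not ready yet; wait and retry
            continue
        try:
            return json.loads(body)
        except json.JSONDecodeError:
            return body
    return None
```

Calling wait_for_cockpit("127.0.0.1:8788") after starting either command above should return the status payload once the server is listening, and None if it never answers.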
Tabs
| Tab | What It Shows | What Good Looks Like |
|---|---|---|
| Operate | Live A2A/MCP/HTTP feed with search, flow, protocol, direction, rule, and outcome labels. | A2A rows appear; rewritten rows show the rule id, and control 403s are labeled as baseline refusals or blocked attempts. |
| Message | Selected record details, outcome, transcript hash, and before/after decoded values. | The offensive replacement is visible in the after pane. |
| Seam Ops | Listener, upstream, rules, counters, rewrite timeline, parsed diagnostics, equivalent CLI commands, and transcript tail. | Matches and rewrites are nonzero for offensive runs; trace, test, tail, inspect, and verify actions render parsed summaries. |
| Map | Interactive graph visual, hypothesis/node/edge/trust filters, selected node/edge/path details, graph refs, provenance warnings, source refs, and binding scaffold actions. | Suspicious paths are visible as graph routes with traffic evidence and explicit validation status. |
| Validate | Optional Assay board with route delta, trial rows, technique/mutation matrix, negative controls, oracle evidence, confidence fields, transcript refs, and replay artifacts. | Direct fails, laundered succeeds, delta_confirmed is true, and the oracle panel shows the side effect. |
| Artifacts | Grouped report, transcript, graph, finding, robustness, and log files. | All expected files are present and logs explain failures. |
Reading A2A And MCP Traffic
In Operate, filter protocol to a2a to find Agent-to-Agent JSON-RPC messages such as message/send. Rewritten rows include a rule value. Select one, then open Message to compare the decoded before/after text.
MCP traffic appears as JSON-RPC tool calls and tool results. Filter by protocol or search for a tool name to find the planner-to-tool path.
In Lab L6, 403 does not automatically mean the demo failed. The direct route is supposed to refuse a privileged refund, and blocked laundered/control attempts are useful contrast. The proof condition is the pair: baseline/control refusal plus oracle-observed success after the intended rewrite or framing.
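The proof condition above can be written as a small predicate. This is a minimal sketch, not the cockpit's own logic: the Attempt record and its fields (route, http_status, oracle_observed) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    route: str             # hypothetical label: "direct" (baseline/control) or "laundered"
    http_status: int       # status the caller saw, for example 403
    oracle_observed: bool  # True if the oracle recorded the side effect


def proof_condition(attempts: List[Attempt]) -> bool:
    """True only when both halves of the pair are present."""
    # Half 1: the direct route refused and no side effect was observed.
    control_refused = any(
        a.route == "direct" and a.http_status == 403 and not a.oracle_observed
        for a in attempts
    )
    # Half 2: a laundered attempt produced an oracle-observed side effect.
    laundered_succeeded = any(
        a.route == "laundered" and a.oracle_observed for a in attempts
    )
    return control_refused and laundered_succeeded
```

A run that contains only 403s, or only a laundered success with no refused control, does not satisfy the predicate, which mirrors why a lone 403 is neither a failure nor a proof.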
Operate To Map To Validate
Use the cockpit in this order:
- Seam Ops: confirm traffic crossed the listener and the expected rule fired.
- Message: confirm the exact decoded value that changed.
- Map: inspect why the traffic implies a suspicious path.
- Validate: confirm the side effect with oracle evidence only when you need impact proof.
meshmapper hypotheses are intentionally unvalidated. Assay is optional impact validation.
Seam Actions
The Seam Ops tab can launch bounded diagnostics and store their outputs under workbench/actions/seam/:
- trace: run seam rules trace against the selected transcript and rules; the cockpit shows matched records, missed rules, miss reasons, and touched decoded paths.
- test: run seam rules test against a fixture; the cockpit shows expected rule, matched rule, touched paths, and fixture evaluation.
- tail: run a bounded seam session tail with a fixed limit; the cockpit shows recent transcript rows and links them back to Message where possible.
- inspect: run seam transcript inspect --decoded --json; the cockpit shows record counts, protocols/kinds, rules applied, decoded keys, and latest hash.
- verify: run seam transcript verify; the cockpit shows schema/hash-chain status and the first failure when verification fails.
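The verify action reports hash-chain status and the first failure. The Seam transcript schema is not shown on this page, so the sketch below is a generic hash-chain check under assumed field names (payload, prev_hash, hash) and an assumed SHA-256 over canonical JSON. It illustrates how a first-failure index can be located, not how seam transcript verify is implemented.

```python
import hashlib
import json
from typing import List, Optional


def record_hash(payload: dict, prev_hash: str) -> str:
    """Hash the previous link plus a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_record(chain: List[dict], payload: dict) -> None:
    """Append a record that links to the hash of the current last record."""
    prev = chain[-1]["hash"] if chain else ""
    chain.append(
        {"payload": payload, "prev_hash": prev, "hash": record_hash(payload, prev)}
    )


def first_failure(chain: List[dict]) -> Optional[int]:
    """Return the index of the first broken record, or None if the chain holds."""
    prev = ""
    for i, rec in enumerate(chain):
        link_ok = rec["prev_hash"] == prev
        hash_ok = rec["hash"] == record_hash(rec["payload"], prev)
        if not (link_ok and hash_ok):
            return i
        prev = rec["hash"]
    return None
```

Because each record commits to the one before it, editing any earlier record changes its recomputed hash and the check stops at that index, which is why a verifier can report a single first failure instead of a list.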
Use this loop when a rewrite does not happen: trace the transcript, open the missed record in Message, adjust the decoded path or predicate, test the fixture, rerun the proxy, then inspect and verify the final transcript.
Equivalent CLI wrappers are available:
python3 -m ait.cli seam trace \
--run .ait/runs/<run-id> \
--transcript lab/transcripts/edge.json \
--rules lab/rules/edge
Stored action history can be compared and annotated:
python3 -m ait.cli seam actions list --run .ait/runs/<run-id>
python3 -m ait.cli seam actions compare \
--run .ait/runs/<run-id> \
--left trace-1 \
--right test-1
python3 -m ait.cli seam actions note \
--run .ait/runs/<run-id> \
--action-id trace-1 \
--text "Rule fired after predicate match."
Path To Case
The Map tab shows meshmapper path_details when the run has graph.json, paths.json, and transcripts. Select a hypothesis to see source badges, evidence timeline rows, related traffic, graph refs, and proof refs when Assay transcript hashes overlap.
Inspect the same path from a terminal:
python3 -m ait.cli meshmapper inspect-path \
--run .ait/runs/<run-id> \
--hypothesis-id <hypothesis-id>
Use Create validation binding scaffold only as a starting point. The scaffold names the graph hypothesis and likely route shape, but the operator must still fill target-specific routes, payloads, variables, and oracle configuration.
Use Candidate Seam rules when you want to go from a selected path back into
live operation. Each candidate rule row includes a copyable ait map launch
command. Saved launch plans appear in the Map tab and under Artifacts.
Launch A Candidate Rule
Saved launch plans are field-offense workflow records. They show the selected hypothesis, rule family, expected Seam rule id, decoded fields, missing listener or upstream fields, attempts, logs, and linked operate run.
From the cockpit Map tab:
- Select a hypothesis.
- Pick a candidate Seam rule family.
- Create or open the launch plan.
- Click execute when listener/upstream are known, or retry after filling missing fields from the CLI.
- Follow the linked operate run back to Operate and Message to confirm traffic and rewrites.
Equivalent CLI:
python3 -m ait.cli map launch execute \
--run .ait/runs/<run-id> \
--launch-id <launch-id> \
--serve
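The execute-or-retry decision in the steps above depends on whether the plan still has missing listener or upstream fields. A minimal sketch of that check follows; the plan is modeled as a plain dict and the key names listener and upstream are assumptions, since the saved launch plan format is not shown here.

```python
from typing import List, Sequence


def missing_fields(
    plan: dict, required: Sequence[str] = ("listener", "upstream")
) -> List[str]:
    """Return the required keys that are absent or empty in a launch plan."""
    return [key for key in required if not plan.get(key)]


def ready_to_execute(plan: dict) -> bool:
    """A plan is ready only when no required field is missing."""
    return not missing_fields(plan)
```

A plan with only a listener set would report upstream as missing, which corresponds to the case where the fields are filled in from the CLI before retrying.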
Verify Demo Packs
Demo-pack verification is a rule-confidence step before a live run. It validates
descriptor fields, rule files, fixtures, payloads, expected decoded paths, and
negative-control pairing. Seam-focused packs also run fixture-backed
seam rules test and store the result under workbench/demo-verifications/.
python3 -m ait.cli demo verify a2a-agent-card-spoof
Open Artifacts to inspect the latest verification status, expected rule, fixture, touched paths, rule-test log, and negative-control pairing.
Validate Impact Sweep
When a run came from a crafted case family, open Validate and use the subviews:
- Verdict: direct-vs-laundered totals, confidence interval, generation provenance, and the technique-by-mutation matrix.
- Trials: every route execution with framing, probe_id, mutation_id, result, and oracle observation.
- Techniques / Mutations: aggregate survival rates; clicking a row filters the trial table.
- Negative Controls: control probes that should not produce a side effect.
- Oracle Evidence: the tripwire, callback, or privileged-read observation that proves effect.
- Replay Artifacts: finding, report, transcripts, robustness summaries, and saved case-family files.
Use the matrix to find the offensive payload family that worked, then jump to the trial row and transcript refs before opening the report.
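The technique-by-mutation matrix can be pictured as a survival rate per cell. The sketch below is an illustration under assumptions: mutation_id comes from the trial fields listed above, while the technique and oracle_observed keys and the dict-based row shape are hypothetical, and the cockpit may compute its aggregates differently.

```python
from collections import defaultdict


def survival_matrix(trials):
    """Map (technique, mutation_id) to the fraction of trials with an observed effect."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [successes, total]
    for trial in trials:
        cell = (trial["technique"], trial["mutation_id"])
        counts[cell][1] += 1
        if trial["oracle_observed"]:
            counts[cell][0] += 1
    return {cell: ok / total for cell, (ok, total) in counts.items()}


def best_cell(matrix):
    """Return the cell with the highest survival rate (first one wins on ties)."""
    return max(matrix, key=matrix.get)
```

Filtering the trial table by the best cell is then a matter of keeping the rows whose technique and mutation_id match it, which is what clicking a matrix row does in the Trials subview.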
Runtime Assets
The React source lives in web/ait-cockpit/. Runtime assets are committed under ait/static/cockpit/ so normal operators do not need Node installed. To test and rebuild the assets from source:
cd web/ait-cockpit
npm install
npm run test
npm run build
If those assets are missing, ait workbench serve falls back to the older embedded cockpit.