
Runtime L7-L10 Decision Meshes

Labs L7-L10 extend the real-deciding lab pattern to framework-specific deterministic targets. They reuse the Lab L5 content-decision chain but stamp each artifact with framework and runtime metadata, so the rest of the toolkit can exercise cross-framework reporting.

CrewAI-Shaped Decision Mesh

TRIALS=3 bash lab/runtime/l7/crewai_decision_mesh/run_lab.sh

Expected artifacts are written under lab/runtime/l7/crewai_decision_mesh/out/ unless OUT_DIR is supplied. The run emits a Seam transcript, meshmapper graph, path hypotheses, Assay finding, robustness bundle, rendered report, tripwire, expectation report, and framework.json.
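To see which artifacts a run left behind, a small helper can resolve the output directory as described above. This is a minimal sketch: it assumes OUT_DIR is read from the environment and that paths are relative to the repository root. The names `resolve_out_dir` and `list_artifacts` are hypothetical and not part of the toolkit.

```python
import os
from pathlib import Path

# Default output directory for the L7 lab, as stated in the text above.
DEFAULT_OUT = Path("lab/runtime/l7/crewai_decision_mesh/out")


def resolve_out_dir(env=None, default=DEFAULT_OUT):
    """Return OUT_DIR when it is set and non-empty, else the lab's default out/ path."""
    env = os.environ if env is None else env
    override = env.get("OUT_DIR")
    return Path(override) if override else Path(default)


def list_artifacts(out_dir):
    """Return sorted file names directly under out_dir (empty list if it is missing)."""
    out_dir = Path(out_dir)
    if not out_dir.is_dir():
        return []
    return sorted(p.name for p in out_dir.iterdir() if p.is_file())
```

After a run, `list_artifacts(resolve_out_dir())` should include framework.json if the lab completed; the other file names are not spelled out here, so the helper only lists what is present.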

AutoGen-Shaped Decision Mesh

TRIALS=3 bash lab/runtime/l8/autogen_decision_mesh/run_lab.sh

Expected artifacts are written under lab/runtime/l8/autogen_decision_mesh/out/ unless OUT_DIR is supplied. The acceptance pattern is the same as L7: the direct path succeeds in 0 of N trials, the baseline laundered path without rewrite succeeds in 0 of N, and the rewritten laundered path succeeds in more than 0 of N. The run must also produce a verified transcript, an unproven meshmapper hypothesis, and an oracle-backed Assay finding.
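The count-based part of the acceptance pattern can be sketched as a small predicate. This is an illustration only: it assumes each "x/N" figure means successes out of N trials, and the `TrialCounts` field names are hypothetical rather than taken from the lab's expectation report. The transcript, hypothesis, and finding checks are out of scope here.

```python
from dataclasses import dataclass


@dataclass
class TrialCounts:
    """Successes per path out of `trials` runs (field names are illustrative)."""
    trials: int
    direct: int
    baseline_laundered: int
    rewritten_laundered: int


def meets_acceptance_pattern(c: TrialCounts) -> bool:
    """True when direct and baseline never succeed but the rewritten path does."""
    if c.trials <= 0:
        return False
    return (
        c.direct == 0
        and c.baseline_laundered == 0
        and 0 < c.rewritten_laundered <= c.trials
    )
```

With TRIALS=3, counts of 0, 0, and 2 satisfy the pattern, while any direct success or a rewritten count of 0 does not.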

Workbench Runs

python3 -m ait.cli lab run crewai-decision --trials 1
python3 -m ait.cli lab run autogen-decision --trials 1
python3 -m ait.cli lab run openai-agents-decision --trials 1
python3 -m ait.cli lab run microsoft-agent-framework-decision --trials 1

These labs are deterministic and do not call live LLMs by default.
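To run all four workbench labs in one pass, the commands above can be driven from a short script. This is a sketch under assumptions: `build_command` and `run_all` are hypothetical helper names, and the script only collects process exit codes without interpreting them.

```python
import subprocess
import sys

# Lab names exactly as they appear in the commands above.
LABS = [
    "crewai-decision",
    "autogen-decision",
    "openai-agents-decision",
    "microsoft-agent-framework-decision",
]


def build_command(lab, trials=1, python=sys.executable):
    """Build the argv list for one `ait.cli lab run` invocation."""
    return [python, "-m", "ait.cli", "lab", "run", lab, "--trials", str(trials)]


def run_all(labs=LABS, trials=1, runner=subprocess.run):
    """Run each lab in order and return a {lab: exit code} mapping."""
    results = {}
    for lab in labs:
        completed = runner(build_command(lab, trials))
        results[lab] = completed.returncode
    return results
```

The `runner` parameter exists so the loop can be exercised without launching processes; by default it calls `subprocess.run`.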

OpenAI Agents-Style Decision Mesh

TRIALS=3 bash lab/runtime/l9/openai_agents_decision_mesh/run_lab.sh

This lab currently provides a deterministic OpenAI Agents-style runtime label and trace over the L5 proof chain. A RUNTIME_MODE=real adapter is planned for this lab; it will use the installed SDK with local deterministic tools.

Microsoft Agent Framework-Style Decision Mesh

TRIALS=3 bash lab/runtime/l10/microsoft_agent_framework_decision_mesh/run_lab.sh

This lab currently provides a deterministic Microsoft Agent Framework-style runtime label and trace over the L5 proof chain. A RUNTIME_MODE=real adapter is planned for this lab; it will use the installed framework with local deterministic tools.

Runtime Modes

RUNTIME_MODE=stub is the default and runs the deterministic content-decision chain. RUNTIME_MODE=real requires the optional framework package, records framework construct metadata plus a deterministic agent/task/message/tool-call flow in real-runtime.json, and then runs the same proof chain. The trace uses local deterministic tools; no model is called.

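# Run from the repository root. Each subshell keeps its cd local,
# so the second block starts from the root again.
(cd lab/runtime/l7/crewai_decision_mesh &&
  python3 -m pip install -r requirements.txt &&
  RUNTIME_MODE=real TRIALS=1 bash run_lab.sh)
(cd lab/runtime/l8/autogen_decision_mesh &&
  python3 -m pip install -r requirements.txt &&
  RUNTIME_MODE=real TRIALS=1 bash run_lab.sh)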

Real mode is opt-in and still uses deterministic local decisions. It does not make live LLM calls. If the optional package is missing, the adapter exits with an install hint instead of silently falling back to stub mode.
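The mode selection described above (stub by default, real only when the optional package is present, no silent fallback) might look roughly like the sketch below. It is not the adapter's actual code: `resolve_runtime_mode` and `MissingFrameworkError` are hypothetical names, and the import name passed as `package` (for example `crewai`) is an assumption about how the framework is installed.

```python
import importlib.util


class MissingFrameworkError(RuntimeError):
    """Raised when real mode is requested but the optional package is absent."""


def resolve_runtime_mode(env, package, is_installed=None):
    """Return "stub" or "real"; fail loudly instead of falling back to stub."""
    mode = env.get("RUNTIME_MODE") or "stub"  # stub is the default
    if mode not in ("stub", "real"):
        raise ValueError(f"unknown RUNTIME_MODE: {mode!r}")
    if mode == "real":
        # find_spec returns None when a top-level module cannot be located.
        check = is_installed or (
            lambda name: importlib.util.find_spec(name) is not None
        )
        if not check(package):
            raise MissingFrameworkError(
                f"RUNTIME_MODE=real needs the optional package {package!r}; "
                "try: python3 -m pip install -r requirements.txt"
            )
    return mode
```

Raising an error keeps a missing dependency visible: a run requested as real never produces artifacts that were actually generated in stub mode.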

Real mode output includes:

  • real-runtime.json
  • framework.json
  • framework metadata inside expectations.json
  • runtime_trace.tool_calls[] showing the deterministic privileged tool path
  • runtime_trace.verdict with direct false and rewritten/laundered true
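A quick sanity check over the real-mode output could follow the list above. This sketch makes several assumptions that the text does not confirm: that `runtime_trace` is a top-level key of the parsed JSON document, and that the verdict stores its flags under the keys `direct`, `rewritten`, and `laundered`. The function name `check_runtime_trace` is hypothetical.

```python
import json


def check_runtime_trace(doc):
    """Return a list of problems in a parsed real-mode document (empty means OK)."""
    trace = doc.get("runtime_trace")
    if not isinstance(trace, dict):
        return ["missing runtime_trace object"]
    problems = []
    if not trace.get("tool_calls"):
        problems.append("runtime_trace.tool_calls is empty or missing")
    verdict = trace.get("verdict") or {}
    if verdict.get("direct") is not False:
        problems.append("verdict.direct should be false")
    # Assumption: the rewritten/laundered result lives under one or both keys.
    flags = [verdict[k] for k in ("rewritten", "laundered") if k in verdict]
    if not flags or not all(v is True for v in flags):
        problems.append("verdict rewritten/laundered should be true")
    return problems


def check_runtime_file(path):
    """Load a JSON file such as real-runtime.json and run the checks on it."""
    with open(path, encoding="utf-8") as fh:
        return check_runtime_trace(json.load(fh))
```

An empty list means the trace matches the shape described above; any other result names the first mismatches to inspect.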

Smoke Through AIT

Use ait lab smoke-runtime when you want a quick installed-framework check without remembering each lab path. Stub mode always runs the deterministic wrapper. Real mode checks the optional dependency first and records a skipped metadata artifact when it is not installed.

python3 -m ait.cli lab smoke-runtime \
  --lab crewai-decision \
  --runtime-mode stub \
  --trials 1
python3 -m ait.cli lab smoke-runtime \
  --lab autogen-decision \
  --runtime-mode real \
  --trials 1 \
  --json

Smoke metadata records dependency status, runtime mode, live_model: false, produced artifacts, and expectation status.
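When the smoke command is scripted, the metadata can be parsed and the no-live-model guarantee asserted. The sketch below assumes that `--json` prints a single JSON object to standard output; only the `live_model` key comes from the text above, and the helper names are hypothetical.

```python
import json


def build_smoke_command(lab, runtime_mode="stub", trials=1):
    """Build the argv list for one smoke-runtime invocation with JSON output."""
    return [
        "python3", "-m", "ait.cli", "lab", "smoke-runtime",
        "--lab", lab,
        "--runtime-mode", runtime_mode,
        "--trials", str(trials),
        "--json",
    ]


def parse_smoke_metadata(stdout_text):
    """Parse smoke metadata and insist that no live model was used."""
    meta = json.loads(stdout_text)
    if meta.get("live_model") is not False:
        raise ValueError("expected live_model to be false in smoke metadata")
    return meta
```

The command list can be passed to `subprocess.run(..., capture_output=True, text=True)`, with the captured `stdout` handed to `parse_smoke_metadata`.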