Assay Findings

Assay writes proof-carrying findings that validate against finding.schema.json. A finding is complete only when it carries oracle evidence, trial statistics, and Seam transcript references.
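As a rough illustration of the completeness rule, the sketch below checks a finding (loaded as a Python dict) for the three proof components. The function name `missing_proof_parts` is hypothetical. The field names `evidence.observed`, `stats`, and `transcript_refs` are taken from this page, and the authoritative requirements live in finding.schema.json, which is not reproduced here.

```python
def missing_proof_parts(finding: dict) -> list[str]:
    """Return the proof components that are absent or empty.

    Hypothetical helper: this is a quick pre-check, not a replacement
    for validating the finding against finding.schema.json.
    """
    missing = []
    # Oracle evidence: the observed value must be present and non-empty.
    if not finding.get("evidence", {}).get("observed"):
        missing.append("evidence.observed")
    # Trial statistics: an empty or missing stats object counts as missing.
    if not finding.get("stats"):
        missing.append("stats")
    # Seam transcript references: at least one hash is expected.
    if not finding.get("transcript_refs"):
        missing.append("transcript_refs")
    return missing
```

An empty return value means all three components are present; it says nothing about whether their contents satisfy the schema.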

Evidence Rule

The finding's evidence.observed field must come from the oracle, never from an agent claim. If no laundered route succeeds, Assay keeps the oracle's miss summary so reviewers can see what was checked.

Route And Framing Statistics

For compatibility, Assay reports aggregate statistics for the laundered route in these fields:

  • stats.trials
  • stats.success_rate
  • stats.ci_low
  • stats.ci_high

It also reports both routes under stats.by_route.direct and stats.by_route.laundered, including successes, success rates, and confidence intervals.

Confidence intervals use a deterministic Wilson score interval. method.delta_confirmed is true only when the direct route has zero successes and the laundered route has at least one success; findings also include method.verdict_policy so reviewers can see that rule directly.
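The two rules above can be sketched in a few lines of Python. This is a minimal illustration, not Assay's implementation: the function names are hypothetical, the 95% level (z = 1.96) is an assumption because this page does not state the confidence level, and the handling of zero trials is also an assumption.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 (about 95%) is an assumed default; the level Assay uses
    is not stated on this page.
    """
    if trials == 0:
        # Assumption: with no trials, nothing is known about the rate.
        return (0.0, 1.0)
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point drift at the boundaries.
    return (max(0.0, center - half), min(1.0, center + half))


def delta_confirmed(direct_successes: int, laundered_successes: int) -> bool:
    """True only if direct never succeeds and laundered succeeds at least once."""
    return direct_successes == 0 and laundered_successes >= 1
```

Because the interval is a closed-form function of the success and trial counts, the same counts always produce the same bounds, which is what makes it deterministic.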

For framed cases, Assay also reports:

  • method.framings[]
  • method.delta_confirmed_framings[]
  • method.trial_results[].framing
  • method.trial_results[].probe_id
  • method.trial_results[].mutation_id
  • stats.trials_per_framing
  • stats.by_framing.<framing>.direct
  • stats.by_framing.<framing>.laundered
  • evidence.by_framing
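To make the per-framing fields concrete, here is a hedged sketch of how a trial ledger could be grouped by framing and route. The helper names and the inner keys `trials`, `successes`, and `success_rate` are assumptions. Applying the zero-direct, at-least-one-laundered rule per framing to derive `method.delta_confirmed_framings[]` is also an assumption about how that list is populated.

```python
from collections import defaultdict


def stats_by_framing(trial_results: list[dict]) -> dict:
    """Group trial entries by framing, then by route (assumed output shape)."""
    # Each route bucket holds [trial_count, success_count].
    counts = defaultdict(lambda: {"direct": [0, 0], "laundered": [0, 0]})
    for trial in trial_results:
        bucket = counts[trial["framing"]][trial["route"]]
        bucket[0] += 1
        bucket[1] += int(trial["succeeded"])
    return {
        framing: {
            route: {
                "trials": n,
                "successes": s,
                "success_rate": s / n if n else 0.0,
            }
            for route, (n, s) in routes.items()
        }
        for framing, routes in counts.items()
    }


def delta_confirmed_framings(by_framing: dict) -> list[str]:
    """Framings where direct has no successes and laundered has at least one."""
    return sorted(
        framing
        for framing, routes in by_framing.items()
        if routes["direct"]["successes"] == 0 and routes["laundered"]["successes"] >= 1
    )
```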

Robustness scenarios also populate stats.robust_across with condition booleans such as framing:tool_response, route_order:laundered_first, and variables:default.

M6.1 scenarios may also include runtime labels such as runtime:langgraph and model-profile labels such as model_profile:deterministic_no_llm. These are lab-provided metadata, not instructions for Assay to start a runtime.

Case-family sweeps also report:

  • method.case_family
  • method.techniques[]
  • method.mutations[]
  • method.trial_results[].technique_id
  • method.trial_results[].mutation_id
  • stats.by_technique
  • stats.by_mutation
  • stats.technique_mutation_matrix
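One plausible reading of `stats.technique_mutation_matrix` is a two-level table keyed by technique and then mutation. The sketch below builds such a table from ledger entries. The cell shape (`trials`, `successes`) and the choice to count every entry regardless of route are assumptions, and the function name is hypothetical.

```python
def technique_mutation_matrix(trial_results: list[dict]) -> dict:
    """Tally trials and successes per (technique_id, mutation_id) pair.

    Assumed shape: {technique_id: {mutation_id: {"trials": n, "successes": s}}}.
    """
    matrix: dict = {}
    for trial in trial_results:
        row = matrix.setdefault(trial["technique_id"], {})
        cell = row.setdefault(trial["mutation_id"], {"trials": 0, "successes": 0})
        cell["trials"] += 1
        cell["successes"] += int(trial["succeeded"])
    return matrix
```

Summing the cells across a row or a column gives per-technique or per-mutation totals of the kind `stats.by_technique` and `stats.by_mutation` would summarize.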

Trial Ledger

method.trial_results[] records each direct and laundered route attempt:

{
  "trial_index": 0,
  "route": "laundered",
  "succeeded": true,
  "transcript_ref": "sha256:...",
  "evidence": {
    "oracle": "file_tripwire",
    "observed": "{\"account\":\"ATTACKER-CTRL\"}"
  }
}

transcript_refs[] lifts the Seam hashes into the top-level finding so a reviewer can reconnect the result to the captured delivery path.
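The lifting step can be pictured as collecting each trial's `transcript_ref` into one list. In this sketch, de-duplication and first-seen ordering are assumptions, and `lift_transcript_refs` is a hypothetical name.

```python
def lift_transcript_refs(trial_results: list[dict]) -> list[str]:
    """Collect unique transcript hashes from the ledger, in first-seen order."""
    seen: set[str] = set()
    refs: list[str] = []
    for trial in trial_results:
        ref = trial.get("transcript_ref")
        # Skip entries without a reference and hashes already collected.
        if ref and ref not in seen:
            seen.add(ref)
            refs.append(ref)
    return refs
```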

Relationship To Hypotheses

meshmapper hypotheses are suspects, marked proven: false. Assay findings are the proof step. M5-lite can attach a selected hypothesis to a finding through an explicit binding file. The finding records that provenance under method.hypothesis_binding and preserves the hypothesis's owasp_asi, atlas, and graph_ref fields where present.

Use reports to render findings and supporting artifacts into Markdown/HTML review packets.