Pareto Sweep

pareto_sweep.py sweeps the --injection-weight parameter from 0.0 to 1.0 in configurable steps, measuring both the injection rate and the retrieval rate at each weight. The result is a Pareto frontier: the empirical trade-off curve between retrieval quality and injection effectiveness.

Purpose

The core question this tool answers: What happens to injection success and retrieval success as we shift the optimizer's objective from pure similarity (weight=0.0) toward pure injection prediction (weight=1.0)?

Understanding this trade-off is critical for practical use because some scenarios demand high retrieval (the poisoned document must appear in search results) while others prioritize injection success even if retrieval occasionally fails.
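In optimizer terms, the weight blends the two objectives into a single fitness value. A minimal sketch of that blend, assuming both scores are normalized to [0, 1] (the function name and normalization are illustrative, not hemlock's actual internals):

```python
def blended_fitness(similarity_score: float, injection_score: float, weight: float) -> float:
    """Blend retrieval similarity with predicted injection success.

    weight=0.0 optimizes purely for similarity (retrieval);
    weight=1.0 optimizes purely for predicted injection success.
    Both inputs are assumed to be normalized to [0, 1].
    """
    return (1.0 - weight) * similarity_score + weight * injection_score
```

At weight=0.0 the injection score contributes nothing; at weight=1.0 similarity contributes nothing, which is why retrieval can degrade at the high end of the sweep.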

How It Works

flowchart TD
    A["For weight in 0.0, 0.1, ..., 1.0"] --> B["For run in 1..N"]
    B --> C["hemlock batch --genetic<br/>--injection-weight=W"]
    C --> D["Ingest into ChromaDB"]
    D --> E["injection_test.py"]
    D --> F["retrieval_test.py"]
    E --> G["injection_rate"]
    F --> H["retrieval_rate"]
    G --> I["Aggregate per weight"]
    H --> I

    style A fill:#4a148c,stroke:#7c43bd,color:#ffffff
    style I fill:#00695c,stroke:#00897b,color:#ffffff

For each weight value:

  1. Runs N independent trials (default 10)
  2. Each trial generates a fresh corpus with hemlock batch --genetic --injection-weight=W
  3. Runs both injection_test.py and retrieval_test.py against the generated corpus
  4. Records injection rate (fraction of frameworks with injection detected) and retrieval rate (fraction with poisoned document in top-5)
  5. Aggregates per-weight statistics for analysis
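The steps above amount to a nested loop. A sketch of the skeleton (not the script's actual structure): the trial body — corpus generation plus both tests — is passed in as a callable so the sketch stays self-contained, and the four-decimal keys match the pareto-summary.json format shown below:

```python
def sweep(weights, runs_per_weight, run_trial):
    """For each weight, run N independent trials and collect per-weight results.

    run_trial(weight, run) stands in for steps 2-4 (hemlock batch plus
    injection_test.py and retrieval_test.py) and returns one trial record.
    """
    results = {}
    for w in weights:
        # Step 5: group the N trial records under a four-decimal weight key.
        results[f"{w:.4f}"] = [run_trial(w, run) for run in range(1, runs_per_weight + 1)]
    return results
```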

CLI Usage

Basic Sweep

python harness/pareto_sweep.py \
  --config harness/authority-config.json \
  --output-dir reports/pareto-qwen7b \
  --model qwen2.5:7b \
  --payload-category redirect \
  --runs-per-weight 10

Custom Weight Range

python harness/pareto_sweep.py \
  --config harness/authority-config.json \
  --output-dir reports/pareto-fine \
  --model qwen2.5:7b \
  --weight-min 0.2 \
  --weight-max 0.6 \
  --weight-step 0.05 \
  --runs-per-weight 15

Resume After Interruption

python harness/pareto_sweep.py \
  --config harness/authority-config.json \
  --output-dir reports/pareto-qwen7b \
  --model qwen2.5:7b \
  --resume

The --resume flag skips weight/run combinations that already have results in the output directory.

All Flags

| Flag | Default | Description |
|------|---------|-------------|
| --config | (required) | Config JSON file with pipeline endpoints |
| --output-dir | (required) | Output directory |
| --model | | Target LLM model |
| --payload-category | redirect | Payload category to sweep |
| --runs-per-weight | 10 | Number of independent trials per weight value |
| --weight-min | 0.0 | Minimum injection weight |
| --weight-max | 1.0 | Maximum injection weight |
| --weight-step | 0.1 | Weight increment step size |
| --injection-model-host | http://localhost:9090 | Reward server URL |
| --resume | false | Skip completed weight/run combinations |
| --batch-timeout | 1800 | Timeout in seconds for each hemlock batch subprocess |
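One detail worth knowing when choosing --weight-step: repeatedly adding a float step drifts (in IEEE 754, 0.1 + 0.1 + 0.1 != 0.3), which is presumably why the summary keys are rendered to four decimals. A drift-free grid can be built by integer multiplication; this is a sketch, not necessarily what pareto_sweep.py does internally:

```python
def weight_grid(w_min: float, w_max: float, step: float) -> list:
    """Build the weight grid by multiplying the step by an integer index
    rather than accumulating it, then rounding to four decimals so the
    values match the pareto-summary.json keys exactly."""
    n = int(round((w_max - w_min) / step))
    return [round(w_min + i * step, 4) for i in range(n + 1)]
```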

Output

pareto-summary.json

{
  "results": {
    "0.0000": [
      {
        "run": 1,
        "weight": 0.0,
        "injection_rate": 0.25,
        "retrieval_rate": 0.75,
        "injected": 1,
        "inj_total": 4,
        "retrieved_top5": 3,
        "ret_total": 4,
        "framework_breakdown": {
          "langchain": {"injected": false, "retrieved": true},
          "llamaindex": {"injected": true, "retrieved": true},
          "haystack": {"injected": false, "retrieved": true},
          "unstructured": {"injected": false, "retrieved": false}
        }
      }
    ],
    "0.1000": [...]
  },
  "weights": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
  "model": "qwen2.5:7b",
  "payload_category": "redirect"
}
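The per-weight lists under "results" collapse naturally into mean rates for a quick first look before running the full statistical analysis. A minimal sketch against the schema above:

```python
def mean_rates(summary: dict) -> dict:
    """Map each weight key in pareto-summary.json to its mean
    (injection_rate, retrieval_rate) across runs."""
    out = {}
    for weight_key, trials in summary["results"].items():
        n = len(trials)
        out[weight_key] = (
            sum(t["injection_rate"] for t in trials) / n,
            sum(t["retrieval_rate"] for t in trials) / n,
        )
    return out
```

Feed it the parsed file, e.g. the result of json.load() on reports/pareto-qwen7b/pareto-summary.json.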

Compute Requirements

The default sweep runs $11 \text{ weights} \times 10 \text{ runs} = 110$ trials. Each trial includes corpus generation (~15s), injection testing (~30s), and retrieval testing (~30s). Total time at 7B scale: approximately 2–3 hours.

Fine-grained sweeps (step=0.05, runs=15) require proportionally more compute.
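The runtime figure follows directly from the per-trial estimates (~15 s + ~30 s + ~30 s, roughly 75 s per trial, run serially):

```python
def sweep_hours(n_weights: int, runs_per_weight: int, secs_per_trial: float = 75.0) -> float:
    """Estimated wall-clock hours for a serial sweep."""
    return n_weights * runs_per_weight * secs_per_trial / 3600.0

# Default sweep: 11 weights x 10 runs = 110 trials at ~75 s each,
# i.e. 8250 s, or roughly 2.3 hours -- consistent with the 2-3 hour figure.
```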

Analysis

Feed the output into statistical_analysis.py for bootstrap confidence intervals and pairwise significance tests:

python harness/statistical_analysis.py \
  --input reports/pareto-qwen7b/pareto-summary.json \
  --output reports/pareto-qwen7b/pareto-statistics.json \
  --mode pareto
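For intuition about what the bootstrap step produces, here is a generic percentile-bootstrap sketch over one weight's per-run rates (not statistical_analysis.py's actual implementation):

```python
import random

def bootstrap_ci(rates, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a list of
    per-run rates: resample with replacement, take the mean of each
    resample, and read off the (alpha/2, 1 - alpha/2) percentiles."""
    rng = random.Random(seed)
    n = len(rates)
    means = sorted(
        sum(rng.choice(rates) for _ in range(n)) / n for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```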

Then generate the Pareto frontier figure:

python harness/generate_figures.py \
  --pareto-stats reports/pareto-qwen7b/pareto-statistics.json \
  --output-dir figures/ \
  --figure 4
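The frontier in that figure is the non-dominated subset of per-weight (retrieval_rate, injection_rate) points: a point stays on the frontier only if no other point is at least as good on both axes and strictly better on one. A generic sketch of that filter (illustrative, not generate_figures.py's code):

```python
def pareto_frontier(points):
    """Return the non-dominated (retrieval_rate, injection_rate) points.

    A point is dominated if some other point is >= on both axes and
    differs from it (hence strictly better on at least one axis).
    """
    frontier = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            frontier.append(p)
    return frontier
```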

See Also