Pareto Sweep¶
pareto_sweep.py ablates the --injection-weight parameter from 0.0 to 1.0 in configurable steps, measuring both injection and retrieval rates at each weight. The result is a Pareto frontier — the empirical trade-off curve between retrieval quality and injection effectiveness.
Purpose¶
The core question this tool answers: What happens to injection success and retrieval success as we shift the optimizer's objective from pure similarity (weight=0.0) toward pure injection prediction (weight=1.0)?
Understanding this trade-off is critical for practical use because some scenarios demand high retrieval (the poisoned document must appear in search results) while others prioritize injection success even if retrieval occasionally fails.
How It Works¶
flowchart TD
A["For weight in 0.0, 0.1, ..., 1.0"] --> B["For run in 1..N"]
B --> C["hemlock batch --genetic<br/>--injection-weight=W"]
C --> D["Ingest into ChromaDB"]
D --> E["injection_test.py"]
D --> F["retrieval_test.py"]
E --> G["injection_rate"]
F --> H["retrieval_rate"]
G --> I["Aggregate per weight"]
H --> I
style A fill:#4a148c,stroke:#7c43bd,color:#ffffff
style I fill:#00695c,stroke:#00897b,color:#ffffff
For each weight value:
- Runs N independent trials (default 10)
- Each trial generates a fresh corpus with hemlock batch --genetic --injection-weight=W
- Runs both injection_test.py and retrieval_test.py against the generated corpus
- Records injection rate (fraction of frameworks with injection detected) and retrieval rate (fraction with the poisoned document in the top-5 results)
- Aggregates per-weight statistics for analysis (the sketch below outlines this loop)
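The outer structure is two nested loops, weights outside and runs inside. Below is a minimal sketch of that control flow, not the script itself: run_trial is a hypothetical stand-in for the ChromaDB ingestion and the calls into injection_test.py and retrieval_test.py, while the per-weight key format is taken from the pareto-summary.json sample shown later.

```python
import json
import subprocess
from pathlib import Path

def run_trial(weight, run):
    # Hypothetical stand-in: the real script ingests the corpus into ChromaDB,
    # runs injection_test.py and retrieval_test.py, and parses their results
    # into the per-run fields shown in pareto-summary.json.
    return {"run": run, "weight": weight, "injection_rate": None, "retrieval_rate": None}

def sweep(weights, runs_per_weight, output_dir, batch_timeout=1800):
    results = {}
    for w in weights:
        key = f"{w:.4f}"  # per-weight keys look like "0.0000", "0.1000", ...
        trials = []
        for run in range(1, runs_per_weight + 1):
            # Generate a fresh corpus at this injection weight (the documented CLI call).
            subprocess.run(
                ["hemlock", "batch", "--genetic", f"--injection-weight={w}"],
                check=True, timeout=batch_timeout,
            )
            trials.append(run_trial(w, run))
        results[key] = trials
    out = Path(output_dir) / "pareto-summary.json"
    out.write_text(json.dumps({"results": results, "weights": weights}, indent=2))
```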
CLI Usage¶
Basic Sweep¶
python harness/pareto_sweep.py \
--config harness/authority-config.json \
--output-dir reports/pareto-qwen7b \
--model qwen2.5:7b \
--payload-category redirect \
--runs-per-weight 10
Custom Weight Range¶
python harness/pareto_sweep.py \
--config harness/authority-config.json \
--output-dir reports/pareto-fine \
--model qwen2.5:7b \
--weight-min 0.2 \
--weight-max 0.6 \
--weight-step 0.05 \
--runs-per-weight 15
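The weight grid is built from --weight-min, --weight-max, and --weight-step. One way to construct it while avoiding floating-point drift is sketched below; this is an assumption about the construction, not code lifted from the script.

```python
def weight_grid(w_min=0.0, w_max=1.0, step=0.1):
    # Round each weight so 0.1 increments don't accumulate into values like
    # 0.30000000000000004. Assumed behaviour, not the script's exact code.
    n_steps = int(round((w_max - w_min) / step))
    return [round(w_min + i * step, 4) for i in range(n_steps + 1)]

weight_grid(0.2, 0.6, 0.05)  # [0.2, 0.25, 0.3, ..., 0.6] -- the fine-grained sweep above
weight_grid()                # [0.0, 0.1, ..., 1.0]        -- the default 11-point grid
```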
Resume After Interruption¶
python harness/pareto_sweep.py \
--config harness/authority-config.json \
--output-dir reports/pareto-qwen7b \
--model qwen2.5:7b \
--resume
The --resume flag skips weight/run combinations that already have results in the output directory.
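One plausible implementation of that skip logic is to check for an existing per-trial result file before launching any subprocess; the file naming convention below is hypothetical, not the script's documented layout.

```python
from pathlib import Path

def already_done(output_dir, weight, run):
    # Hypothetical per-trial result file; if it exists, the weight/run
    # combination is treated as complete and skipped under --resume.
    return (Path(output_dir) / f"weight-{weight:.4f}-run-{run:02d}.json").exists()
```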
All Flags¶
| Flag | Default | Description |
|---|---|---|
| --config | (required) | Config JSON file with pipeline endpoints |
| --output-dir | (required) | Output directory |
| --model | | Target LLM model |
| --payload-category | redirect | Payload category to sweep |
| --runs-per-weight | 10 | Number of independent trials per weight value |
| --weight-min | 0.0 | Minimum injection weight |
| --weight-max | 1.0 | Maximum injection weight |
| --weight-step | 0.1 | Weight increment step size |
| --injection-model-host | http://localhost:9090 | Reward server URL |
| --resume | false | Skip completed weight/run combinations |
| --batch-timeout | 1800 | Timeout in seconds for each hemlock batch subprocess |
Output¶
pareto-summary.json¶
{
"results": {
"0.0000": [
{
"run": 1,
"weight": 0.0,
"injection_rate": 0.25,
"retrieval_rate": 0.75,
"injected": 1,
"inj_total": 4,
"retrieved_top5": 3,
"ret_total": 4,
"framework_breakdown": {
"langchain": {"injected": false, "retrieved": true},
"llamaindex": {"injected": true, "retrieved": true},
"haystack": {"injected": false, "retrieved": true},
"unstructured": {"injected": false, "retrieved": false}
}
}
],
"0.1000": [...]
},
"weights": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
"model": "qwen2.5:7b",
"payload_category": "redirect"
}
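The per-weight means can be recovered from this file with a few lines of Python; the field names below match the sample above.

```python
import json

with open("reports/pareto-qwen7b/pareto-summary.json") as f:
    summary = json.load(f)

# Average injection and retrieval rates across the runs at each weight.
for key, trials in sorted(summary["results"].items()):
    inj = sum(t["injection_rate"] for t in trials) / len(trials)
    ret = sum(t["retrieval_rate"] for t in trials) / len(trials)
    print(f"weight={key}  mean injection={inj:.2f}  mean retrieval={ret:.2f}")
```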
Compute Requirements¶
The default sweep runs $11 \text{ weights} \times 10 \text{ runs} = 110$ trials. Each trial includes corpus generation (~15s), injection testing (~30s), and retrieval testing (~30s). Total time at 7B scale: approximately 2–3 hours.
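As a rough check using those per-stage estimates: $11 \times 10 \times (15\,\text{s} + 30\,\text{s} + 30\,\text{s}) = 8250\,\text{s} \approx 2.3\,\text{hours}$, consistent with the 2–3 hour figure.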
Fine-grained sweeps (step=0.05, runs=15) require proportionally more compute.
Analysis¶
Feed the output into statistical_analysis.py for bootstrap confidence intervals and pairwise significance tests:
python harness/statistical_analysis.py \
--input reports/pareto-qwen7b/pareto-summary.json \
--output reports/pareto-qwen7b/pareto-statistics.json \
--mode pareto
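For a quick sanity check outside that script, a percentile bootstrap over the per-run injection rates at a single weight looks roughly like the sketch below; the exact resampling procedure in statistical_analysis.py may differ.

```python
import random

def bootstrap_ci(rates, n_boot=10_000, alpha=0.05, seed=0):
    # Generic percentile bootstrap for a mean rate; a sketch, not the
    # procedure used by statistical_analysis.py.
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(rates, k=len(rates))) / len(rates) for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return sum(rates) / len(rates), (lo, hi)
```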
Then generate the Pareto frontier figure:
python harness/generate_figures.py \
--pareto-stats reports/pareto-qwen7b/pareto-statistics.json \
--output-dir figures/ \
--figure 4
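The summary can also be plotted directly with matplotlib for a quick look before running generate_figures.py; this sketch does not reproduce the Figure 4 styling.

```python
import json
import matplotlib.pyplot as plt

with open("reports/pareto-qwen7b/pareto-summary.json") as f:
    summary = json.load(f)

# Mean injection/retrieval rate per weight, sorted by weight.
points = sorted(
    (float(key),
     sum(t["injection_rate"] for t in trials) / len(trials),
     sum(t["retrieval_rate"] for t in trials) / len(trials))
    for key, trials in summary["results"].items()
)

inj = [p[1] for p in points]
ret = [p[2] for p in points]

plt.plot(inj, ret, "o-")
for w, x, y in points:
    plt.annotate(f"{w:.1f}", (x, y), textcoords="offset points", xytext=(4, 4))
plt.xlabel("Injection rate")
plt.ylabel("Retrieval rate (top-5)")
plt.title(f"Pareto frontier ({summary['model']})")
plt.savefig("pareto-frontier-quicklook.png", dpi=150)
```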
See Also¶
- Joint Optimization — the scoring function being ablated
- Statistical Analysis — bootstrap CIs and significance tests
- Figure Generation — Pareto frontier visualization (Figure 4)
- Validation Experiments — controlled A/B comparisons