Pareto Sweep

pareto_sweep.py sweeps the --injection-weight parameter from 0.0 to 1.0 in configurable steps, measuring both the injection rate and the retrieval rate at each weight. The result is a Pareto frontier: the empirical trade-off curve between retrieval quality and injection effectiveness.

Purpose

The core question this tool answers: What happens to injection success and retrieval success as we shift the optimizer's objective from pure similarity (weight=0.0) toward pure injection prediction (weight=1.0)?

Understanding this trade-off is critical for practical use because some scenarios demand high retrieval (the poisoned document must appear in search results) while others prioritize injection success even if retrieval occasionally fails.
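In optimizer terms, the weight blends the two objectives into a single fitness value. A minimal sketch of that blend, assuming both scores are normalized to [0, 1] (the function name and normalization are illustrative, not hemlock's actual internals):

```python
def blended_fitness(similarity_score: float, injection_score: float, weight: float) -> float:
    """Blend retrieval similarity with predicted injection success.

    weight=0.0 optimizes purely for similarity (retrieval);
    weight=1.0 optimizes purely for predicted injection success.
    Both inputs are assumed to be normalized to [0, 1].
    """
    return (1.0 - weight) * similarity_score + weight * injection_score
```

At weight=0.0 the injection score contributes nothing; at weight=1.0 similarity contributes nothing, which is why retrieval can degrade at the high end of the sweep.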

How It Works

flowchart TD
    A["For weight in 0.0, 0.1, ..., 1.0"] --> B["For run in 1..N"]
    B --> C["hemlock batch --genetic<br/>--injection-weight=W"]
    C --> D["Ingest into ChromaDB"]
    D --> E["injection_test.py"]
    D --> F["retrieval_test.py"]
    E --> G["injection_rate"]
    F --> H["retrieval_rate"]
    G --> I["Aggregate per weight"]
    H --> I

    style A fill:#4a148c,stroke:#7c43bd,color:#ffffff
    style I fill:#00695c,stroke:#00897b,color:#ffffff

For each weight value:

  1. Runs N independent trials (default 10)
  2. Each trial generates a fresh corpus with hemlock batch --genetic --injection-weight=W
  3. Runs both injection_test.py and retrieval_test.py against the generated corpus
  4. Records injection rate (fraction of frameworks with injection detected) and retrieval rate (fraction with poisoned document in top-5)
  5. Aggregates per-weight statistics for analysis
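The steps above amount to a nested loop. A sketch of the skeleton (not the script's actual structure): the trial body — corpus generation plus both tests — is passed in as a callable so the sketch stays self-contained, and the four-decimal keys match the pareto-summary.json format shown below:

```python
def sweep(weights, runs_per_weight, run_trial):
    """For each weight, run N independent trials and collect per-weight results.

    run_trial(weight, run) stands in for steps 2-4 (hemlock batch plus
    injection_test.py and retrieval_test.py) and returns one trial record.
    """
    results = {}
    for w in weights:
        # Step 5: group the N trial records under a four-decimal weight key.
        results[f"{w:.4f}"] = [run_trial(w, run) for run in range(1, runs_per_weight + 1)]
    return results
```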

CLI Usage

Basic Sweep

python harness/pareto_sweep.py \
  --config harness/authority-config.json \
  --output-dir reports/pareto-qwen7b \
  --model qwen2.5:7b \
  --payload-category redirect \
  --runs-per-weight 10

Custom Weight Range

python harness/pareto_sweep.py \
  --config harness/authority-config.json \
  --output-dir reports/pareto-fine \
  --model qwen2.5:7b \
  --weight-min 0.2 \
  --weight-max 0.6 \
  --weight-step 0.05 \
  --runs-per-weight 15

Resume After Interruption

python harness/pareto_sweep.py \
  --config harness/authority-config.json \
  --output-dir reports/pareto-qwen7b \
  --model qwen2.5:7b \
  --resume

The --resume flag skips weight/run combinations that already have results in the output directory.

All Flags

| Flag | Default | Description |
|------|---------|-------------|
| --config | (required) | Config JSON file with pipeline endpoints |
| --output-dir | (required) | Output directory |
| --model | | Target LLM model |
| --payload-category | redirect | Payload category to sweep |
| --runs-per-weight | 10 | Number of independent trials per weight value |
| --weight-min | 0.0 | Minimum injection weight |
| --weight-max | 1.0 | Maximum injection weight |
| --weight-step | 0.1 | Weight increment step size |
| --injection-model-host | http://localhost:9090 | Reward server URL |
| --resume | false | Skip completed weight/run combinations |
| --batch-timeout | 1800 | Timeout in seconds for each hemlock batch subprocess |
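One detail worth knowing when choosing --weight-step: repeatedly adding a float step drifts (in IEEE 754, 0.1 + 0.1 + 0.1 != 0.3), which is presumably why the summary keys are rendered to four decimals. A drift-free grid can be built by integer multiplication; this is a sketch, not necessarily what pareto_sweep.py does internally:

```python
def weight_grid(w_min: float, w_max: float, step: float) -> list:
    """Build the weight grid by multiplying the step by an integer index
    rather than accumulating it, then rounding to four decimals so the
    values match the pareto-summary.json keys exactly."""
    n = int(round((w_max - w_min) / step))
    return [round(w_min + i * step, 4) for i in range(n + 1)]
```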

Output

pareto-summary.json

{
  "results": {
    "0.0000": [
      {
        "run": 1,
        "weight": 0.0,
        "injection_rate": 0.25,
        "retrieval_rate": 0.75,
        "injected": 1,
        "inj_total": 4,
        "retrieved_top5": 3,
        "ret_total": 4,
        "framework_breakdown": {
          "langchain": {"injected": false, "retrieved": true},
          "llamaindex": {"injected": true, "retrieved": true},
          "haystack": {"injected": false, "retrieved": true},
          "unstructured": {"injected": false, "retrieved": false}
        }
      }
    ],
    "0.1000": [...]
  },
  "weights": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
  "model": "qwen2.5:7b",
  "payload_category": "redirect"
}
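The per-weight lists under "results" collapse naturally into mean rates for a quick first look before running the full statistical analysis. A minimal sketch against the schema above:

```python
def mean_rates(summary: dict) -> dict:
    """Map each weight key in pareto-summary.json to its mean
    (injection_rate, retrieval_rate) across runs."""
    out = {}
    for weight_key, trials in summary["results"].items():
        n = len(trials)
        out[weight_key] = (
            sum(t["injection_rate"] for t in trials) / n,
            sum(t["retrieval_rate"] for t in trials) / n,
        )
    return out
```

Feed it the parsed file, e.g. the result of json.load() on reports/pareto-qwen7b/pareto-summary.json.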

Compute Requirements

The default sweep runs $11 \text{ weights} \times 10 \text{ runs} = 110$ trials. Each trial includes corpus generation (~15s), injection testing (~30s), and retrieval testing (~30s). Total time at 7B scale: approximately 2–3 hours.

Fine-grained sweeps (step=0.05, runs=15) require proportionally more compute.
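The runtime figure follows directly from the per-trial estimates (~15 s + ~30 s + ~30 s, roughly 75 s per trial, run serially):

```python
def sweep_hours(n_weights: int, runs_per_weight: int, secs_per_trial: float = 75.0) -> float:
    """Estimated wall-clock hours for a serial sweep."""
    return n_weights * runs_per_weight * secs_per_trial / 3600.0

# Default sweep: 11 weights x 10 runs = 110 trials at ~75 s each,
# i.e. 8250 s, or roughly 2.3 hours -- consistent with the 2-3 hour figure.
```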

Analysis

Feed the output into statistical_analysis.py for bootstrap confidence intervals and pairwise significance tests:

python harness/statistical_analysis.py \
  --input reports/pareto-qwen7b/pareto-summary.json \
  --output reports/pareto-qwen7b/pareto-statistics.json \
  --mode pareto
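For intuition about what the bootstrap step produces, here is a generic percentile-bootstrap sketch over one weight's per-run rates (not statistical_analysis.py's actual implementation):

```python
import random

def bootstrap_ci(rates, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a list of
    per-run rates: resample with replacement, take the mean of each
    resample, and read off the (alpha/2, 1 - alpha/2) percentiles."""
    rng = random.Random(seed)
    n = len(rates)
    means = sorted(
        sum(rng.choice(rates) for _ in range(n)) / n for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```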

Then generate the Pareto frontier figure:

python harness/generate_figures.py \
  --pareto-stats reports/pareto-qwen7b/pareto-statistics.json \
  --output-dir figures/ \
  --figure 4
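The frontier in that figure is the non-dominated subset of per-weight (retrieval_rate, injection_rate) points: a point stays on the frontier only if no other point is at least as good on both axes and strictly better on one. A generic sketch of that filter (illustrative, not generate_figures.py's code):

```python
def pareto_frontier(points):
    """Return the non-dominated (retrieval_rate, injection_rate) points.

    A point is dominated if some other point is >= on both axes and
    differs from it (hence strictly better on at least one axis).
    """
    frontier = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            frontier.append(p)
    return frontier
```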

See Also