Optimization Architecture¶
This page describes how the optimization components across both hemlock (Go) and hemlock-lab (Python) fit together to form the joint optimization pipeline.
System Overview¶
```mermaid
flowchart TB
    subgraph "hemlock-lab (Python)"
        BT["build_training_data.py<br/>Reports → Parquet"]
        RM["reward_model.py<br/>Train MLP"]
        RS["reward_server.py<br/>FastAPI :9090"]
        BO["bayesian_optimizer.py<br/>GP + EI search"]
        IT["injection_test.py<br/>Injection detection"]
        RT["retrieval_test.py<br/>Retrieval measurement"]
        PS["pareto_sweep.py<br/>Weight ablation"]
        VR["validation_runner.py<br/>A/B experiments"]
        SA["statistical_analysis.py<br/>Bootstrap CIs"]
        GF["generate_figures.py<br/>Publication plots"]
    end
    subgraph "hemlock (Go)"
        HC["hemlock batch<br/>Document generation"]
        OPT["Optimizers<br/>CEM / Genetic / Whitebox"]
        SI["score_injection.go<br/>Reward model HTTP client"]
    end
    subgraph "Infrastructure"
        CD["ChromaDB :8000"]
        OL["Ollama :11434"]
        FW["RAG Pipelines<br/>:8100–:8104"]
    end
    BT -->|training_data.parquet| RM
    RM -->|reward_model.pt| RS
    RS -.->|POST /predict-injection| SI
    SI --> OPT
    OPT --> HC
    BO -->|hemlock batch| HC
    HC -->|documents| CD
    CD --> FW
    FW --> IT
    FW --> RT
    PS -->|hemlock batch| HC
    VR -->|hemlock batch| HC
    IT -->|injection-results.json| SA
    RT -->|retrieval-results.json| SA
    PS -->|pareto-summary.json| SA
    VR -->|validation-summary.json| SA
    SA -->|statistics.json| GF
    OL --> HC
    OL --> FW
    style RS fill:#00695c,stroke:#00897b,color:#ffffff
    style OPT fill:#4a148c,stroke:#7c43bd,color:#ffffff
    style BO fill:#4a148c,stroke:#7c43bd,color:#ffffff
```
Data Flow¶
The optimization system has three main data flows:
1. Training Pipeline¶
```text
reports/**/injection-results.json
reports/**/retrieval-results.json
        │
        ▼
build_training_data.py → training_data.parquet
        │
        ▼
reward_model.py (5-fold CV, class-weighted BCE)
        │
        ▼
reward_model.pt (MLP weights + scaler)
        │
        ▼
reward_server.py (FastAPI on :9090)
```
This pipeline runs once to produce the initial trained model and is re-run whenever new experiment data becomes available.
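As a rough sketch of the first stage, flattening report JSONs into a training frame might look like the following. The report keys (`results`, `text`, `injected`) are hypothetical stand-ins, not the actual schema used by `build_training_data.py`:

```python
import json
from pathlib import Path

import pandas as pd


def build_training_frame(reports_dir: str) -> pd.DataFrame:
    """Flatten per-run injection reports into one row per tested document.

    NOTE: the report schema assumed here ('results', 'text', 'injected')
    is illustrative -- adapt the keys to the real report format.
    """
    rows = []
    for path in Path(reports_dir).glob("**/injection-results.json"):
        report = json.loads(path.read_text())
        for result in report.get("results", []):
            rows.append({
                "text": result["text"],
                "label": int(result["injected"]),  # binary target for the MLP
                "source": str(path),
            })
    return pd.DataFrame(rows)


# df = build_training_frame("reports")
# df.to_parquet("training_data.parquet")
```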
2. Optimization Loop¶
```text
bayesian_optimizer.py
  │
  ├─ Select parameters (GP + EI)
  │        │
  │        ▼
  ├─ hemlock batch --genetic --injection-weight W ...
  │        │
  │        ▼
  ├─ Ingest into ChromaDB collection
  │        │
  │        ▼
  ├─ injection_test.py → reward
  │        │
  │        ▼
  └─ Update GP with (params, reward) → next iteration
```
Each Bayesian-optimizer evaluation is a full generate-ingest-test cycle. The reward model runs inside hemlock during document generation (via the `--injection-weight` flag), while the Bayesian optimizer uses end-to-end injection test results as its objective.
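The outer loop can be sketched with scikit-learn's GP regressor and a hand-rolled expected-improvement acquisition. This is a minimal illustration, not the actual `bayesian_optimizer.py`: the real objective shells out to `hemlock batch`, ingestion, and the test scripts, which is replaced here by an arbitrary callable:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def expected_improvement(X_cand, gp, best_y, xi=0.01):
    """EI acquisition: expected amount by which each candidate beats best_y."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)  # avoid division by zero
    imp = mu - best_y - xi
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)


def optimize(objective, bounds, n_init=5, n_iter=20, seed=0):
    """Minimal GP + EI maximization over a box-bounded parameter space."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    X = rng.uniform(lo, hi, size=(n_init, len(bounds)))
    y = np.array([objective(x) for x in X])  # each call = one full cycle
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)
        # pick the best of a random candidate pool by EI
        cand = rng.uniform(lo, hi, size=(256, len(bounds)))
        nxt = cand[np.argmax(expected_improvement(cand, gp, y.max()))]
        X = np.vstack([X, nxt])
        y = np.append(y, objective(nxt))
    return X[np.argmax(y)], y.max()
```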
3. Evaluation Pipeline¶
```text
validation_runner.py / pareto_sweep.py
        │
        ▼
injection-results.json + retrieval-results.json
        │
        ▼
statistical_analysis.py → statistics.json
        │
        ▼
generate_figures.py → PDF + PNG figures
```
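The bootstrap-CI step in `statistical_analysis.py` can be illustrated with a generic percentile bootstrap. This is a sketch of the technique only; the real script's choice of statistics and its JSON handling are not shown:

```python
import numpy as np


def bootstrap_ci(values, stat=np.mean, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    boots = np.array([
        stat(rng.choice(values, size=len(values), replace=True))
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return stat(values), (lo, hi)


# e.g. hypothetical per-framework injection success rates:
# mean, (lo, hi) = bootstrap_ci([0.62, 0.71, 0.58, 0.66, 0.69])
```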
Component Interaction¶
During Document Generation¶
When hemlock runs with `--injection-weight` > 0, the scoring function in each optimizer (CEM, Genetic, Whitebox) calls `scoreInjection()` for each candidate:
```mermaid
sequenceDiagram
    participant G as Genetic Optimizer
    participant E as Ollama Embeddings
    participant R as Reward Server
    loop Each generation
        loop Each candidate in population
            G->>E: Embed candidate text
            E-->>G: Embedding vector (768-dim)
            G->>G: similarity = cosine(embedding, query)
            G->>R: POST /predict-injection
            R-->>G: {score: 0.73}
            G->>G: fitness = (1-w_inj-w_nat)*sim + w_nat*nat + w_inj*inj
        end
        G->>G: Selection + crossover + mutation
    end
```
During Bayesian Optimization¶
The Bayesian optimizer treats the entire generate-ingest-test pipeline as a black-box objective function:
```mermaid
sequenceDiagram
    participant BO as Bayesian Optimizer
    participant H as hemlock batch
    participant C as ChromaDB
    participant IT as injection_test.py
    participant RT as retrieval_test.py
    loop 50–100 evaluations
        BO->>BO: GP surrogate → EI → next params
        BO->>H: subprocess: hemlock batch --<mapped flags>
        H->>H: Generate documents (may call reward server internally)
        BO->>C: Ingest generated documents
        BO->>IT: Run injection tests
        IT-->>BO: Per-framework injection results
        BO->>RT: Run retrieval tests
        RT-->>BO: Per-framework retrieval rank
        BO->>BO: reward = 0.3×retrieval + 0.7×injection
        BO->>C: Delete collection (cleanup)
        BO->>BO: Update GP with observation
    end
```
Port Map¶
| Service | Port | Role |
|---|---|---|
| ChromaDB | 8000 | Vector store |
| LangChain | 8100 | RAG pipeline |
| LlamaIndex | 8101 | RAG pipeline |
| Unstructured | 8102 | RAG pipeline |
| Haystack | 8103 | RAG pipeline |
| ColPALI | 8104 | RAG pipeline |
| Reward Server | 9090 | Injection score prediction |
| Ollama | 11434 | LLM inference + embeddings |
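For scripts that talk to several of these services, it can be convenient to keep the port map in one place. A hypothetical helper mirroring the table above (the names and the helper itself are illustrative, not part of hemlock-lab):

```python
# Port map from the table above; keys are illustrative identifiers.
SERVICES = {
    "chromadb": 8000,
    "langchain": 8100,
    "llamaindex": 8101,
    "unstructured": 8102,
    "haystack": 8103,
    "colpali": 8104,
    "reward_server": 9090,
    "ollama": 11434,
}


def base_url(service: str, host: str = "localhost") -> str:
    """Build the base URL for a named service on the given host."""
    return f"http://{host}:{SERVICES[service]}"
```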
Dependencies Between Scripts¶
| Script | Requires Running | Requires Files |
|---|---|---|
| `build_training_data.py` | — | `reports/**/injection-results.json`, `reports/**/retrieval-results.json` |
| `reward_model.py` | — | `training_data.parquet` |
| `reward_server.py` | — | `reward_model.pt` |
| `bayesian_optimizer.py` | Docker stack, Ollama | Config JSON |
| `pareto_sweep.py` | Docker stack, Ollama, reward server | Config JSON |
| `validation_runner.py` | Docker stack, Ollama | Config JSON, optionally `best-params.json` |
| `statistical_analysis.py` | — | `*-summary.json` |
| `generate_figures.py` | — | `statistics.json` |
See Also¶
- Joint Optimization (hemlock) — scoring function details
- Services Architecture — Docker container topology
- Bayesian Optimizer — GP hyperparameter search
- Reward Model — three-stage training pipeline