# Architecture Overview
hemlock-lab is a Docker Compose validation environment that runs six containers — one ChromaDB instance and five RAG pipeline services — for testing how poisoned documents survive real-world RAG pipelines.
## System Topology
```mermaid
graph TB
    subgraph compose["Docker Compose (hemlock-net)"]
        subgraph storage["Storage Layer"]
            chromadb["chromadb<br/>:8000"]
        end
        subgraph pipelines["RAG Pipeline Containers"]
            lc["langchain-rag<br/>:8100"]
            li["llamaindex-rag<br/>:8101"]
            un["unstructured-rag<br/>:8102"]
            hy["haystack-rag<br/>:8103"]
            cp["colpali-rag<br/>:8104"]
        end
    end
    ollama["Ollama :11434<br/>(host)"]
    subgraph workstation["Your Workstation"]
        cli["docker compose / curl"]
    end
    workstation -->|localhost| compose
    lc & li & un & hy & cp --> chromadb
    lc & li & un & hy & cp -->|host.docker.internal| ollama
```
## Service Dependency Graph
Every pipeline container depends on ChromaDB for vector storage and Ollama (on the host) for embeddings + LLM inference:
```mermaid
graph LR
    A["Pipeline Container"] --> B["chromadb :8000"]
    A -->|host.docker.internal| C["Ollama :11434"]
    C --> D["nomic-embed-text"]
    C --> E["smollm2:135m"]
    B --> F["chromadb-data volume"]
```
Pipeline containers use `depends_on` gated on ChromaDB's healthcheck, and `restart: unless-stopped` for automatic recovery.
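A minimal compose fragment sketching this wiring, shown for one pipeline service. The healthcheck command and volume mount path are assumptions for illustration; the actual `docker-compose.yml` may differ:

```yaml
services:
  chromadb:
    image: chromadb/chroma:0.6.3
    ports: ["8000:8000"]
    volumes:
      - chromadb-data:/chroma/chroma   # persistent vector storage (path assumed)
    healthcheck:
      # Heartbeat endpoint/tooling assumed; adapt to what the image provides.
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/v1/heartbeat"]
      interval: 10s
      retries: 5

  langchain-rag:
    build: docker/langchain-rag
    ports: ["8100:8100"]
    restart: unless-stopped            # automatic recovery on crash
    depends_on:
      chromadb:
        condition: service_healthy     # wait for the healthcheck, not just startup
    extra_hosts:
      - "host.docker.internal:host-gateway"  # reach Ollama on the host

volumes:
  chromadb-data:
```

The `condition: service_healthy` form ensures pipelines only start once ChromaDB is actually accepting requests, not merely created.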
## Port Map
| Port | Service | Protocol | Notes |
|---|---|---|---|
| 8000 | chromadb | HTTP | Docker container |
| 11434 | Ollama | HTTP | Host process |
| 8100 | langchain-rag | HTTP | Docker container |
| 8101 | llamaindex-rag | HTTP | Docker container |
| 8102 | unstructured-rag | HTTP | Docker container |
| 8103 | haystack-rag | HTTP | Docker container |
| 8104 | colpali-rag | HTTP | Docker container |
!!! tip "Easy to remember"
    ChromaDB on 8000, Ollama on 11434. Pipelines in sequential order starting at 8100.
## Container Details
Each pipeline container:
- Builds from a Dockerfile in `docker/<framework>-rag/`
- Runs a FastAPI app on its assigned port
- Connects to ChromaDB via the `hemlock-net` Docker network
- Reaches Ollama via `host.docker.internal:11434`
- Mounts `harness/system-prompts` read-only for custom prompt support
ChromaDB uses the official `chromadb/chroma:0.6.3` image with a persistent `chromadb-data` volume.
## Design Principles
### Docker Compose, Single Command
`docker compose up -d` brings up the full stack. `docker compose down -v` tears it down and clears the data volume, enabling a clean reset cycle between test runs.
### Uniform API Contract
Every pipeline exposes the same 4 endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| `/health` | GET | Service health + version |
| `/extract` | POST | Extract text from document |
| `/ingest` | POST | Ingest document into ChromaDB |
| `/query` | POST | Run RAG query (retrieve + prompt + LLM) |
This lets the test harness iterate over all 5 frameworks with identical requests.
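A sketch of how such a harness loop might look, using only the port map and endpoint names above. The `question` payload shape is an assumption; the real harness's request schema may differ:

```python
"""Iterate all five RAG pipelines with an identical /query request."""
import json
from urllib.request import Request, urlopen

# Ports from the port map above.
PIPELINES = {
    "langchain-rag": 8100,
    "llamaindex-rag": 8101,
    "unstructured-rag": 8102,
    "haystack-rag": 8103,
    "colpali-rag": 8104,
}

def query_url(service: str) -> str:
    """Build the /query URL for a pipeline service on localhost."""
    return f"http://localhost:{PIPELINES[service]}/query"

def query_all(question: str) -> dict:
    """POST the same question to every pipeline and collect responses."""
    payload = json.dumps({"question": question}).encode()  # schema assumed
    results = {}
    for name in PIPELINES:
        req = Request(
            query_url(name),
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urlopen(req, timeout=60) as resp:
            results[name] = json.load(resp)
    return results
```

Because every framework honors the same contract, comparing how each one handles a poisoned document reduces to diffing the entries of `query_all`'s result.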
### Pinned Versions
Dependencies are pinned in Dockerfiles and docker-compose.yml for reproducible results:
| Component | Version |
|---|---|
| ChromaDB | 0.6.3 |
| LangChain | 0.3.35 |
| LlamaIndex | 0.12.33 |
| Unstructured | 0.17.2 |
| Haystack | 2.12.1 |
| Ollama model | smollm2:135m |
| Embedding model | nomic-embed-text |
### Environment-Driven Config
All runtime configuration is passed via environment variables in `docker-compose.yml`, overridable with a `.env` file:
| Variable | Default | Purpose |
|---|---|---|
| `OLLAMA_MODEL` | `smollm2:135m` | LLM for generation |
| `OLLAMA_EMBED_MODEL` | `nomic-embed-text` | Embedding model |
| `SYSTEM_PROMPT_FILE` | (empty) | Custom system prompt |
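For example, a `.env` file that swaps in a different generation model might look like this (the model tag is illustrative; any model already pulled into the host's Ollama works):

```
OLLAMA_MODEL=llama3.2:1b
OLLAMA_EMBED_MODEL=nomic-embed-text
```

Since Compose reads `.env` automatically, no edits to `docker-compose.yml` are needed between runs.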
## Next Steps
- Services — Deep dive on each service's configuration
- Inventory — Configuration reference
- Pipelines — How each RAG framework works
- Optimization Architecture — Joint optimization system (Bayesian search, reward model, Pareto analysis)
!!! note "Legacy: Proxmox Architecture"
    The original architecture used a single Ubuntu 24.04 VM (VMID 260, `ailab-rag`, 172.16.50.50) with all services installed natively via systemd. Each pipeline had its own Python venv under `/opt/hemlock-lab/`. Proxmox snapshots provided the reset mechanism. The deployment scripts are preserved in `lab-scripts/`.