Skip to content

Ingest Endpoint

Extracts text from an uploaded document, generates embeddings, and stores it in a ChromaDB collection.


Request

POST /ingest
Content-Type: multipart/form-data
Field Type Required Description
file file The document to ingest
collection string Target ChromaDB collection name

Response

Success (200)

{
  "status": "ingested",
  "collection": "test-collection",
  "chunks": 5,
  "document_id": "poisoned-001.html"
}
Field Type Description
status string Always "ingested" on success
collection string The collection name used
chunks integer Number of chunks created
document_id string Original filename used as document ID

Error (400)

{
  "error": "Missing required field: collection"
}

Error (500)

{
  "error": "ChromaDB connection failed",
  "detail": "Connection refused on port 8000"
}

How Ingestion Works

graph LR
    F["Upload file"] --> E["Extract text"]
    E --> C["Chunk text"]
    C --> V["Generate embeddings<br/>(Ollama)"]
    V --> S["Store in ChromaDB"]
  1. Extract — Same extraction logic as /extract endpoint
  2. Chunk — Split text into overlapping chunks (size varies by framework)
  3. Embed — Each chunk is embedded via Ollama (nomic-embed-text)
  4. Store — Chunks + embeddings stored in the specified ChromaDB collection

Collection Auto-Creation

If the specified collection doesn't exist, it's created automatically. If it exists, documents are added to the existing collection.


Examples

Ingest a Single Document

curl -X POST http://localhost:8100/ingest \
  -F "file=@document.html" \
  -F "collection=test-collection"

Ingest Multiple Documents

for doc in /tmp/hemlock-batch/*.html; do
  curl -s -X POST http://localhost:8100/ingest \
    -F "file=@${doc}" \
    -F "collection=test-collection"
done

Verify Ingestion

# Check collection document count via ChromaDB API
curl http://localhost:8000/api/v1/collections/test-collection/count

Chunking Strategies

Each framework chunks documents differently:

Framework Default Chunk Size Overlap
LangChain 1000 chars 200 chars
LlamaIndex 1024 tokens 20 tokens
Unstructured Element-based N/A
Haystack 500 words 50 words

Chunking affects retrieval

Smaller chunks are more precise but may lose context. Larger chunks preserve context but may dilute payload embeddings. This is why the same document can have different retrieval rankings across frameworks.


Next Steps