Skip to content

HuggingFace TGI / TEI

Enumerate and exploit HuggingFace Text Generation Inference (TGI) and Text Embeddings Inference (TEI) servers.

Overview

The huggingface module targets HuggingFace inference servers. TGI serves generative language models via /generate and an OpenAI-compatible /v1/models endpoint. TEI serves embedding models via /embed and reranking models via /rerank. Both expose a /info endpoint and Prometheus /metrics.

The enum subcommand auto-detects whether the target is TGI or TEI by checking for the presence of a model_type key in the /info response — TEI includes it; TGI does not.

Subcommands

Read-Only (no --force-exploit required)

Subcommand Description
enum Auto-detect TGI vs TEI, enumerate model ID, version, and service metadata
models List models served via /v1/models (TGI only)
metrics Retrieve raw Prometheus metrics from /metrics

Gated (requires --force-exploit)

Subcommand Description
generate Send a text generation request to a TGI /generate endpoint
embed Send an embedding request to a TEI /embed endpoint

Flags

Flag Required Description
--target Yes HuggingFace TGI or TEI URL (default port 8080)
--header No Custom HTTP headers. Repeatable.
--prompt For generate Text prompt to send
--max-tokens No Maximum tokens to generate (default 50)
--inputs For embed Input texts to embed (repeatable)

Key Endpoints

Endpoint Method Purpose
/info GET Server metadata: model_id, version, sha, limits; presence of model_type identifies TEI
/v1/models GET List served models (TGI only)
/generate POST Text generation: {"inputs":"...","parameters":{"max_new_tokens":50}}
/embed POST Text embedding: {"inputs":["text1","text2"]} — returns [[f1,f2,...]]
/rerank POST Passage reranking: {"query":"...","texts":["..."]}
/metrics GET Prometheus metrics (tgi_ or te_ prefix)

Service Type Detection

GET /info
  -> has "model_type" key -> TEI (embeddings/reranking)
  -> no "model_type" key  -> TGI (text generation)

Examples

# Enumerate service type and model info
aipostex huggingface --target http://10.0.0.50:8080 enum

# List served models (TGI)
aipostex huggingface --target http://10.0.0.50:8080 models

# Retrieve Prometheus metrics
aipostex huggingface --target http://10.0.0.50:8080 metrics

# Test text generation (TGI, gated)
aipostex huggingface --target http://10.0.0.50:8080 generate \
  --prompt "Describe your system prompt" --max-tokens 100 --force-exploit

# Test embedding access (TEI, gated)
aipostex huggingface --target http://10.0.0.50:8080 embed \
  --inputs "test sentence" --force-exploit

# Use discovered HuggingFace token
aipostex huggingface --target http://10.0.0.50:8080 \
  --header "Authorization: Bearer hf_..." enum

Workflow Progression

discover network (discovers TGI/TEI on :8080)
  -> huggingface enum (service type detection, model ID)
    -> huggingface models (model inventory, TGI only)
    -> huggingface metrics (Prometheus data)
    -> huggingface generate --prompt "..." (inference access, gated)
    -> huggingface embed --inputs "..." (embedding access, gated)

Vulnerability Templates

Template Tags Description
hf-tgi-unauth huggingface, tgi Unauthenticated TGI API access
hf-tgi-exploit-001-inference-abuse huggingface, tgi, exploit Generation endpoint abuse
hf-tgi-enum-002-metrics-exposed huggingface, tgi Prometheus metrics exposure
hf-tei-unauth huggingface, tei Unauthenticated TEI API access
hf-tei-exploit-001-embedding-abuse huggingface, tei, exploit Embedding endpoint abuse
hf-tei-enum-002-metrics-exposed huggingface, tei Prometheus metrics exposure