professor-moody/aipostex

Triton Inference Server

Enumerate and exploit NVIDIA Triton Inference Server instances.

Overview

The `triton` module targets the Triton Inference Server HTTP/REST API, which implements the KServe (formerly KFServing) v2 inference protocol. It discovers server metadata, lists loaded models and their configurations, probes the shared memory status endpoints associated with the IPC vulnerability chain (CVE-2025-23319, CVE-2025-23320, CVE-2025-23334), and tests inference and model lifecycle operations.

Subcommands

Read-Only (no `--force-exploit` required)

| Subcommand | Description |
| --- | --- |
| `enum` | Server metadata, health status, and extensions |
| `models` | List all loaded models with detailed metadata |
| `model-config` | Detailed model configuration (instance groups, scheduling, optimization) |
| `shm-probe` | Probe shared memory regions for the IPC vulnerability chain (CVE-2025-23319/23320/23334) |

Gated (requires `--force-exploit`)

| Subcommand | Description |
| --- | --- |
| `infer` | Send an inference request to a model |
| `model-load` | Load a model from the repository (proves model injection surface) |
| `model-unload` | Unload a model (proves destructive model lifecycle access) |
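The gated lifecycle subcommands correspond to state-changing POST requests against the repository endpoints listed under Key Endpoints. The following is a minimal Python sketch of how such a request could be assembled and refused unless the operator opts in. It is not the aipostex implementation: `lifecycle_request`, `GATED_ACTIONS`, and the `force_exploit` parameter are hypothetical names chosen for illustration, and the empty JSON body is an assumption.

```python
import json
from urllib.parse import quote

# Hypothetical mapping from gated subcommand to the repository action it triggers.
GATED_ACTIONS = {"model-load": "load", "model-unload": "unload"}


def lifecycle_request(target: str, subcommand: str, model: str,
                      force_exploit: bool = False) -> tuple:
    """Return (method, url, body) for a model lifecycle call.

    Raises PermissionError when the caller has not opted in, mirroring
    the idea of a --force-exploit gate.
    """
    if subcommand not in GATED_ACTIONS:
        raise ValueError(f"not a lifecycle subcommand: {subcommand}")
    if not force_exploit:
        raise PermissionError(f"{subcommand} is gated; pass --force-exploit")
    # Percent-encode the model name so it cannot alter the URL path.
    path = f"/v2/repository/models/{quote(model, safe='')}/{GATED_ACTIONS[subcommand]}"
    return "POST", target.rstrip("/") + path, json.dumps({})
```

Building the request separately from sending it keeps the gate check testable without touching a live server.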

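Flags

| Flag | Required | Description |
| --- | --- | --- |
| `--target` | Yes | Triton HTTP API URL (Triton's default HTTP port is 8000) |
| `--header` | No | Custom HTTP header. Repeatable. |
| `--model` | For `model-config`, `infer`, `model-load`, `model-unload` | Model name |
| `--payload` | For `infer` | JSON inference payload |
| `--force-exploit` | For `infer`, `model-load`, `model-unload` | Enables the gated subcommands |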

Key Endpoints

| Endpoint | Method | Purpose |
| --- | --- | --- |
| `/v2` | GET | Server metadata (name, version, extensions) |
| `/v2/health/ready` | GET | Readiness probe |
| `/v2/health/live` | GET | Liveness probe |
| `/v2/models` | GET | List all loaded models |
| `/v2/models/<name>` | GET | Model metadata (inputs, outputs, platform) |
| `/v2/models/<name>/config` | GET | Detailed model configuration |
| `/v2/models/<name>/infer` | POST | Model inference |
| `/v2/repository/index` | POST | Model repository listing |
| `/v2/repository/models/<name>/load` | POST | Load model from repository |
| `/v2/repository/models/<name>/unload` | POST | Unload model |
| `/v2/systemsharedmemory/status` | GET | System shared memory regions |
| `/v2/cudasharedmemory/status` | GET | CUDA shared memory regions |
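Several of the endpoints above can be queried with any HTTP client. The following is a minimal Python sketch of that kind of enumeration using only the standard library. It is an illustration, not how aipostex is implemented: `build_url`, `http_fetch`, `enumerate_server`, and the injectable `fetch` parameter are hypothetical names, and sending `{}` as the POST body for the repository index is an assumption.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional

# A subset of the endpoints from the table above that do not change server state.
READ_ONLY_ENDPOINTS = {
    "metadata": ("GET", "/v2"),
    "ready": ("GET", "/v2/health/ready"),
    "live": ("GET", "/v2/health/live"),
    "repository_index": ("POST", "/v2/repository/index"),
}


def build_url(target: str, path: str) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return target.rstrip("/") + "/" + path.lstrip("/")


def http_fetch(method: str, url: str, headers: Optional[dict] = None) -> tuple:
    """Send one request and return (status_code, body_text); status 0 means no response."""
    hdrs = dict(headers or {})
    data = None
    if method == "POST":
        data = b"{}"  # assumed empty JSON body
        hdrs.setdefault("Content-Type", "application/json")
    req = urllib.request.Request(url, data=data, method=method, headers=hdrs)
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")
    except urllib.error.URLError as err:
        return 0, str(err.reason)


def enumerate_server(target: str, headers: Optional[dict] = None,
                     fetch: Callable = http_fetch) -> dict:
    """Query each read-only endpoint and collect status plus parsed JSON (if any)."""
    results = {}
    for name, (method, path) in READ_ONLY_ENDPOINTS.items():
        status, body = fetch(method, build_url(target, path), headers)
        try:
            parsed = json.loads(body) if body else None
        except json.JSONDecodeError:
            parsed = body  # keep non-JSON text as-is
        results[name] = {"status": status, "body": parsed}
    return results
```

Calling `enumerate_server("http://127.0.0.1:8000")` would issue the real requests; passing a stub as `fetch` allows the parsing logic to be exercised offline.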

SHM Probe (IPC Vulnerability Chain)

The `shm-probe` subcommand checks for exposure to the IPC vulnerability chain in Triton's Python backend reported by Wiz:

CVE-2025-23319 -- out-of-bounds write in the Python backend
CVE-2025-23320 -- shared memory limit exceeded by an oversized request, which can disclose the name of the backend's internal IPC shared memory region
CVE-2025-23334 -- out-of-bounds read in the Python backend

If the shared memory status endpoints return region data (names, keys, offsets, byte sizes), the shared memory API, and with it the IPC attack surface, is reachable from the client's position.
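The check described above can be sketched as a small classifier over a status endpoint's response. This is an illustrative Python sketch under stated assumptions, not the module's actual logic: `assess_shm_status` and `REGION_FIELDS` are hypothetical names, and the response is assumed to be a JSON array of region objects carrying fields such as `name`, `key`, `offset`, and `byte_size` (or `device_id` for CUDA regions).

```python
import json

# Fields a region entry is assumed to carry.
REGION_FIELDS = ("name", "key", "offset", "byte_size", "device_id")


def assess_shm_status(status_code: int, body: str) -> dict:
    """Classify a shared memory status response.

    reachable: the endpoint answered with HTTP 200.
    exposed:   at least one registered region was disclosed.
    """
    finding = {"reachable": status_code == 200, "regions": [], "exposed": False}
    if not finding["reachable"]:
        return finding
    try:
        parsed = json.loads(body) if body.strip() else []
    except json.JSONDecodeError:
        return finding
    if not isinstance(parsed, list):
        return finding  # unexpected shape; report reachable only
    for region in parsed:
        if isinstance(region, dict) and "name" in region:
            # Keep only the known fields that are actually present.
            finding["regions"].append(
                {k: region[k] for k in REGION_FIELDS if k in region}
            )
    finding["exposed"] = bool(finding["regions"])
    return finding
```

Separating `reachable` from `exposed` distinguishes a server that answers with an empty region list from one that discloses registered regions.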

Examples

```bash
# Enumerate server metadata
./aipostex triton --target http://127.0.0.1:8000 enum

# List loaded models
./aipostex triton --target http://127.0.0.1:8000 models

# Get detailed model config
./aipostex triton --target http://127.0.0.1:8000 model-config --model resnet50

# Probe shared memory (IPC vuln chain)
./aipostex triton --target http://127.0.0.1:8000 shm-probe

# Test inference (gated)
./aipostex triton --target http://127.0.0.1:8000 infer \
  --model resnet50 --payload '{"inputs":[]}' --force-exploit

# Load a model from repository (gated)
./aipostex triton --target http://127.0.0.1:8000 model-load \
  --model test --force-exploit
```
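The inference example above sends an empty `inputs` array, which is enough to test reachability but not to run a model. A fuller v2 request lists each input tensor with its name, shape, datatype, and flattened data. The Python sketch below builds such a payload for `--payload`; `build_infer_payload` is a hypothetical helper, and the tensor name `input__0` in the usage note is an assumption, since real input names come from the model's metadata.

```python
import json
from math import prod
from typing import Sequence


def build_infer_payload(input_name: str, shape: Sequence[int],
                        datatype: str, data: Sequence) -> str:
    """Return a JSON string describing a single-input v2 inference request.

    `data` is the tensor flattened in row-major order, so its length
    must equal the product of the dimensions in `shape`.
    """
    expected = prod(shape)
    if len(data) != expected:
        raise ValueError(f"shape {list(shape)} needs {expected} elements, got {len(data)}")
    request = {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ]
    }
    return json.dumps(request)
```

For example, `build_infer_payload("input__0", [1, 3], "FP32", [0.1, 0.2, 0.3])` yields a string that can be passed to `--payload`; the correct name, shape, and datatype for a given model can be read from `model-config` output.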

Workflow Progression

```text
discover network (discovers Triton on :8000)
  -> triton enum (server metadata, health)
    -> triton models (loaded model inventory)
    -> triton model-config --model <name> (detailed config)
    -> triton shm-probe (IPC vulnerability assessment)
    -> triton infer --model <name> (inference test, gated)
    -> triton model-load --model <name> (model injection, gated)
```
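The progression above can be sketched as a small planner that emits the command lines in order and appends `--force-exploit` only to the gated steps. This is a hypothetical Python helper for scripting around the CLI, not part of aipostex: `plan_workflow` and its parameters are illustrative, and it uses only the subcommands and flags documented on this page.

```python
READ_ONLY_STEPS = ["enum", "models", "model-config", "shm-probe"]
GATED_STEPS = ["infer", "model-load"]
NEEDS_MODEL = {"model-config", "infer", "model-load", "model-unload"}


def plan_workflow(target: str, model: str, allow_gated: bool = False,
                  payload: str = '{"inputs":[]}',
                  binary: str = "./aipostex") -> list:
    """Return one argv list per workflow step, read-only steps first."""
    steps = list(READ_ONLY_STEPS)
    if allow_gated:
        steps += GATED_STEPS
    plan = []
    for sub in steps:
        argv = [binary, "triton", "--target", target, sub]
        if sub in NEEDS_MODEL:
            argv += ["--model", model]
        if sub == "infer":
            argv += ["--payload", payload]
        if sub in GATED_STEPS:
            argv.append("--force-exploit")  # only gated steps carry the flag
        plan.append(argv)
    return plan
```

Each returned list can be handed to `subprocess.run(argv)` as-is. Keeping gated steps behind `allow_gated` means the default plan stays read-only.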