Triton Inference Server
Enumerate and exploit NVIDIA Triton Inference Server instances.
Overview
The triton module targets the Triton Inference Server HTTP/REST API, which implements the KServe v2 inference protocol (formerly KFServing v2). It retrieves server metadata, lists loaded models with their configurations, probes the shared memory endpoints for exposure to the IPC vulnerability chain (CVE-2025-23319/23320/23334), and tests inference and model lifecycle operations.
Subcommands
Read-Only (no --force-exploit required)
| Subcommand | Description |
|---|---|
| `enum` | Server metadata, health status, and extensions |
| `models` | List all loaded models with detailed metadata |
| `model-config` | Detailed model configuration (instance groups, scheduling, optimization) |
| `shm-probe` | Probe shared memory regions for the IPC vulnerability chain (CVE-2025-23319/23320/23334) |
Gated (requires --force-exploit)
| Subcommand | Description |
|---|---|
| `infer` | Send an inference request to a model |
| `model-load` | Load a model from the repository (proves model injection surface) |
| `model-unload` | Unload a model (proves destructive model lifecycle access) |
Flags
| Flag | Required | Description |
|---|---|---|
| `--target` | Yes | Triton HTTP API URL (default port 8000) |
| `--header` | No | Custom HTTP headers. Repeatable. |
| `--model` | For `model-config`, `infer`, `model-load`, `model-unload` | Model name |
| `--payload` | For `infer` | JSON inference payload |
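The flag rules in the tables above can be summarised in a few lines of Python. This is only an illustrative sketch of the documented behaviour, not the tool's actual implementation: the parser name `triton-sketch`, the `validate` helper, and the error strings are hypothetical.

```python
import argparse

# Subcommand groups as documented in the tables above.
READ_ONLY = {"enum", "models", "model-config", "shm-probe"}
GATED = {"infer", "model-load", "model-unload"}
NEEDS_MODEL = {"model-config", "infer", "model-load", "model-unload"}


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="triton-sketch")
    parser.add_argument("subcommand", choices=sorted(READ_ONLY | GATED))
    parser.add_argument("--target", required=True)
    # action="append" makes the flag repeatable.
    parser.add_argument("--header", action="append", default=[])
    parser.add_argument("--model")
    parser.add_argument("--payload")
    parser.add_argument("--force-exploit", action="store_true")
    return parser


def validate(args):
    """Return a list of rule violations; an empty list means the call is allowed."""
    problems = []
    if args.subcommand in GATED and not args.force_exploit:
        problems.append(f"{args.subcommand} requires --force-exploit")
    if args.subcommand in NEEDS_MODEL and not args.model:
        problems.append(f"{args.subcommand} requires --model")
    if args.subcommand == "infer" and not args.payload:
        problems.append("infer requires --payload")
    return problems
```

Keeping the gate in a single `validate` step means a gated subcommand is rejected before any request is sent, which matches the intent of `--force-exploit` as an explicit opt-in.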
Key Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| `/v2` | GET | Server metadata (name, version, extensions) |
| `/v2/health/ready` | GET | Readiness probe |
| `/v2/health/live` | GET | Liveness probe |
| `/v2/models` | GET | List all loaded models |
| `/v2/models/<name>` | GET | Model metadata (inputs, outputs, platform) |
| `/v2/models/<name>/config` | GET | Detailed model configuration |
| `/v2/models/<name>/infer` | POST | Model inference |
| `/v2/repository/index` | POST | Model repository listing |
| `/v2/repository/models/<name>/load` | POST | Load model from repository |
| `/v2/repository/models/<name>/unload` | POST | Unload model |
| `/v2/systemsharedmemory/status` | GET | System shared memory regions |
| `/v2/cudasharedmemory/status` | GET | CUDA shared memory regions |
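To check a finding by hand, the read-only endpoints can be queried with nothing but the Python standard library. The sketch below covers a subset of the table and is not taken from the tool: the helper names `build_request` and `fetch` are hypothetical, and sending an empty JSON object as the POST body is an assumption.

```python
import urllib.error
import urllib.request

# A subset of the read-only endpoints from the table above.
READ_ONLY_ENDPOINTS = [
    ("GET", "/v2"),
    ("GET", "/v2/health/live"),
    ("GET", "/v2/health/ready"),
    ("POST", "/v2/repository/index"),
    ("GET", "/v2/systemsharedmemory/status"),
    ("GET", "/v2/cudasharedmemory/status"),
]


def build_request(target, method, path, headers=None):
    """Build (but do not send) a request for one endpoint."""
    url = target.rstrip("/") + path
    # Assumption: POST endpoints accept an empty JSON object as the body.
    data = b"{}" if method == "POST" else None
    req = urllib.request.Request(
        url, data=data, method=method, headers=dict(headers or {})
    )
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def fetch(req, timeout=5.0):
    """Send the request and return (status, body), including for HTTP errors."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()
```

For example, `fetch(build_request("http://127.0.0.1:8000", "GET", "/v2"))` would return the status code and raw body of the server metadata endpoint when a server is listening on that address.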
SHM Probe (IPC Vulnerability Chain)
The shm-probe subcommand checks for exposure to the IPC vulnerability chain that Wiz discovered in Triton's Python backend:
- CVE-2025-23319 -- out-of-bounds write in the Python backend
- CVE-2025-23320 -- an oversized request exceeds the shared memory limit, leading to information disclosure
- CVE-2025-23334 -- out-of-bounds read in the Python backend
If the shared memory status endpoints expose region data (names, keys, offsets, byte sizes), the shared memory API, and with it the IPC attack surface this chain relies on, is reachable.
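The exposure check described above amounts to parsing the status response and looking for region entries. The following is a minimal sketch, assuming the status endpoints return a JSON array of region objects that each carry a `name` field, as described in Triton's shared memory extension. The function name `exposed_regions` is hypothetical and is not the tool's own code.

```python
import json


def exposed_regions(body):
    """Return the region objects found in a shared memory status response.

    Assumption: the response body is a JSON array of objects, each with at
    least a "name" field. Anything else (error objects, non-JSON bodies,
    empty arrays) is treated as "no regions exposed".
    """
    try:
        parsed = json.loads(body)
    except (TypeError, ValueError):
        return []
    if not isinstance(parsed, list):
        return []
    return [r for r in parsed if isinstance(r, dict) and "name" in r]
```

An empty result does not prove that the server is patched. It only shows that no regions were registered or visible at the time of the request.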
Examples
# Enumerate server metadata
./aipostex triton --target http://127.0.0.1:8000 enum
# List loaded models
./aipostex triton --target http://127.0.0.1:8000 models
# Get detailed model config
./aipostex triton --target http://127.0.0.1:8000 model-config --model resnet50
# Probe shared memory (IPC vuln chain)
./aipostex triton --target http://127.0.0.1:8000 shm-probe
# Test inference (gated)
./aipostex triton --target http://127.0.0.1:8000 infer \
--model resnet50 --payload '{"inputs":[]}' --force-exploit
# Load a model from repository (gated)
./aipostex triton --target http://127.0.0.1:8000 model-load \
--model test --force-exploit
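The infer example above sends an empty inputs list, which only shows that the endpoint is reachable. A request that a model can actually execute needs one entry per input tensor, with the name, shape, and datatype reported by `/v2/models/<name>`. The sketch below builds such a payload in the KServe v2 JSON format. The helper name `build_infer_payload` and the tensor name `INPUT0` in the usage note are placeholders, not values from a real model.

```python
import json
from math import prod


def build_infer_payload(input_name, shape, datatype, data):
    """Build a v2 inference payload for a single input tensor.

    `data` is the tensor flattened in row-major order; its length must
    equal the product of the dimensions in `shape`.
    """
    expected = prod(shape)
    if len(data) != expected:
        raise ValueError(f"expected {expected} elements, got {len(data)}")
    return json.dumps({
        "inputs": [{
            "name": input_name,
            "shape": list(shape),
            "datatype": datatype,
            "data": list(data),
        }]
    })
```

The resulting string can be passed to `--payload`, for instance `build_infer_payload("INPUT0", [1, 4], "FP32", [0.1, 0.2, 0.3, 0.4])` for a hypothetical model with one FP32 input of shape `[1, 4]`.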
Workflow Progression
discover network (discovers Triton on :8000)
-> triton enum (server metadata, health)
-> triton models (loaded model inventory)
-> triton model-config --model <name> (detailed config)
-> triton shm-probe (IPC vulnerability assessment)
-> triton infer --model <name> (inference test, gated)
-> triton model-load --model <name> (model injection, gated)