BentoML

Enumerate and exploit BentoML model serving services.

Overview

The bentoml module targets BentoML services that expose REST APIs for model inference. It discovers service metadata, parses OpenAPI specs for prediction endpoints, tests inference access, and extracts Prometheus metrics.

Subcommands

Read-Only (no --force-exploit required)

| Subcommand | Description |
|------------|-------------|
| `enum` | Enumerate service metadata, health, and API routes |
| `routes` | Parse OpenAPI spec for all prediction endpoints with input schemas |
| `metrics` | Retrieve Prometheus metrics (request counts, latency, model performance) |
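
To illustrate what parsing an OpenAPI spec for prediction endpoints can involve, here is a minimal Python sketch. It is not the module's actual implementation: the function name `list_post_routes` and the hand-written `SAMPLE_SPEC` are hypothetical, and the sketch assumes prediction routes appear as POST operations whose input schema is declared under `requestBody.content`. It does not resolve `$ref` pointers.

```python
from typing import Any


def list_post_routes(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one entry per POST operation found under the spec's `paths`."""
    routes = []
    for path, operations in sorted(spec.get("paths", {}).items()):
        post = operations.get("post")
        if post is None:
            continue  # GET-only paths such as a health check are skipped
        # When declared, the input schema sits under
        # requestBody.content.<media type>.schema
        content = post.get("requestBody", {}).get("content", {})
        schemas = {media: body.get("schema", {}) for media, body in content.items()}
        routes.append(
            {
                "path": path,
                "summary": post.get("summary", ""),
                "input_schemas": schemas,
            }
        )
    return routes


# Hypothetical spec fragment, written by hand purely for illustration.
SAMPLE_SPEC = {
    "openapi": "3.0.2",
    "paths": {
        "/healthz": {"get": {"summary": "health"}},
        "/predict": {
            "post": {
                "summary": "predict",
                "requestBody": {
                    "content": {"application/json": {"schema": {"type": "object"}}}
                },
            }
        },
    },
}

for route in list_post_routes(SAMPLE_SPEC):
    print(route["path"], route["input_schemas"])
```

A real spec would typically be loaded with `json.loads` from the body returned by the OpenAPI endpoint before being passed to a function like this.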

Gated (requires --force-exploit)

| Subcommand | Description |
|------------|-------------|
| `predict` | Send a prediction request to test inference access |

Flags

| Flag | Required | Description |
|------|----------|-------------|
| `--target` | Yes | BentoML service URL (default port 3000) |
| `--header` | No | Custom HTTP headers. Repeatable. |
| `--endpoint` | For `predict` | Prediction endpoint path (default: `/`) |
| `--payload` | For `predict` | JSON payload for prediction |
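
As a rough picture of how these flags could combine into a single prediction request, the Python sketch below builds (but does not send) a POST request. It is an illustration under assumptions, not the tool's code: the helper names `parse_headers` and `build_predict_request` are hypothetical, and the `Name: value` form for each `--header` value is assumed rather than documented here.

```python
import json
import urllib.request
from urllib.parse import urljoin


def parse_headers(raw_headers):
    """Turn repeatable 'Name: value' strings into a dict (format is an assumption)."""
    headers = {}
    for raw in raw_headers:
        name, sep, value = raw.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"expected 'Name: value', got {raw!r}")
        headers[name.strip()] = value.strip()
    return headers


def build_predict_request(target, endpoint="/", payload="{}", raw_headers=()):
    """Build a POST request object from target, endpoint, payload, and headers."""
    json.loads(payload)  # fail early if the payload is not valid JSON
    # Join the base URL and endpoint path without doubling or dropping slashes.
    url = urljoin(target.rstrip("/") + "/", endpoint.lstrip("/"))
    headers = {"Content-Type": "application/json", **parse_headers(raw_headers)}
    return urllib.request.Request(
        url, data=payload.encode("utf-8"), headers=headers, method="POST"
    )


request = build_predict_request(
    "http://127.0.0.1:3000",
    endpoint="/predict",
    payload='{"input": "test"}',
    raw_headers=["Authorization: Bearer abc"],
)
print(request.get_method(), request.full_url)
```

Validating the payload before building the request keeps malformed JSON from ever reaching the service, which makes a failed inference test easier to interpret.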

Examples

# Enumerate service metadata
./aipostex bentoml --target http://127.0.0.1:3000 enum

# List all prediction routes from OpenAPI spec
./aipostex bentoml --target http://127.0.0.1:3000 routes

# Extract Prometheus metrics
./aipostex bentoml --target http://127.0.0.1:3000 metrics

# Test inference access (gated)
./aipostex bentoml --target http://127.0.0.1:3000 predict \
  --endpoint /predict --payload '{"input": "test"}' --force-exploit

Key Endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/` | GET | Service metadata (name, version) |
| `/healthz` | GET | Health check |
| `/docs.json` | GET | OpenAPI specification |
| `/metrics` | GET | Prometheus metrics |
| `/<endpoint>` | POST | Prediction endpoints (from OpenAPI spec) |
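
The `/metrics` endpoint returns data in the Prometheus text exposition format. The Python sketch below shows one simple way such output could be reduced to name, label, and value triples. It is a simplified illustration rather than the module's parser: `parse_metrics` is a hypothetical helper, the metric names in `SAMPLE_METRICS` are invented, and the regular expression covers only the common `name{labels} value [timestamp]` line shape.

```python
import re

# name, optional {labels}, value, optional integer timestamp
SAMPLE_LINE = re.compile(
    r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$"
)


def parse_metrics(text):
    """Return (name, labels, value) tuples for each sample line in the text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified pattern does not cover
        name, labels, value = match.groups()
        samples.append((name, labels or "", float(value)))
    return samples


# Invented metric names, used only to exercise the parser.
SAMPLE_METRICS = """\
# HELP example_request_total Total requests handled.
# TYPE example_request_total counter
example_request_total{endpoint="/predict",status="200"} 12
# TYPE example_request_in_progress gauge
example_request_in_progress 0
"""

for name, labels, value in parse_metrics(SAMPLE_METRICS):
    print(name, labels, value)
```

For anything beyond a quick look, a maintained parser for the exposition format is likely a better choice than a hand-rolled regular expression.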

Workflow Progression

discover network (discovers BentoML on :3000)
  -> bentoml enum (service metadata, routes)
    -> bentoml routes (detailed endpoint discovery)
    -> bentoml metrics (operational data)
    -> bentoml predict --endpoint <route> (inference test, gated)