Model scan (supply chain)

The `model-scan` command inspects local model artifacts (.pkl, .pickle, .pt, .pth, .bin, .onnx, .safetensors, .gguf, .pb, .h5, .keras) for deserialization and format risks. Like the other modules, it reports results as standard `report.Finding` JSON.

Usage

aipostex model-scan --path /path/to/model.pt --format json
aipostex model-scan --path /data/models/ --format json
aipostex model-scan --path /data/models/ --max-file-size-mb 0 --exclude custom_vendor
aipostex model-scan --path ./weights.bin --hash sha256:<hex>
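The last command expects a pinned digest in `sha256:<hex>` form. As a minimal sketch of how such a value could be produced when a trusted copy of the file is first obtained, the Python below streams a file through SHA-256. The helper names `sha256_pin` and `matches_pin` are hypothetical and are not part of aipostex. Lowercasing the expected value before comparing is also an assumption about how a pin might be normalized.

```python
import hashlib
from pathlib import Path


def sha256_pin(path, chunk_size=1 << 20):
    """Return the file's SHA-256 digest in the sha256:<hex> form used by --hash."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so multi-gigabyte weight files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_pin(path, expected):
    """Compare a file against a pinned sha256:<hex> value.

    Assumption: the pin is normalized by stripping whitespace and lowercasing.
    """
    return sha256_pin(path) == expected.strip().lower()
```

The value returned by `sha256_pin("./weights.bin")` can be stored alongside the model reference and later passed to `--hash`.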

Flags

Flag	Description
`--path`	File or directory (required).
`--hash`	For a single file, compare its SHA-256 digest to the expected `sha256:<hex>` value (tamper check).
`--max-file-size-mb`	Directory scans only: skip files larger than this (default `100`). Use `0` for no limit.
`--exclude`	Extra directory base names to skip during directory walks (merged with built-in vendor paths).
`--no-default-excludes`	Do not skip `.git`, `node_modules`, `venv`, etc.; only `--exclude` entries apply.
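To make the interaction between the directory-scan flags concrete, here is a rough Python sketch of a walk that honors them. It is an illustration under stated assumptions, not the logic in `pkg/modelscan`. The function `plan_scan` is hypothetical. The default exclude set contains only the three names listed above because the full built-in list is not given. Filtering by extension during the walk and treating one megabyte as 1024 × 1024 bytes are guesses.

```python
import os

# Only these three names appear in the flag description; the tool's real
# built-in list is longer and is not reproduced here (assumption).
ASSUMED_DEFAULT_EXCLUDES = {".git", "node_modules", "venv"}

# Extensions listed in the introduction of this page.
MODEL_EXTENSIONS = {".pkl", ".pickle", ".pt", ".pth", ".bin", ".onnx",
                    ".safetensors", ".gguf", ".pb", ".h5", ".keras"}


def plan_scan(root, max_file_size_mb=100, exclude=(), no_default_excludes=False):
    """Return (to_scan, skipped_large) lists of model files under root."""
    skip_dirs = set(exclude)
    if not no_default_excludes:
        skip_dirs |= ASSUMED_DEFAULT_EXCLUDES
    # Assumption: 1 MB = 1024 * 1024 bytes. A limit of 0 disables the check.
    limit_bytes = max_file_size_mb * 1024 * 1024
    to_scan, skipped_large = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded base names.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() not in MODEL_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            if limit_bytes and os.path.getsize(path) > limit_bytes:
                skipped_large.append(path)  # would map to skipped-large-file
            else:
                to_scan.append(path)
    return to_scan, skipped_large
```

In this sketch, `--exclude` entries always apply, while `--no-default-excludes` only removes the built-in names from the skip set.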

Risk types (examples)

Risk type	Severity	Meaning
`pickle-deserialization`	critical	Pickle can execute code on load.
`pytorch-pickle`	critical	`torch.save` ZIP+pickle checkpoint; `torch.load()` unpickles arbitrary objects unless `weights_only=True` (the default only since PyTorch 2.6).
`pickle-opcode-*`	critical	Dangerous opcode observed in scanned bytes.
`tf-python-op`	critical	TensorFlow SavedModel `.pb` containing a Python function op — arbitrary code execution on load.
`keras-lambda-layer`	high	`.h5` or `.keras` model with a Lambda layer — serialized Python source executed on load.
`onnx-custom-op`	high	ONNX model references a custom operator domain — may load untrusted native code.
`onnx-external-data`	medium	ONNX model references external tensor data files — path traversal / supply-chain risk.
`model-format`	info	ONNX, SafeTensors, GGUF, `.pb` (no Python ops), `.h5`/`.keras` (no Lambda) — note format / provenance.
`skipped-large-file`	info	File over `--max-file-size-mb` during directory scan.
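The `pickle-opcode-*` findings can be understood as a static pass over the pickle stream. The sketch below uses Python's standard `pickletools.genops`, which parses opcodes without importing or calling anything. The opcode set and the function name `find_pickle_opcodes` are illustrative assumptions. The rules actually used by the tool are not documented here.

```python
import pickletools

# Opcodes that import names or call objects during unpickling.
# This set is an illustrative assumption, not the tool's rule list.
SUSPICIOUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
                      "NEWOBJ", "NEWOBJ_EX"}


def find_pickle_opcodes(data):
    """Return the sorted names of suspicious opcodes seen in pickle bytes."""
    found = set()
    try:
        # genops only decodes the stream; nothing is imported or executed.
        for opcode, _arg, _pos in pickletools.genops(data):
            if opcode.name in SUSPICIOUS_OPCODES:
                found.add(opcode.name)
    except Exception:
        # Truncated or non-pickle input: keep whatever was seen before the error.
        pass
    return sorted(found)
```

These opcodes also appear in benign pickles of ordinary class instances, so a hit indicates that loading the file can import and call code, not that the file is known to be malicious.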

Implementation

The scanning logic lives in `pkg/modelscan`; the CLI entry point is in `cmd/aipostex/model_scan.go`.