Injection vs. Model Scale
Page embargoed pending paper publication
The detailed analysis of how prompt-injection rates vary across model scale (7B → 72B), framework adaptation, and cross-model mismatch is part of an ongoing research paper currently under peer review. The full results — including paired-replay claim-grade evidence across the scale ladder — will be published in the paper and re-summarized on this page after publication.
Until then, this page holds a brief qualitative summary only. The numerical effect sizes that previously appeared here came from historical exploratory sweeps (pre-fix epoch, April 2-12, 2026) that the paper does not cite as evidence; they have been moved out of the public repository while the paper is under review.
What this page will cover (after publication)
- Baseline injection rates across the scale ladder (7B BF16 → 32B BF16 → 72B AWQ-4bit → 72B FP8-dynamic)
- How transfer of 7B-tuned Bayesian-optimized parameters behaves up the scale curve
- Per-framework variance at each scale rung
- Cross-family generalization probe (Llama 3.1 8B)
Qualitative observations that hold
The following qualitative observations are paper-track findings; specific magnitudes are deferred to the paper:
- Scale matters but is not protective. Larger models do not uniformly resist corpus-poisoning attacks; the relationship is non-monotonic across categories.
- Framework choice is consequential. The four canonical RAG frameworks (LangChain, LlamaIndex, Unstructured, Haystack) differ measurably in how much they amplify or suppress poisoned-document influence on the model's output.
- Retrieval and injection decouple. Embedding-similarity optimization that improves retrieval ranking does not consistently translate to improved end-to-end injection success.
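If you collect your own per-category injection rates, the non-monotonicity observation above can be checked mechanically. The following is a minimal sketch, not the paper's analysis: the function names are hypothetical, and it assumes you supply, for each attack category, a list of rates ordered from the smallest model to the largest.

```python
from typing import Mapping, Sequence


def is_monotonic(rates: Sequence[float]) -> bool:
    """Return True if the rates never increase or never decrease along the sequence."""
    pairs = list(zip(rates, rates[1:]))
    non_increasing = all(a >= b for a, b in pairs)
    non_decreasing = all(a <= b for a, b in pairs)
    return non_increasing or non_decreasing


def non_monotonic_categories(by_category: Mapping[str, Sequence[float]]) -> list[str]:
    """List the categories whose rate changes direction somewhere along the scale ladder.

    `by_category` is a hypothetical input shape: category name mapped to rates
    ordered by model scale. The values must come from your own runs.
    """
    return sorted(
        name for name, rates in by_category.items() if not is_monotonic(rates)
    )
```

A category appears in the returned list only when its rate both rises and falls somewhere between the smallest and largest model. This is a purely descriptive check: it does not account for sampling noise, so small reversals should not be over-read.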
Reproduce locally
While the headline analysis is embargoed, the infrastructure for reproducing it is public:
- hemlock craft / hemlock batch: generate poisoned documents
- hemlock stack up: bring up the four-framework RAG battery locally
- hemlock run: fire crafted docs at a target pipe and capture results
- hemlock attack report: render results as markdown / HTML / SARIF
You can run your own scale sweep using these primitives against any models and frameworks you have access to.
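One way to organize such a sweep is as a grid of (model, framework) cells, each reduced to a single injection rate. The sketch below is illustrative only. The framework labels are taken from this page, but `SweepCell`, `build_grid`, `run_sweep`, and the `run_cell` callable are hypothetical names, and the code does not invoke hemlock itself. It assumes you write `run_cell` to wrap your own hemlock invocation and return one boolean per attempt (True if the injection succeeded).

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterable, List, Sequence

# Framework labels follow the four frameworks named on this page.
FRAMEWORKS = ("langchain", "llamaindex", "unstructured", "haystack")


@dataclass(frozen=True)
class SweepCell:
    """One (model, framework) combination in the sweep (hypothetical type)."""
    model: str
    framework: str


def build_grid(models: Sequence[str],
               frameworks: Sequence[str] = FRAMEWORKS) -> List[SweepCell]:
    """Enumerate every model/framework pair, model-major."""
    return [SweepCell(m, f) for m, f in product(models, frameworks)]


def injection_rate(outcomes: Iterable[bool]) -> float:
    """Fraction of attempts in which the injection succeeded."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("no outcomes recorded for this cell")
    return sum(outcomes) / len(outcomes)


def run_sweep(models: Sequence[str],
              run_cell: Callable[[SweepCell], Iterable[bool]]) -> Dict[SweepCell, float]:
    """Run every cell through the user-supplied runner and collect rates.

    `run_cell` is an assumption: it stands in for however you launch a run
    against one model/framework pair and parse its per-attempt results.
    """
    return {cell: injection_rate(run_cell(cell)) for cell in build_grid(models)}
```

Keeping the runner as a plain callable means the grid and aggregation logic can be tested with a stub before any model is loaded, and the same harness works for whatever models you have access to.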
Related
- Optimization Analysis — companion page on adversarial optimization
- Joint Optimization — multi-objective scoring documentation