LLM Supply Chain Risks — Model and Dependency Attacks
LLM supply chain risks arise from the complex dependency chain involved in building AI applications: pretrained model weights, fine-tuning datasets, embedding models, vector databases, Python/JS packages, and API providers. A compromise at any point in this chain can introduce backdoors, leak data, or give attackers control over your AI system's behavior.
Attack Surfaces
- Model weights: pretrained models downloaded from Hugging Face or other hubs may contain backdoors or execute arbitrary code at load time via pickle deserialization.
- Dependencies: Python packages (transformers, langchain, etc.) can be compromised through typosquatting, dependency confusion, or maintainer account takeover.
- Fine-tuning data: third-party datasets used for fine-tuning may be poisoned.
- API providers: third-party LLM APIs can change model behavior, logging policies, or data retention without notice.
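The pickle risk above can be demonstrated with a short, harmless sketch: pickle's `__reduce__` protocol lets a serialized object specify a callable to invoke during deserialization, so merely loading an untrusted "model file" executes attacker-chosen code. The class name here is illustrative; a real attack would call something like `os.system` instead of `print`.

```python
import pickle

class MaliciousPayload:
    # __reduce__ tells pickle what to call when this object is loaded.
    # A real payload would invoke os.system or similar; print is a
    # harmless stand-in to show that code runs at load time.
    def __reduce__(self):
        return (print, ("code executed during pickle.loads!",))

blob = pickle.dumps(MaliciousPayload())
pickle.loads(blob)  # deserialization alone triggers the call
```

This is why loading a `.bin`/`.pkl` checkpoint from an untrusted source is equivalent to running an untrusted script, and why formats like safetensors, which store only raw tensor data, are preferred.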
Real-World Examples
- Pickle deserialization attacks via malicious model files on Hugging Face.
- Dependency confusion attacks targeting AI-related package names.
- Compromised npm packages in the LangChain.js ecosystem.
- Malicious fine-tuning datasets on public data repositories.
- Supply chain attacks via compromised Docker images for AI inference servers.
Defense Strategies
- Pin and hash all dependencies.
- Use the safetensors format instead of pickle for model weights.
- Audit model files before loading them.
- Verify checksums of downloaded models.
- Use private model registries with access control.
- Scan dependencies with tools like pip-audit, npm audit, and Snyk.
- Implement model signing and provenance tracking.
- Run model inference in isolated environments (containers, VMs).
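The checksum-verification step can be sketched with the standard library alone. The helper below (`verify_checksum` is an illustrative name, not a library API, and the file path is a placeholder) streams a downloaded file through SHA-256 and compares the digest against a pinned value published out-of-band:

```python
import hashlib
from pathlib import Path

def verify_checksum(path: str, expected_sha256: str, chunk_size: int = 1 << 20) -> bool:
    """Compare a file's SHA-256 digest against a pinned expected value."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte weight files don't fill memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Usage sketch: stand in a tiny file for a downloaded weights file.
Path("model.safetensors").write_bytes(b"example weights")
pinned = hashlib.sha256(b"example weights").hexdigest()
assert verify_checksum("model.safetensors", pinned)
```

Streaming comparison like this catches both corrupted downloads and files silently swapped at the source; it only helps, of course, if the expected digest comes from a trusted channel separate from the download itself.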