P2P acceleration for ML model distribution.
zest speaks HuggingFace's Xet protocol (via zig-xet) for content addressing and BitTorrent (BEP 3 / BEP 10 / BEP XET) for peer-to-peer transfer. Models download from nearby peers first, falling back to HF's CDN.
```bash
zest pull meta-llama/Llama-3.1-70B
# pulls chunks from peers via BitTorrent, falls back to HF CDN
# drop-in compatible with existing HuggingFace cache layout
```

After pulling, `transformers.AutoModel.from_pretrained("meta-llama/Llama-3.1-70B")` just works — zero workflow change.
HuggingFace replaced Git LFS with Xet storage in 2025. Xet is excellent: chunk-level deduplication (~64KB CDC chunks), content-addressed xorbs, Merkle hashing, efficient incremental uploads. But it's still centralized — every download hits HF's servers via presigned S3 URLs.
When a popular model drops, tens of thousands of people download the same immutable xorbs from the same CDN. This is the exact topology BitTorrent was invented to fix.
zest is to Xet what WebTorrent is to HTTP — same content addressing, peers serve each other.
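The chunk-level deduplication that makes this topology work can be pictured as a toy content-addressed store. This sketch uses fixed 64 KB chunks and SHA-256 for brevity; real Xet uses content-defined chunking and BLAKE3-family hashing, so treat it as an illustration of the idea, not of the actual wire format:

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # Xet targets ~64 KB chunks (via CDC, not fixed offsets)

def chunk_addresses(data: bytes) -> dict:
    """Map content hash -> chunk. Identical chunks collapse to one entry,
    which is what makes re-downloads and incremental updates cheap."""
    chunks = {}
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        # stand-in hash: Xet/zest use BLAKE3, not SHA-256
        chunks[hashlib.sha256(chunk).hexdigest()] = chunk
    return chunks

# two "model revisions" sharing most bytes dedupe at the chunk level
v1 = b"A" * CHUNK_SIZE * 3
v2 = b"A" * CHUNK_SIZE * 2 + b"B" * CHUNK_SIZE
store = chunk_addresses(v1)
store.update(chunk_addresses(v2))
print(len(store))  # → 2 unique chunks, not 6
```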
```bash
pip install zest-transfer
# or
uv pip install zest-transfer
```

This installs the zest CLI and the Python library. No Zig toolchain needed.
zest needs a HuggingFace token to download models. Set it up once:
```bash
# option 1: environment variable
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxx

# option 2: huggingface-cli (token saved to ~/.cache/huggingface/token)
pip install huggingface_hub
huggingface-cli login
```

Get your token at huggingface.co/settings/tokens.
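For reference, the two options above follow the usual huggingface_hub resolution order. The helper below is a simplified illustration of that lookup, not zest's actual token-resolution code:

```python
import os
from pathlib import Path

def find_hf_token():
    """Token lookup order (mirrors the huggingface_hub convention):
    1. HF_TOKEN environment variable
    2. token file written by `huggingface-cli login`"""
    token = os.environ.get("HF_TOKEN")
    if token:
        return token
    token_file = Path.home() / ".cache" / "huggingface" / "token"
    if token_file.exists():
        return token_file.read_text().strip()
    return None
```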
```python
import zest

zest.enable()  # monkey-patches huggingface_hub for P2P downloads
zest.pull("meta-llama/Llama-3.1-8B")  # download via P2P + CDN
zest.status()  # server stats
zest.stop()    # stop the server

# or auto-enable via env var:
# ZEST=1 python your_script.py
```

Building from source requires Zig 0.16.0+.
```bash
git clone https://github.com/praveer13/zest.git
cd zest
zig build -Doptimize=ReleaseFast
# binary at ./zig-out/bin/zest (~9 MB static binary)
```

```bash
# basic pull (CDN + DHT peer discovery)
zest pull meta-llama/Llama-3.1-8B

# specific revision
zest pull Qwen/Qwen2-7B --revision v1.0

# direct peer connection (no tracker/DHT needed)
zest pull gpt2 --peer 10.0.0.5:6881

# with BT tracker for peer discovery
zest pull meta-llama/Llama-3.1-8B --tracker http://tracker.example.com:6881

# CDN-only (disable P2P)
zest pull meta-llama/Llama-3.1-8B --no-p2p
```

```bash
# run server in foreground (BT listener on :6881, HTTP API on :9847)
zest serve

# custom ports
zest serve --http-port 8080 --listen-port 7000

# start/stop as background service
zest start
zest stop
```

The HTTP API provides:

- `GET /v1/health` — health check
- `GET /v1/status` — JSON stats (peers, chunks served, xorbs cached)
- `POST /v1/pull` — trigger model download
- `POST /v1/stop` — graceful shutdown
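These endpoints can be scripted with nothing but the standard library. A minimal client sketch; the request body shape for `/v1/pull` is an assumption, not something documented here:

```python
import json
import urllib.request

BASE = "http://localhost:9847"  # default HTTP API port from `zest serve`

def api(path, method="GET", body=None):
    """Send a request to the local zest server and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(BASE + path, data=data, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# with a `zest serve` running locally:
#   api("/v1/health")
#   api("/v1/status")["bt_peers"]
#   api("/v1/pull", "POST", {"model": "gpt2"})  # body shape is hypothetical
```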
```bash
# announce cached xorbs via BT protocol so peers can fetch from you
zest seed --tracker http://tracker.example.com:6881
```

```bash
# synthetic benchmarks (bencode, BLAKE3, wire framing)
zest bench --synthetic

# JSON output for CI
zest bench --synthetic --json
```

```bash
zest version  # print version
zest help     # show usage
```

This is the quickest way to verify end-to-end P2P transfer works.
```bash
# on each server:
pip install zest-transfer
```

On Server A:

```bash
# download a small model from HF (CDN)
zest pull gpt2

# start seeding (BT listener on :6881, HTTP API on :9847)
zest serve
```

On Server B:

```bash
# --peer tells zest to try Server A directly
zest pull gpt2 --peer <server-a-ip>:6881
```

The output will show download stats, including how much came from peers vs. the CDN.
Same test, but from Python:
```python
# server_a.py
import zest

zest.pull("gpt2")  # downloads from CDN, auto-starts server, seeds
```

```python
# server_b.py — once Server A is running
import subprocess

subprocess.run(["zest", "pull", "gpt2", "--peer", "<server-a-ip>:6881"])
```

Or with the programmatic API:
```python
# server A
import zest

zest.pull("gpt2")

# check status
print(zest.status())
# {"version": "0.3.1", "bt_peers": 0, "chunks_served": 0, "xorbs_cached": 12, ...}
```

On Server B, check the download stats output:
```
Download stats:
  From peers: 12      ← chunks came from Server A
  From CDN:    0      ← nothing from HF servers
  P2P ratio: 100.0%
```
On Server A, check the HTTP API:
```bash
curl http://localhost:9847/v1/status
# {"version":"0.3.0","bt_peers":1,"chunks_served":12,...}
```

```
┌──────────────────────────────────────────────────────────┐
│ zest pull org/model                                      │
├──────────────────────────────────────────────────────────┤
│ 1. zig-xet: list files, detect Xet-backed files          │
│ 2. For each xorb:                                        │
│    ✓ Check local cache (~/.cache/zest/xorbs/)            │
│    ✓ Try direct peers (--peer flag)                      │
│    ✓ DHT get_peers(info_hash) + BT tracker announce      │
│    ✓ BT handshake → BEP 10 → BEP XET CHUNK_REQUEST       │
│    ✓ Download chunks from peers (P2P)                    │
│    ✓ Fall back to CDN (presigned S3 URL) if needed       │
│    ✓ Verify BLAKE3 hash on every chunk                   │
│    ✓ Cache locally for future seeding                    │
├──────────────────────────────────────────────────────────┤
│ Reconstruct files → write to HF cache layout             │
│ → transformers.from_pretrained() just works              │
└──────────────────────────────────────────────────────────┘
```
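The waterfall in the diagram reduces to a simple priority order: cache, then peers, then CDN, with verification gating every result. A language-agnostic sketch in Python; the function and method names here are illustrative, not zest's actual API:

```python
def fetch_xorb(xorb_hash, cache, peers, cdn_fetch, verify):
    """Try the cheapest source first; verify every result before caching."""
    if xorb_hash in cache:                       # 1. local cache hit
        return cache[xorb_hash]
    for peer in peers:                           # 2. P2P swarm
        data = peer.request_chunks(xorb_hash)
        if data is not None and verify(xorb_hash, data):
            cache[xorb_hash] = data              # seed-while-downloading
            return data
    data = cdn_fetch(xorb_hash)                  # 3. CDN fallback
    if not verify(xorb_hash, data):
        raise ValueError("hash mismatch from CDN")
    cache[xorb_hash] = data
    return data
```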
zest implements the standard BitTorrent wire protocol with the BEP XET extension for chunk-level transfer:
- BEP 3 — Wire protocol (68-byte handshake, length-prefixed messages)
- BEP 5 — Kademlia DHT for decentralized peer discovery
- BEP 10 — Extension protocol (negotiates ut_xet support)
- BEP XET — CHUNK_REQUEST / CHUNK_RESPONSE / CHUNK_NOT_FOUND / CHUNK_ERROR
This means zest peers can interoperate with any BEP XET-compliant client, including ccbittorrent.
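For the curious, the BEP 3 handshake is easy to construct by hand. This sketch builds the 68-byte message and sets the reserved bit that BEP 10 uses to advertise extension support; the `-ZS0001-` peer-ID prefix is made up for the example:

```python
# 68-byte BEP 3 handshake: <pstrlen><pstr><reserved><info_hash><peer_id>
PSTR = b"BitTorrent protocol"

def build_handshake(info_hash, peer_id):
    assert len(info_hash) == 20 and len(peer_id) == 20
    reserved = bytearray(8)
    reserved[5] |= 0x10  # BEP 10: "supports extension protocol"
    return bytes([len(PSTR)]) + PSTR + bytes(reserved) + info_hash + peer_id

hs = build_handshake(b"\x00" * 20, b"-ZS0001-" + b"x" * 12)
assert len(hs) == 68       # 1 + 19 + 8 + 20 + 20
assert hs[20 + 5] & 0x10   # extension bit lives in reserved[5]
```

After this handshake, BEP 10's extended handshake negotiates `ut_xet`, and only then do the BEP XET chunk messages flow.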
- Uses zig-xet for Xet protocol — production-quality implementation by Frank Denis (creator of libsodium). Handles auth, CAS, chunking, hashing, compression, and reconstruction.
- Never slower than vanilla hf_xet — worst case is CDN-only (same as status quo).
- No trust required for peers — BLAKE3 hash verification on every chunk.
- HF cache compatible — writes to `~/.cache/huggingface/hub/` so all existing tooling works.
- 64KB chunks — matches HuggingFace Xet's CDC parameters for content-level interop.
- Connection pooling — persistent BT connections reused across xorb downloads.
- Cached peer discovery — DHT/tracker queried once, reused for all xorbs (30s TTL refresh).
- Direct P2P data return — P2P data used immediately, no disk cache round-trip.
- Seed-while-downloading — newly downloaded xorbs are immediately available for serving to other peers.
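The cached-peer-discovery point can be pictured as a small TTL cache wrapped around the discovery call. An illustrative sketch under that assumption (not zest's actual implementation, which uses a 30 s TTL per the list above):

```python
import time

class PeerCache:
    """Query DHT/tracker once, reuse the peer list for every xorb,
    refresh only after the TTL expires."""
    def __init__(self, discover, ttl=30.0):
        self.discover = discover     # e.g. DHT get_peers + tracker announce
        self.ttl = ttl
        self._peers = None
        self._at = 0.0

    def peers(self):
        if self._peers is None or time.monotonic() - self._at > self.ttl:
            self._peers = self.discover()
            self._at = time.monotonic()
        return self._peers
```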
```
zest/
├── build.zig                 Build configuration (Zig 0.16)
├── build.zig.zon             Package manifest (depends on zig-xet)
├── DESIGN.md                 Design document (architecture, roadmap, BEP XET details)
├── CLAUDE.md                 AI assistant context
├── README.md                 This file
├── scripts/
│   └── build-wheel.sh        Build Zig binary + Python wheel
├── python/
│   ├── pyproject.toml        Python package metadata
│   └── zest/
│       ├── __init__.py       Public API: enable(), pull(), status(), stop()
│       ├── server.py         Zig binary lifecycle management
│       ├── client.py         HTTP client for localhost API
│       └── hf_backend.py     huggingface_hub monkey-patch
├── .github/workflows/
│   └── ci.yml                CI: build, test, lint, benchmark, metrics
├── src/
│   ├── main.zig              CLI: pull, seed, serve, start, stop, bench
│   ├── root.zig              Library root, re-exports all modules
│   ├── config.zig            Cache dirs, HF token, DHT config, peer ID
│   ├── bencode.zig           Bencode encoder/decoder (BT message serialization)
│   ├── peer_id.zig           BT peer ID generation + SHA-1 info_hash
│   ├── bt_wire.zig           BT wire protocol (BEP 3 + BEP 10 framing)
│   ├── bep_xet.zig           BEP XET extension (4 message types)
│   ├── bt_peer.zig           BT peer connection lifecycle + pipelining
│   ├── peer_pool.zig         Connection pool for BT peer reuse
│   ├── dht.zig               Kademlia DHT (BEP 5) for peer discovery
│   ├── bt_tracker.zig        Standard BT HTTP tracker client
│   ├── xet_bridge.zig        Bridges zig-xet CAS with P2P swarm (cache→P2P→CDN waterfall)
│   ├── parallel_download.zig Concurrent xorb fetching via Io.Group (up to 16 parallel)
│   ├── swarm.zig             Download orchestrator (cache→peers→CDN)
│   ├── storage.zig           File I/O, HF cache refs, xorb/chunk cache, XorbRegistry
│   ├── server.zig            BT TCP listener for seeding chunks (concurrent via Io.Group)
│   ├── http_api.zig          HTTP REST API for Python integration
│   └── bench.zig             Synthetic benchmarks with JSON output
└── test/
    └── hetzner/
        └── p2p-test.sh       3-node Hetzner Cloud P2P integration test
```
Synthetic benchmark results (ReleaseFast, x86_64):
| Benchmark | Throughput | What it measures |
|---|---|---|
| blake3_64kb | 3,517 MB/s | Chunk hash verification speed |
| bt_wire_frame | 11,943 MB/s | BT message framing overhead |
| sha1_info_hash | 755 MB/s | info_hash computation |
| bencode_decode | 324 MB/s | BT message deserialization |
| bencode_encode | 206 MB/s | BT message serialization |
Run benchmarks: `zest bench --synthetic`, or `zest bench --synthetic --json` for CI.
```bash
# run all tests (72 tests across 18 modules)
zig build test --summary all

# check formatting
zig fmt --check src/
```

```bash
# build (debug)
zig build

# build (release, ~9 MB static binary)
zig build -Doptimize=ReleaseFast

# run directly
zig build run -- pull meta-llama/Llama-3.1-8B

# run tests
zig build test --summary all
```

| Path | Contents |
|---|---|
| `~/.cache/huggingface/hub/models--{org}--{name}/` | HF-compatible model cache |
| `~/.cache/huggingface/hub/models--{org}--{name}/snapshots/{commit}/` | Model files |
| `~/.cache/huggingface/hub/models--{org}--{name}/refs/main` | Commit SHA ref |
| `~/.cache/zest/xorbs/{prefix}/{hash}` | Downloaded xorbs (for seeding) |
| `~/.cache/zest/chunks/{prefix}/{hash}` | Individual chunks (for BEP XET serving) |
| `~/.cache/zest/zest.pid` | PID file for background server |
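A sketch of how the HF layout resolves: tooling reads the commit SHA from `refs/main`, then joins it into `snapshots/`. The helper name below is made up for illustration; `huggingface_hub` ships its own resolution logic:

```python
from pathlib import Path

def snapshot_dir(org, name, hub=Path.home() / ".cache" / "huggingface" / "hub"):
    """Resolve a model to its current snapshot directory via refs/main."""
    model_dir = hub / f"models--{org}--{name}"
    commit = (model_dir / "refs" / "main").read_text().strip()
    return model_dir / "snapshots" / commit
```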
- Phase 1: BT-Compliant P2P Core — BEP 3/5/10/XET, DHT, bencode, benchmarks
- Phase 2: Server Mode — BT TCP listener, HTTP REST API, serve/start/stop commands
- Phase 3: Transfer Optimizations — connection pooling, request pipelining, seed-while-downloading
- Phase 4: Python Package — `pip install zest`, HF backend hook, auto-enable via `ZEST=1`
- Phase 5: XET Bridge + Parallel Downloads — xorb-level cache→P2P→CDN waterfall, 16x concurrent downloads, thread-safe peer pool
- Phase 6: P2P Optimizations — cached peer discovery (30s TTL), direct P2P data return (no cache round-trip), larger batch depth, typed P2P errors
- Phase 7: Ecosystem — vLLM, Ollama, llama.cpp integrations
See DESIGN.md for the full design document with architecture, BEP XET compliance details, and UX plans.
- BEP XET Specification — chunk-level BitTorrent extension
- zig-xet — Zig Xet protocol implementation by Frank Denis
- Xet Protocol Spec — HuggingFace content addressing
- xet-core — Rust reference implementation