A reference architecture demonstrating Apollo GraphQL Federated Subscriptions at scale using the HTTP Callback Protocol, with a React web app, multiple Router and subgraph pods, Kafka for event streaming, Redis for subscription state management, and a full observability stack.
The project contains two subgraphs that implement the HTTP Callback Protocol in different ways, so the two approaches can be compared side by side:
| Subgraph | Approach | Scalability |
|---|---|---|
| `notifications` | `ApolloServerPluginSubscriptionCallback` (in-process state) | Single-pod (plugin holds state in memory) |
| `orders` | Manual callback protocol (Redis-backed state) | Multi-pod (any pod can service any event) |
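For orientation, the messages that either subgraph POSTs to the Router under the HTTP Callback Protocol can be sketched as follows. The field names (`kind`, `action`, `id`, `verifier`, `payload`) follow Apollo's published protocol; the helper names and the `orderStatusChanged` sample field are hypothetical and are not taken from this repository.

```typescript
// Sketch of HTTP Callback Protocol message bodies (subgraph -> Router).
// Helper names and sample values are hypothetical.

interface CallbackBase {
  kind: "subscription";
  id: string;       // subscription ID issued by the Router
  verifier: string; // opaque token the Router uses to validate the sender
}

interface CheckMessage extends CallbackBase {
  action: "check"; // "is this subscription still alive?"
}

interface NextMessage extends CallbackBase {
  action: "next"; // delivers one subscription event
  payload: { data: Record<string, unknown> };
}

interface CompleteMessage extends CallbackBase {
  action: "complete"; // tells the Router the subscription has ended
}

function checkMessage(id: string, verifier: string): CheckMessage {
  return { kind: "subscription", action: "check", id, verifier };
}

function nextMessage(
  id: string,
  verifier: string,
  data: Record<string, unknown>
): NextMessage {
  return { kind: "subscription", action: "next", id, verifier, payload: { data } };
}

function completeMessage(id: string, verifier: string): CompleteMessage {
  return { kind: "subscription", action: "complete", id, verifier };
}

console.log(JSON.stringify(checkMessage("sub-1", "v-1")));
// {"kind":"subscription","action":"check","id":"sub-1","verifier":"v-1"}
```

Both approaches in the table send these same message shapes. They differ only in where the `id`, `verifier`, and callback URL are kept between events: in the plugin's memory or in Redis.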

    ┌────────────────────────────────────────────────────────────────────────┐
    │                     React Web App (localhost:3000)                     │
    │     Apollo Client 4 · HTTP multipart subscriptions · no WebSocket      │
    └─────────────────────────────┬──────────────────────────────────────────┘
                                  │ HTTP multipart (chunked streaming)
                                  ▼
    ┌────────────────────────────────────────────────────────────────────────┐
    │                  nginx Load Balancer (localhost:4000)                  │
    │  proxy_buffering off · proxy_read_timeout 3600s · resolver 127.0.0.11  │
    └───────────────────┬──────────────────────────┬─────────────────────────┘
                        │ round-robin              │ round-robin
             ┌──────────▼──────────┐    ┌──────────▼──────────┐
             │    Apollo Router    │    │    Apollo Router    │
             │    router-0:4000    │    │    router-1:4000    │
             │ CALLBACK_PUBLIC_URL │    │ CALLBACK_PUBLIC_URL │
             │  http://router-0:   │    │  http://router-1:   │
             │    4000/callback    │    │    4000/callback    │
             └──────────┬──────────┘    └──────────┬──────────┘
                        └──────────┬───────────────┘
                                   │ HTTP Callback Protocol
              ┌────────────────────┼────────────────────┐
              ▼                    ▼                    ▼
     ┌──────────────────┐ ┌─────────────────┐ ┌──────────────────┐
     │  Notifications   │ │  Orders ×2      │ │  Users           │
     │  (4001)          │ │  (4003)         │ │  (4002)          │
     │                  │ │                 │ │                  │
     │  Plugin approach │ │  Manual Redis   │ │  Query/Mutation  │
     │  In-process state│ │  approach       │ │  only (no subs)  │
     │  Single-pod only │ │  Stateless      │ │  Stateless       │
     │                  │ │  Any pod →      │ │                  │
     │  KafkaJS consumer│ │  any delivery   │ │                  │
     └────────┬─────────┘ └────────┬────────┘ └──────────────────┘
              │                    │ KafkaJS consumer group
              │                    │ (3 partitions / 2 pods)
              └──────────┬─────────┘
                         ▼
    ┌────────────────────────────────────────────────────────────────────────┐
    │              Apache Kafka 4.x (KRaft mode, no ZooKeeper)               │
    │                                                                        │
    │  notification-events   3 partitions  keyed by userId                   │
    │  system-alerts         1 partition   keyed by userId                   │
    │  order-status-changed  3 partitions  keyed by orderId                  │
    │    (same orderId → same partition → ordered events)                    │
    └────────────────────────────────────────────────────────────────────────┘

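The note that the same `orderId` always lands on the same partition follows from key-based partitioning: the producer hashes the message key and takes the result modulo the partition count. A minimal sketch, assuming a simple 31-multiplier string hash as a stand-in (Kafka clients typically use murmur2, so real partition numbers will differ); `simpleHash` and `partitionFor` are hypothetical helpers.

```typescript
// Stand-in string hash (NOT murmur2): deterministic, unsigned 32-bit.
function simpleHash(key: string): number {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    // ">>> 0" keeps the running value in the unsigned 32-bit range.
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Maps a message key to a partition index in [0, numPartitions).
function partitionFor(key: string, numPartitions: number): number {
  return simpleHash(key) % numPartitions;
}

// Every event for one order maps to one partition, so a single consumer
// sees that order's status changes in the order they were produced.
const first = partitionFor("order-42", 3);
const second = partitionFor("order-42", 3);
console.log(first === second); // true
```

With 3 partitions shared by 2 orders pods, one pod owns two partitions and the other owns one. Ordering per `orderId` holds either way, because a partition is consumed by only one member of the group at a time.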

    ┌──────────────────────────────────────────┐  ┌──────────────────────────┐
    │                Redis 7.4                 │  │ Subscription Manager ×2  │
    │                                          │  │                          │
    │  substate:{id}       Hash    TTL 60s     │  │ Redis leader election:   │
    │    callbackUrl                           │  │   SET sub:manager:lock   │
    │    verifier                              │  │   NX EX 35               │
    │    orderId                               │  │                          │
    │    indexKey  ← full Redis key            │  │ Every 20s (leader only): │
    │    subgraph  ← for observability         │  │   SCAN substate:*        │
    │                                          │  │   → POST check to Router │
    │  subindex:{orderId}  Set     TTL 1h      │  │   → 200: EXPIRE 60s      │
    │    {sub-id-1, sub-id-2, ...}             │  │   → 404: DEL + SREM      │
    │                                          │  │                          │
    │  sub:manager:lock    String  TTL 35s     │  │ Standby takes over       │
    │    manager-<hostname>-<pid>              │  │ within 35s on crash      │
    └──────────────────────────────────────────┘  └──────────────────────────┘

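The Subscription Manager behaviour in the diagram (one leader holding a 35-second lock, then refreshing or removing subscription state based on the Router's response to a check) can be sketched with an in-memory stand-in for Redis. This is a sketch under assumptions, not the service's actual code: `ManagerStore` and its methods are hypothetical, and letting the current holder extend its own lock is a guess at how renewal works.

```typescript
// In-memory stand-ins for the Redis keys in the diagram. A real service would
// issue the equivalent Redis commands; everything here is illustrative.
interface Lock {
  holder: string; // e.g. manager-<hostname>-<pid>
  expiresAtMs: number;
}

interface SubState {
  callbackUrl: string;
  verifier: string;
  orderId: string;
  indexKey: string;    // full key of the subindex set this ID belongs to
  expiresAtMs: number; // models the 60s TTL on substate:{id}
}

const LOCK_TTL_MS = 35000;
const STATE_TTL_MS = 60000;

class ManagerStore {
  lock: Lock | null = null;
  substate: Record<string, SubState> = {};  // keyed by subscription ID
  subindex: Record<string, string[]> = {};  // keyed by full index key

  // Models SET sub:manager:lock <holder> NX EX 35. Allowing the current
  // holder to extend its own lock is an assumption about renewal.
  acquireOrRenew(holder: string, nowMs: number): boolean {
    const current = this.lock;
    if (current !== null && current.expiresAtMs > nowMs && current.holder !== holder) {
      return false; // another manager holds a live lock: stay on standby
    }
    this.lock = { holder, expiresAtMs: nowMs + LOCK_TTL_MS };
    return true;
  }

  // Applies the Router's HTTP status for a "check" POST on one subscription.
  applyCheck(id: string, status: number, nowMs: number): void {
    const state = this.substate[id];
    if (!state) return;
    if (status === 200) {
      state.expiresAtMs = nowMs + STATE_TTL_MS; // EXPIRE substate:{id} 60
    } else if (status === 404) {
      delete this.substate[id]; // DEL substate:{id}
      const members = this.subindex[state.indexKey] || [];
      this.subindex[state.indexKey] = members.filter((m) => m !== id); // SREM
    }
  }
}
```

In this model a leader that renews every 20 seconds never lets the 35-second lock lapse, while a crashed leader's lock expires on its own, which is what lets the standby take over within 35 seconds.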

    ┌────────────────────────────────────────────────────────────────────────┐
    │                             Observability                              │
    │                                                                        │
    │  Subgraphs + Router ──OTLP/HTTP:4318──▶ OTel Collector                 │
    │                                         │ traces    │ metrics          │
    │                                         ▼           ▼                  │
    │                                      Zipkin    Prometheus              │
    │                                       :9411       :9090                │
    │                                                     │                  │
    │                                                  Grafana               │
    │                                                   :3001                │
    └────────────────────────────────────────────────────────────────────────┘

Callback URLs are pod-specific. When router-0 accepts a client subscription it registers
http://router-0:4000/callback/{id} with the subgraph. That URL is stored in Redis. Any orders
pod that consumes a Kafka event reads the URL from Redis and POSTs directly to router-0,
bypassing nginx. Events always reach the right pod — the one holding the client's open
HTTP response.
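The delivery path described above can be sketched as a pure function: given the subscription IDs indexed under an order and the stored state for each, it produces the POSTs an orders pod would send. The key layout mirrors the `substate:{id}` and `subindex:{orderId}` keys from the Redis diagram; `planDeliveries`, the plain-object stand-ins for Redis, and the `orderStatusChanged` payload field are assumptions for illustration.

```typescript
interface SubState {
  callbackUrl: string; // pod-specific, e.g. http://router-0:4000/callback/{id}
  verifier: string;
  orderId: string;
}

interface CallbackRequest {
  url: string;
  body: {
    kind: "subscription";
    action: "next";
    id: string;
    verifier: string;
    payload: { data: Record<string, unknown> };
  };
}

// Turns one Kafka event for `orderId` into one POST per live subscription.
function planDeliveries(
  subindex: Record<string, string[]>,
  substate: Record<string, SubState>,
  orderId: string,
  data: Record<string, unknown>
): CallbackRequest[] {
  const ids = subindex["subindex:" + orderId] || [];
  const requests: CallbackRequest[] = [];
  for (const id of ids) {
    const state = substate["substate:" + id];
    if (!state) continue; // state expired or was already cleaned up
    requests.push({
      url: state.callbackUrl, // goes straight to the owning Router pod
      body: {
        kind: "subscription",
        action: "next",
        id,
        verifier: state.verifier,
        payload: { data },
      },
    });
  }
  return requests;
}

const requests = planDeliveries(
  { "subindex:o-1": ["sub-a", "sub-b"] },
  {
    "substate:sub-a": { callbackUrl: "http://router-0:4000/callback/sub-a", verifier: "v-a", orderId: "o-1" },
    "substate:sub-b": { callbackUrl: "http://router-1:4000/callback/sub-b", verifier: "v-b", orderId: "o-1" },
  },
  "o-1",
  { orderStatusChanged: { orderId: "o-1", status: "SHIPPED" } }
);
console.log(requests.map((r) => r.url));
```

Because the function reads everything it needs from the shared store, it does not matter which orders pod consumed the Kafka event: each subscriber's POST still targets the Router pod named in its stored callback URL.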

    # 1. Copy environment template and fill in your Apollo Studio credentials
    cp .env.example .env

    # 2. Generate the supergraph schema (requires Rover CLI)
    bash scripts/compose-supergraph.sh

    # 3. Start the full scaled stack
    docker compose -f docker-compose.scale.yml up -d --build \
      --scale orders=2 \
      --scale subscription-manager=2

    # 4. Verify all checks pass
    bash scripts/test-scale.sh

| Service | URL |
|---|---|
| Web App | http://localhost:3000 |
| Apollo Sandbox | http://localhost:4000 |
| Kafka UI | http://localhost:8080 |
| Zipkin (traces) | http://localhost:9411 |
| Prometheus | http://localhost:9090 |
| Grafana (admin/admin) | http://localhost:3001 |
| Component | Version |
|---|---|
| Apollo Federation | v2.12 |
| Apollo Router | v2.11.x |
| Apollo Server | v5.4.0 |
| Apollo Client | v4.x |
| Redis | 7.4.x |
| Kafka | 4.x (KRaft, official Apache image) |
| Node.js | 22 (bookworm-slim) |
See VERSION_MATRIX.md for all pinned dependency versions.

    apollo-subscriptions-at-scale/
    ├── docker-compose.infra.yml      # Phase 1: Redis + Kafka only
    ├── docker-compose.yml            # Full single-pod stack (dev)
    ├── docker-compose.scale.yml      # Phase 6+: scaled stack (2 Routers, 2 Orders, 2 Managers)
    ├── supergraph-config.yaml        # Rover supergraph composition config
    ├── subgraphs/
    │   ├── notifications/            # Plugin approach: ApolloServerPluginSubscriptionCallback
    │   ├── orders/                   # Manual approach: Redis-backed callback protocol
    │   └── users/                    # Query/Mutation only (federation entity resolution)
    ├── subscription-manager/         # Dedicated heartbeat service (Redis leader election)
    ├── router/                       # Apollo Router config (router.yaml + supergraph.graphql)
    ├── nginx/                        # nginx load balancer config
    ├── web-app/                      # React + Apollo Client 4 frontend
    ├── observability/                # OTel Collector, Prometheus, Grafana configs
    ├── helm/                         # Kubernetes Helm charts
    └── scripts/                      # Verification and utility scripts

| File | Purpose |
|---|---|
| WALKTHROUGH.md | Full system walkthrough — every component explained with diagrams, design decisions, and a complete request lifecycle trace |
| CLAUDE.md | Architecture reference, critical config facts, key insights, testing guide |
| implementation-roadmap.md | Phased implementation plan |
| subscription-at-scale-plan.md | Original architecture design |
MIT