Commit 94372e9

adds external svcs docs (#111)
Signed-off-by: Ashraf Fouda <ashraf.m.fouda@gmail.com>
1 parent 7447a80 commit 94372e9

1 file changed

+168
-0
lines changed

@@ -0,0 +1,168 @@
# External Services

ZOS communicates with several external services for blockchain operations, messaging, package management, and node registration. Most services have per-environment endpoints (dev, test, QA, prod), and several support redundant URLs with automatic failover.

Endpoint configuration is defined in [environment.go](../../pkg/environment/environment.go) and can be overridden at runtime via the [zos-config](https://github.com/threefoldtech/zos-config) repository (cached for 6 hours) or kernel boot parameters.

## TFChain (Substrate Blockchain)

| Environment | Endpoints |
| ----------- | --------- |
| prod | `wss://tfchain.grid.tf/`, `wss://tfchain.02.grid.tf`, `wss://02.tfchain.grid.tf/`, `wss://03.tfchain.grid.tf/`, `wss://04.tfchain.grid.tf/` |
| test | `wss://tfchain.test.grid.tf/`, `wss://tfchain.02.test.grid.tf` |
| qa | `wss://tfchain.qa.grid.tf/`, `wss://tfchain.02.qa.grid.tf/` |
| dev | `wss://tfchain.dev.grid.tf/`, `wss://tfchain.02.dev.grid.tf` |

- **Protocol**: WebSocket Secure (WSS)
- **Purpose**: Node registration, twin management, capacity reporting, contract management
- **Client**: `github.com/threefoldtech/tfchain/clients/tfchain-client-go`
- **Retry**: Exponential backoff (`cenkalti/backoff`), 500ms initial interval, 2s max interval, 5s max elapsed time. Applied to all substrate operations.
- **Override**: kernel param `substrate=` or env var `ZOS_SUBSTRATE_URL`
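
To make the retry window concrete, the following is a minimal sketch (in Python for brevity; it is not ZOS code) of the wait intervals such a policy would produce. The 500ms, 2s, and 5s values come from the list above. The 1.5 multiplier is an assumption, and the sketch ignores both jitter and the time spent inside each attempt.

```python
def backoff_schedule(initial=0.5, max_interval=2.0, max_elapsed=5.0, multiplier=1.5):
    """Return the sleep intervals (in seconds) a retry loop would use before giving up.

    multiplier=1.5 is an assumed growth factor; jitter and the duration of the
    attempts themselves are deliberately left out of this simplified model.
    """
    delays = []
    elapsed = 0.0
    interval = initial
    # Stop as soon as the next wait would push the total past the allowed window.
    while elapsed + interval <= max_elapsed:
        delays.append(interval)
        elapsed += interval
        # Grow the interval geometrically, but never beyond the cap.
        interval = min(interval * multiplier, max_interval)
    return delays
```

Under these assumptions `backoff_schedule()` yields four waits (0.5s, 0.75s, 1.125s, 1.6875s), so an operation gets roughly five attempts inside the 5s window before the error is returned to the caller.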
21+
22+
## RMB Relay (Reliable Message Bus)
23+
24+
| Environment | Endpoint |
25+
| ----------- | -------------------------- |
26+
| prod | `wss://relay.grid.tf` |
27+
| test | `wss://relay.test.grid.tf` |
28+
| qa | `wss://relay.qa.grid.tf` |
29+
| dev | `wss://relay.dev.grid.tf` |
30+
31+
- **Protocol**: WebSocket Secure (WSS)
32+
- **Purpose**: P2P messaging between nodes, request routing from users to nodes
33+
- **Client**: `github.com/threefoldtech/tfgrid-sdk-go/rmb-sdk-go`
34+
- **Retry**: Handled by the external RMB SDK library (WebSocket reconnection)
35+
- **Override**: kernel param `relay=`
36+
- **Note**: Relay URLs are stored on-chain with limited space — max 4 relays per environment
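
As an illustration of how a kernel-parameter override such as `relay=` or `substrate=` could be resolved, here is a small Python sketch. It is not the ZOS implementation: the helper names are hypothetical, and collecting repeated keys into a list is an assumption about how multiple URLs might be supplied.

```python
def parse_cmdline(cmdline):
    """Collect key=value kernel parameters; repeated keys accumulate into a list.

    Flags without '=' (for example 'quiet') are ignored in this sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params.setdefault(key, []).append(value)
    return params


def resolve_endpoints(params, key, defaults):
    """Return the override values for `key` if present, otherwise the defaults."""
    return params.get(key) or list(defaults)
```

For example, with the command line `quiet relay=wss://relay.dev.grid.tf`, `resolve_endpoints(parse_cmdline(...), "relay", ["wss://relay.grid.tf"])` returns the dev relay, while a command line without `relay=` falls back to the built-in default.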

## Hub (Package & Flist Repository)

| Service | V3 URL | V4 URL |
| ------- | ------ | ------ |
| HTTP API | `https://hub.threefold.me` | `https://v4.hub.threefold.me` |
| Redis (flist index) | `redis://hub.threefold.me:9900` | `redis://v4.hub.threefold.me:9940` |
| ZDB (storage) | `zdb://hub.threefold.me:9900` | `zdb://v4.hub.threefold.me:9940` |

- **Purpose**: Downloading OS/service packages (flists), container base images, system binaries
- **API endpoints**: `/api/flist/{repo}`, `/api/flist/{repo}/{name}/light`, `/api/flist/{repo}/tags/{tag}`
- **Binary repos**: `tf-zos-v3-bins` (prod), `tf-zos-v3-bins.test`, `tf-zos-v3-bins.qanet`, `tf-zos-v3-bins.dev`
- **Retry**: `go-retryablehttp`, 5 retries with exponential backoff, 20s HTTP timeout
- **Override**: env vars `ZOS_FLIST_URL`, `ZOS_BIN_REPO`
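
The endpoint templates above can be expanded as in this Python sketch. The base URLs and path templates are taken from this section; the function names and the flist name used in the usage note are hypothetical.

```python
from urllib.parse import quote

HUB_V3 = "https://hub.threefold.me"
HUB_V4 = "https://v4.hub.threefold.me"


def _seg(value):
    # Percent-encode a single path segment so that '/' cannot leak into the path.
    return quote(value, safe="")


def flist_repo_url(base, repo):
    """Expand /api/flist/{repo}."""
    return f"{base}/api/flist/{_seg(repo)}"


def flist_light_url(base, repo, name):
    """Expand /api/flist/{repo}/{name}/light."""
    return f"{base}/api/flist/{_seg(repo)}/{_seg(name)}/light"


def flist_tag_url(base, repo, tag):
    """Expand /api/flist/{repo}/tags/{tag}."""
    return f"{base}/api/flist/{_seg(repo)}/tags/{_seg(tag)}"
```

For instance, `flist_repo_url(HUB_V3, "tf-zos-v3-bins.dev")` gives `https://hub.threefold.me/api/flist/tf-zos-v3-bins.dev`; passing `HUB_V4` instead targets the V4 hub with the same path layout.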

## GraphQL Gateway

| Environment | Endpoints |
| ----------- | --------- |
| prod | `https://graphql.grid.threefold.me/graphql`, `https://graphql.grid.tf/graphql`, `https://graphql.02.grid.tf/graphql` |
| test | `https://graphql.test.grid.tf/graphql`, `https://graphql.02.test.grid.tf/graphql` |
| qa | `https://graphql.qa.grid.tf/graphql`, `https://graphql.02.qa.grid.tf/graphql` |
| dev | `https://graphql.dev.grid.tf/graphql`, `https://graphql.02.dev.grid.tf/graphql` |

- **Protocol**: HTTPS
- **Purpose**: Grid metadata queries, node information, contract queries
- **Retry**: No per-request retry. Sequential URL fallback: tries each endpoint in order until one succeeds
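
Sequential URL fallback without per-request retry can be sketched as follows. This is an illustrative Python helper, not the ZOS client: `fetch` is a caller-supplied function (hypothetical) that performs the query against one URL and raises on failure.

```python
def first_successful(urls, fetch):
    """Try each endpoint once, in order, and return (url, result) for the first success.

    Each URL gets exactly one attempt; a failure simply moves on to the next one.
    """
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append((url, exc))
    raise RuntimeError(f"all {len(urls)} endpoints failed: {errors}")
```

The practical consequence of this design is that a transient error on the first endpoint is not retried there; the request is only as reliable as the number of configured endpoints.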

## Activation Service

| Environment | Endpoints |
| ----------- | --------- |
| prod | `https://activation.grid.threefold.me/activation/activate`, `https://activation.grid.tf/activation/activate`, `https://activation.02.grid.tf/activation/activate` |
| test | `https://activation.test.grid.tf/activation/activate`, `https://activation.02.test.grid.tf/activation/activate` |
| qa | `https://activation.qa.grid.tf/activation/activate`, `https://activation.02.qa.grid.tf/activation/activate` |
| dev | `https://activation.dev.grid.tf/activation/activate`, `https://activation.02.dev.grid.tf/activation/activate` |

- **Protocol**: HTTPS
- **Purpose**: Twin account creation and activation
- **Retry**: Exponential backoff (`cenkalti/backoff`), 500ms initial interval, 2s max interval, 5s max elapsed time. Tries multiple URLs and moves to the next URL only on activation service errors
- **Override**: kernel param `activation=`
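
One possible reading of the retry rule above is sketched below in Python: every attempt is separated by a backoff delay, but the URL only changes when the service itself reports an error. This is an interpretation, not the ZOS code. `ServiceError`, `call`, and the wrap-around to the first URL are all assumptions made for the example.

```python
class ServiceError(Exception):
    """Hypothetical marker for an error reported by the activation service itself."""


def activate(urls, call, delays, sleep):
    """Attempt activation, switching endpoints only after a ServiceError.

    `call(url)` performs one request; `delays` are the waits between attempts;
    `sleep` is injected so the loop can be exercised without real waiting.
    """
    index = 0
    last_error = None
    for delay in [0.0] + list(delays):
        sleep(delay)
        try:
            return call(urls[index])
        except ServiceError as exc:
            last_error = exc
            index = (index + 1) % len(urls)  # assumed: rotate, wrapping around
        except OSError as exc:
            last_error = exc  # transport problem: retry the same URL
    raise RuntimeError("activation failed on all attempts") from last_error
```

In real use `sleep` would be `time.sleep` and `delays` would come from the backoff policy; passing them in keeps the control flow easy to test.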

## Registrar

| Environment | Endpoint |
| ----------- | -------- |
| prod | `https://registrar.prod4.threefold.me` |
| test | `http://registrar.test4.grid.tf` |
| qa | `https://registrar.qa4.grid.tf` |
| dev | `http://registrar.dev4.grid.tf` |

- **Purpose**: Node registration, identity management
- **Retry**: Exponential backoff (`cenkalti/backoff`), 2min max interval, indefinite retry (`MaxElapsedTime=0`): retries until registration succeeds
- **Terms of Service**: `http://zos.tf/terms/v0.1`

## KYC (Know Your Customer)

| Environment | Endpoint |
| ----------- | -------- |
| prod | `https://kyc.threefold.me` |
| test | `https://kyc.test.grid.tf` |
| qa | `https://kyc.qa.grid.tf` |
| dev | `https://kyc.dev.grid.tf` |

- **Purpose**: Identity verification, KYC compliance checks (twin verification at `/api/v1/status`)
- **Retry**: `go-retryablehttp`, 5 retries with exponential backoff, 10s HTTP timeout

## GeoIP

Shared across all environments:

- `https://geoip.threefold.me/`
- `https://geoip.grid.tf/`
- `https://02.geoip.grid.tf/`
- `https://03.geoip.grid.tf/`

- **Purpose**: Node geographic location detection (longitude, latitude, country, city)
- **Retry**: `go-retryablehttp`, 5 retries with exponential backoff, 10s HTTP timeout. Also falls back to the next URL in the list
- **Source**: [geoip.go](../../pkg/geoip/geoip.go)

## ZOS Config (Runtime Configuration)

- **Base URL**: `https://raw.githubusercontent.com/threefoldtech/zos-config/main/`
- **Files**: `dev.json`, `test.json`, `qa.json`, `prod.json`
- **Purpose**: Runtime override of all service endpoints, peer lists (Yggdrasil, Mycelium), authorized users, admin twins, rollout upgrade farms
- **Retry**: `go-retryablehttp`, 5 retries with exponential backoff, 10s HTTP timeout. Falls back to the expired cache if all retries fail
- **Cache**: 6 hours
- **Source**: [config.go](../../pkg/environment/config.go)
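
The caching behaviour described above (serve from cache for 6 hours, refresh afterwards, and fall back to the expired copy if the refresh fails) can be sketched like this. It is a simplified Python model, not a port of `config.go`; the class and function names are hypothetical, and `fetch` stands in for the HTTP download including its own retries.

```python
import time

BASE_URL = "https://raw.githubusercontent.com/threefoldtech/zos-config/main/"
CACHE_TTL = 6 * 60 * 60  # 6 hours, in seconds


def config_url(env):
    """Build the URL of the per-environment file, e.g. dev.json."""
    return f"{BASE_URL}{env}.json"


class ConfigCache:
    def __init__(self, fetch, ttl=CACHE_TTL, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._stored_at = None

    def get(self):
        now = self._clock()
        if self._stored_at is not None and now - self._stored_at < self._ttl:
            return self._value  # still fresh
        try:
            self._value = self._fetch()
            self._stored_at = now
        except Exception:
            if self._stored_at is None:
                raise  # nothing cached yet, so the failure must surface
            # Otherwise keep serving the expired copy until a refresh succeeds.
        return self._value
```

Because the timestamp is not advanced on a failed refresh, the next `get()` tries the download again instead of waiting another 6 hours.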

## Overlay Networks

### Yggdrasil

- **Listen ports**: TCP 9943, TLS 9944, LinkLocal 9945
- **Interface**: `ygg0`, MTU 65535
- **Peer list**: sourced from zos-config `yggdrasil.peers`
- **Purpose**: IPv6 mesh networking (`200::/7`), only in the full network module (not network-light)

### Mycelium

- **Peer list**: sourced from zos-config `mycelium.peers`
- **Purpose**: End-to-end encrypted mesh networking (`400::/7`)
- **Used by**: both full network and network-light modules
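
Since the two overlays use disjoint IPv6 prefixes, an address can be attributed to one of them with a simple prefix check. The Python sketch below uses only the prefixes listed above; the function name is hypothetical.

```python
import ipaddress

YGGDRASIL_NET = ipaddress.ip_network("200::/7")
MYCELIUM_NET = ipaddress.ip_network("400::/7")


def overlay_of(address):
    """Return 'yggdrasil', 'mycelium', or None for an address given as a string."""
    ip = ipaddress.ip_address(address)
    if ip.version != 6:
        return None  # both overlays are IPv6-only
    if ip in YGGDRASIL_NET:
        return "yggdrasil"
    if ip in MYCELIUM_NET:
        return "mycelium"
    return None
```

A `/7` prefix fixes the first seven bits, so `200::/7` covers addresses whose first byte is `0x02` or `0x03`, and `400::/7` covers `0x04` and `0x05`. A global address such as `2001:db8::1` matches neither.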

## Local Services

### Redis

- **Default**: `unix:///var/run/redis.sock` or `redis://localhost:6379`
- **Purpose**: IPC message bus (zbus), event queuing, stats aggregation

### Node HTTP API

- **Endpoint**: `http://[{node_ipv6}]:2021/api/v1/`
- **Purpose**: Node management API accessible over Yggdrasil/Mycelium mesh
- **Retry**: `go-retryablehttp` default client (1 retry)
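
The square brackets in the endpoint template are required because an IPv6 literal contains colons that would otherwise be confused with the port separator. A small Python sketch of building such a URL follows; the function name and the `path` argument are hypothetical, and no specific API routes are implied.

```python
import ipaddress


def node_api_url(node_ipv6, path=""):
    """Build the node API base URL for an IPv6 address, optionally with a sub-path."""
    # Parsing validates the input and normalises it to the compressed form.
    ip = ipaddress.IPv6Address(node_ipv6)
    return f"http://[{ip.compressed}]:2021/api/v1/{path.lstrip('/')}"
```

Parsing the address first means that malformed input raises `ipaddress.AddressValueError` instead of silently producing an unusable URL.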

## Summary

| Service | Protocol | Redundancy | Retry | Purpose |
| ------- | -------- | ---------- | ----- | ------- |
| TFChain | WSS | 2-5 endpoints | Exponential backoff, 5s window | Blockchain, contracts, node registry |
| RMB Relay | WSS | 1 endpoint | External SDK (WebSocket reconnect) | P2P messaging |
| Hub | HTTPS/Redis/ZDB | 1 endpoint (V3 + V4) | 5 retries, 20s timeout | Package distribution |
| GraphQL | HTTPS | 2-3 endpoints | Sequential URL fallback only | Grid metadata queries |
| Activation | HTTPS | 2-3 endpoints | Exponential backoff, 5s window + URL fallback | Account activation |
| Registrar | HTTP/HTTPS | 1 endpoint | Exponential backoff, indefinite | Node registration |
| KYC | HTTPS | 1 endpoint | 5 retries, 10s timeout | Identity verification |
| GeoIP | HTTPS | 4 endpoints | 5 retries + URL fallback | Location detection |
| ZOS Config | HTTPS | 1 endpoint (GitHub) | 5 retries + 6hr cache fallback | Runtime configuration |
| Yggdrasil | TCP/TLS | Peer mesh | Peer reconnection | IPv6 overlay network |
| Mycelium | TCP | Peer mesh | Peer reconnection | Encrypted overlay network |
