Commit 46dc603

adds primitives docs (#104)
Signed-off-by: Ashraf Fouda <ashraf.m.fouda@gmail.com>
1 parent 4ad2c1c commit 46dc603
1 file changed

+221
-0
lines changed
@@ -0,0 +1,221 @@
# Primitives Package

## Introduction

The primitives package implements the workload managers for all deployable workload types in 0-OS. It acts as the glue between the [provision engine](../provision/readme.md) and the underlying system daemons (storage, networking, VM, container, etc.), which it reaches over [zbus](https://github.com/threefoldtech/zbus).

Each workload type has a dedicated manager that knows how to provision, deprovision, and optionally update or pause that specific resource. The top-level `NewPrimitivesProvisioner` function wires all managers together into a single `provision.Provisioner` that the engine dispatches to by workload type.

## Architecture

```
                     provision engine
                             |
                      mapProvisioner
                   (dispatches by type)
                             |
        +--------+--------+--+--+--------+--------+
        |        |        |     |        |        |
     zmount   volume   network vm       zdb    gateway ...
        |        |        |     |        |        |
        v        v        v     v        v        v
      StorageStub  NetStub  VMStub  ContainerStub  GatewayStub
        (zbus)      (zbus)  (zbus)     (zbus)        (zbus)
```

## Manager Interface

Every workload manager must implement the `provision.Manager` interface:

```go
type Manager interface {
	Provision(ctx context.Context, wl *gridtypes.WorkloadWithID) (interface{}, error)
	Deprovision(ctx context.Context, wl *gridtypes.WorkloadWithID) error
}
```

Managers may optionally implement additional interfaces:

| Interface | Methods | Purpose |
|-----------|---------|---------|
| `Initializer` | `Initialize(ctx)` | Called once before the engine starts (e.g., GPU VFIO binding, ZDB container recovery) |
| `Updater` | `Update(ctx, wl)` | Live update of workload parameters (e.g., resize volume, change ZDB password) |
| `Pauser` | `Pause(ctx, wl)` / `Resume(ctx, wl)` | Suspend/resume a workload without deleting it |

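One common way for an engine to discover optional capabilities like these is a Go type assertion on the manager value. The sketch below is illustrative only: `Workload`, the return types of `Update`, and the `capabilities` helper are assumptions made for this example, not the signatures used by the `provision` package.

```go
package main

import (
	"context"
	"fmt"
)

// Workload is a hypothetical stand-in for gridtypes.WorkloadWithID.
type Workload struct{ ID string }

// Manager mirrors the two required methods described above.
type Manager interface {
	Provision(ctx context.Context, wl *Workload) (interface{}, error)
	Deprovision(ctx context.Context, wl *Workload) error
}

// Updater is an assumed shape for the optional update capability.
type Updater interface {
	Update(ctx context.Context, wl *Workload) (interface{}, error)
}

// Pauser is an assumed shape for the optional pause/resume capability.
type Pauser interface {
	Pause(ctx context.Context, wl *Workload) error
	Resume(ctx context.Context, wl *Workload) error
}

// basicManager implements only the required interface.
type basicManager struct{}

func (basicManager) Provision(context.Context, *Workload) (interface{}, error) {
	return "ok", nil
}

func (basicManager) Deprovision(context.Context, *Workload) error { return nil }

// resizableManager additionally implements Updater.
type resizableManager struct{ basicManager }

func (resizableManager) Update(context.Context, *Workload) (interface{}, error) {
	return "resized", nil
}

// capabilities reports which optional interfaces m satisfies.
func capabilities(m Manager) (canUpdate, canPause bool) {
	_, canUpdate = m.(Updater)
	_, canPause = m.(Pauser)
	return canUpdate, canPause
}

func main() {
	u, p := capabilities(basicManager{})
	fmt.Printf("basic: update=%v pause=%v\n", u, p)
	u, p = capabilities(resizableManager{})
	fmt.Printf("resizable: update=%v pause=%v\n", u, p)
}
```

With this pattern a manager opts in to updates or pausing simply by adding the methods; no registration flag is needed.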
## Workload Type Registration

All managers are registered in `provisioner.go`:

```go
func NewPrimitivesProvisioner(zbus zbus.Client) provision.Provisioner {
	managers := map[gridtypes.WorkloadType]provision.Manager{
		zos.ZMountType:           zmount.NewManager(zbus),
		zos.ZLogsType:            zlogs.NewManager(zbus),
		zos.QuantumSafeFSType:    qsfs.NewManager(zbus),
		zos.ZDBType:              zdb.NewManager(zbus),
		zos.NetworkType:          network.NewManager(zbus),
		zos.PublicIPType:         pubip.NewManager(zbus),
		zos.PublicIPv4Type:       pubip.NewManager(zbus),
		zos.ZMachineType:         vm.NewManager(zbus),
		zos.NetworkLightType:     netlight.NewManager(zbus),
		zos.ZMachineLightType:    vmlight.NewManager(zbus),
		zos.VolumeType:           volume.NewManager(zbus),
		zos.GatewayNameProxyType: gateway.NewNameManager(zbus),
		zos.GatewayFQDNProxyType: gateway.NewFQDNManager(zbus),
	}
	return provision.NewMapProvisioner(managers)
}
```

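To make the dispatch-by-type idea concrete, here is a minimal, self-contained sketch of a map-backed dispatcher. It is not the implementation behind `provision.NewMapProvisioner`; the types `WorkloadType`, `Workload`, `echoManager`, `mapDispatcher`, and the error `ErrUnknownType` are all hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// WorkloadType is a hypothetical stand-in for gridtypes.WorkloadType.
type WorkloadType string

// Workload carries only the fields the dispatcher needs in this sketch.
type Workload struct {
	Type WorkloadType
	Name string
}

// Manager is reduced to a single method to keep the example short.
type Manager interface {
	Provision(wl Workload) (string, error)
}

// ErrUnknownType is an illustrative error, not one defined by the package.
var ErrUnknownType = errors.New("unknown workload type")

// echoManager is a fake manager that reports what it was asked to do.
type echoManager struct{ kind string }

func (m echoManager) Provision(wl Workload) (string, error) {
	return m.kind + " provisioned " + wl.Name, nil
}

// mapDispatcher routes each workload to the manager registered for its type.
type mapDispatcher struct {
	managers map[WorkloadType]Manager
}

func (d mapDispatcher) Provision(wl Workload) (string, error) {
	m, ok := d.managers[wl.Type]
	if !ok {
		return "", fmt.Errorf("%w: %q", ErrUnknownType, wl.Type)
	}
	return m.Provision(wl)
}

// newDispatcher registers two fake managers, one per workload type.
func newDispatcher() mapDispatcher {
	return mapDispatcher{managers: map[WorkloadType]Manager{
		"volume": echoManager{kind: "volume"},
		"zmount": echoManager{kind: "zmount"},
	}}
}

func main() {
	d := newDispatcher()
	out, err := d.Provision(Workload{Type: "volume", Name: "data"})
	fmt.Println(out, err)
	out, err = d.Provision(Workload{Type: "bogus", Name: "x"})
	fmt.Println(out, err)
}
```

Registering the same manager under two keys, as the real table does for the two public IP types, needs no special handling in a dispatcher of this shape.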
## Workload Types

### ZMount (`zmount`)

Allocates a raw virtual disk (sparse file with COW disabled) via `StorageModule.DiskCreate`. Used as a boot disk or an additional data disk for VMs.

- **Supports**: Provision, Deprovision, Update (grow only, not while VM is running)
- **Storage**: SSD pools only
- **zbus stubs**: `StorageModuleStub`, `VMModuleStub`

### Volume (`volume`)

Creates a btrfs subvolume with quota via `StorageModule.VolumeCreate`. Unlike a zmount (block device), a volume is a filesystem path that can be bind-mounted into VMs as a shared directory (virtio-fs).

- **Supports**: Provision, Deprovision, Update (grow only)
- **Storage**: SSD pools only
- **zbus stubs**: `StorageModuleStub`

### Network (`network`)

Creates a WireGuard-based private network resource via `Networker.CreateNR`. Carries the full `zos.Network` config including WireGuard peers and subnet allocations.

- **Supports**: Provision, Deprovision, Update (upsert semantics)
- **zbus stubs**: `NetworkerStub`

### Network Light (`network-light`)

Creates a lightweight Mycelium-only network via `NetworkerLight.Create`. Used by light VMs that only need overlay networking.

- **Supports**: Provision, Deprovision, Update (upsert semantics)
- **zbus stubs**: `NetworkerLightStub`

### ZMachine (`vm`)

The most complex workload type. Provisions either:

- **Container VM** (flist without `/image.raw`): Mounts flist read-write with a btrfs volume overlay, injects cloud-container kernel + initrd, boots with VirtioFS.
- **Full VM** (flist with `/image.raw`): Writes the disk image to the first ZMount, boots directly from disk using `hypervisor-fw`.

Network interfaces are set up as tap devices: private network (WireGuard), optional Yggdrasil (planetary), optional Mycelium, optional public IPv4/IPv6. GPU passthrough is supported on rented nodes via VFIO.

- **Supports**: Provision, Deprovision, Initialize (GPU VFIO binding), Pause/Resume (VM lock)
- **zbus stubs**: `VMModuleStub`, `FlisterStub`, `StorageModuleStub`, `NetworkerStub`

### ZMachine Light (`vm-light`)

Same as ZMachine but uses `NetworkerLightStub`. Only supports Mycelium and private network interfaces (no Yggdrasil, no public IP).

- **Supports**: Provision, Deprovision, Initialize, Pause/Resume
- **zbus stubs**: `VMModuleStub`, `FlisterStub`, `StorageModuleStub`, `NetworkerLightStub`

### ZDB (`zdb`)

Manages [0-db](https://github.com/threefoldtech/0-DB) namespaces. Multiple ZDB namespaces share a single container per storage device. On provision:

1. Finds a container with free space, or allocates a new HDD device and starts a new container (via flist mount + `ContainerModule.Run`)
2. Connects to the ZDB unix socket, creates the namespace with the requested mode/password/size
3. Returns IPs (public, yggdrasil, mycelium) + port 9900

- **Supports**: Provision, Deprovision, Initialize (restart crashed containers, upgrade flist), Update (resize, change password/public; the mode cannot be changed), Pause/Resume (namespace lock)
- **Storage**: HDD pools only
- **zbus stubs**: `StorageModuleStub`, `FlisterStub`, `ContainerModuleStub`, `NetworkerStub` or `NetworkerLightStub`

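The first provisioning step (reuse a container with room, otherwise allocate a new device) can be sketched as a first-fit search with a fallback. This is a simplified illustration under assumed names: `zdbContainer`, `allocator`, and `pickContainer` are hypothetical and do not correspond to functions in the `zdb` manager.

```go
package main

import "fmt"

// zdbContainer is a hypothetical view of one 0-db container and its device.
type zdbContainer struct {
	Device    string
	FreeBytes uint64
}

// allocator stands in for "allocate a new HDD device and start a container".
type allocator func() (zdbContainer, error)

// pickContainer returns the index of the first container that can hold size
// bytes. If none can, it allocates a new one and appends it to the slice.
func pickContainer(cs []zdbContainer, size uint64, alloc allocator) ([]zdbContainer, int, error) {
	for i, c := range cs {
		if c.FreeBytes >= size {
			return cs, i, nil
		}
	}
	fresh, err := alloc()
	if err != nil {
		return cs, -1, fmt.Errorf("allocate device: %w", err)
	}
	if fresh.FreeBytes < size {
		return cs, -1, fmt.Errorf("new device %s too small for %d bytes", fresh.Device, size)
	}
	return append(cs, fresh), len(cs), nil
}

func main() {
	cs := []zdbContainer{{Device: "sda", FreeBytes: 10}}
	alloc := func() (zdbContainer, error) {
		return zdbContainer{Device: "sdb", FreeBytes: 100}, nil
	}
	cs, i, err := pickContainer(cs, 50, alloc)
	fmt.Println(cs[i].Device, err)
}
```

A real manager would also have to account for the namespace after creating it; the sketch leaves the bookkeeping of `FreeBytes` out.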
### QSFS (`qsfs`)

Mounts a [Quantum Safe Filesystem](https://github.com/threefoldtech/quantum-storage) via `QSFSDModule.Mount`. Returns a mount path and metrics endpoint.

- **Supports**: Provision, Deprovision, Update
- **zbus stubs**: `QSFSDStub`

### Public IP (`pubip`)

Allocates and configures a public IPv4 and/or IPv6 address for a VM. Selects IPs from the contract's reserved pool, computes the IPv6 SLAAC address from the node's public prefix, and sets up nftables filter rules.

`PublicIPv4Type` is kept for backward compatibility and maps to the same manager.

- **Supports**: Provision, Deprovision
- **zbus stubs**: `NetworkerStub`, `SubstrateGatewayStub`

### Gateway Name Proxy (`gateway/name`)

Sets up a reverse proxy where a subdomain is allocated by the grid. Calls `Gateway.SetNamedProxy` and returns the assigned FQDN.

- **Supports**: Provision, Deprovision
- **zbus stubs**: `GatewayStub`

### Gateway FQDN Proxy (`gateway/fqdn`)

Sets up a reverse proxy for a user-owned FQDN. Calls `Gateway.SetFQDNProxy`.

- **Supports**: Provision, Deprovision
- **zbus stubs**: `GatewayStub`

### ZLogs (`zlogs`)

Attaches a log stream from a running VM to an external destination. Finds the referenced ZMachine workload, extracts its network namespace, then calls `VMModule.StreamCreate`.

- **Supports**: Provision, Deprovision
- **zbus stubs**: `VMModuleStub`, `NetworkerStub`

## Helper Packages

### vmgpu

Shared GPU utility used by both `vm` and `vm-light`:

- `InitGPUs()`: Loads VFIO kernel modules, unbinds boot VGA if needed, binds all GPUs in each IOMMU group to the `vfio-pci` driver.
- `ExpandGPUs(gpus)`: For each requested GPU, returns all PCI devices in the same IOMMU group that must be passed through together (excludes PCI bridges and audio controllers).

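The group-expansion rule described for `ExpandGPUs` can be illustrated on in-memory data. The sketch below is an assumption-laden model, not the `vmgpu` code: `PCIDevice`, its `Kind` field (a plain string instead of a PCI class code), and `expandGPUs` are hypothetical.

```go
package main

import "fmt"

// PCIDevice is a hypothetical, simplified description of a PCI device.
type PCIDevice struct {
	Slot       string // PCI address, e.g. "0000:01:00.0"
	Kind       string // "gpu", "audio", "bridge", ... (stands in for the class code)
	IOMMUGroup int
}

// expandGPUs returns every device sharing an IOMMU group with a requested
// slot, skipping bridges and audio controllers as described above.
func expandGPUs(requested []string, all []PCIDevice) []PCIDevice {
	groupOf := make(map[string]int, len(all))
	for _, d := range all {
		groupOf[d.Slot] = d.IOMMUGroup
	}
	wanted := make(map[int]bool)
	for _, slot := range requested {
		if g, ok := groupOf[slot]; ok {
			wanted[g] = true
		}
	}
	var out []PCIDevice
	for _, d := range all {
		if !wanted[d.IOMMUGroup] || d.Kind == "bridge" || d.Kind == "audio" {
			continue
		}
		out = append(out, d)
	}
	return out
}

func main() {
	all := []PCIDevice{
		{Slot: "0000:00:01.0", Kind: "bridge", IOMMUGroup: 1},
		{Slot: "0000:01:00.0", Kind: "gpu", IOMMUGroup: 1},
		{Slot: "0000:01:00.1", Kind: "audio", IOMMUGroup: 1},
		{Slot: "0000:01:00.2", Kind: "usb", IOMMUGroup: 1},
		{Slot: "0000:02:00.0", Kind: "gpu", IOMMUGroup: 2},
	}
	for _, d := range expandGPUs([]string{"0000:01:00.0"}, all) {
		fmt.Println(d.Slot, d.Kind)
	}
}
```

Devices in the same IOMMU group cannot be isolated from each other, which is why the whole group is handed to the VM together.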
## Statistics Interceptor

`Statistics` wraps the inner `Provisioner` as middleware. Before provisioning, it checks whether the node has enough capacity (memory, primarily) to satisfy the workload's requirements. It computes usable memory as:

```
usable = total_ram - max(theoretical_reserved, actual_used)
```

Where `theoretical_reserved` is the sum of all active workload MRU claims. This prevents over-commitment even when VMs haven't yet used their full allocation.

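The formula translates directly into code. The following sketch uses hypothetical function names (`usableMemory`, `canFit`) and adds one assumption that the text does not state: the result is clamped at zero so that unsigned arithmetic cannot underflow when the committed amount exceeds total RAM.

```go
package main

import "fmt"

// usableMemory applies usable = total - max(reserved, used).
// Clamping at zero is an assumption of this sketch.
func usableMemory(total, reserved, used uint64) uint64 {
	committed := reserved
	if used > committed {
		committed = used
	}
	if committed >= total {
		return 0
	}
	return total - committed
}

// canFit reports whether a workload asking for requested bytes can be admitted.
func canFit(total, reserved, used, requested uint64) bool {
	return requested <= usableMemory(total, reserved, used)
}

func main() {
	const gib = uint64(1) << 30
	// 64 GiB node, 48 GiB claimed by workloads, only 20 GiB actually in use.
	fmt.Println(usableMemory(64*gib, 48*gib, 20*gib) / gib) // prints 16
	fmt.Println(canFit(64*gib, 48*gib, 20*gib, 32*gib))     // prints false
}
```

In the example the node looks mostly idle by actual usage (20 GiB), yet a 32 GiB request is rejected because 48 GiB is already promised to existing workloads.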
It also injects the current consumed capacity into the context so downstream managers can access it via `primitives.GetCapacity(ctx)`.

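Passing a value through `context.Context` usually follows the standard unexported-key pattern. The sketch below shows that pattern in isolation; `Capacity`, `withCapacity`, `getCapacity`, `downstream`, and `intercept` are hypothetical names, and the real `primitives.GetCapacity` may have a different signature.

```go
package main

import (
	"context"
	"fmt"
)

// Capacity is a hypothetical, simplified capacity record.
type Capacity struct {
	MemoryBytes uint64
}

// capacityKey is unexported so no other package can collide with it.
type capacityKey struct{}

// withCapacity returns a child context carrying c.
func withCapacity(ctx context.Context, c Capacity) context.Context {
	return context.WithValue(ctx, capacityKey{}, c)
}

// getCapacity reads the value back; ok is false if nothing was injected.
func getCapacity(ctx context.Context) (c Capacity, ok bool) {
	c, ok = ctx.Value(capacityKey{}).(Capacity)
	return c, ok
}

// downstream plays the role of a manager that consults the injected value.
func downstream(ctx context.Context) string {
	c, ok := getCapacity(ctx)
	if !ok {
		return "no capacity in context"
	}
	return fmt.Sprintf("used memory: %d bytes", c.MemoryBytes)
}

// intercept plays the role of the middleware: inject, then call downstream.
func intercept(current Capacity) string {
	ctx := withCapacity(context.Background(), current)
	return downstream(ctx)
}

func main() {
	fmt.Println(intercept(Capacity{MemoryBytes: 42}))
	fmt.Println(downstream(context.Background()))
}
```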
`NewStatisticsStream` provides a streaming interface (`pkg.Statistics`) that:
- Streams capacity updates every 2 minutes
- Reports total/used/system capacity, deployment counts, open TCP connections
- Lists GPUs with their allocation status (which contract is using each GPU)

## Key Patterns

### Idempotent Provision

Most managers check if a resource already exists before creating it. If it does, they return `provision.ErrNoActionNeeded` and the engine skips writing a new transaction. This makes re-provisioning on reboot safe.

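A sentinel error checked with `errors.Is` is enough to express this contract. The sketch below is a toy model: the local `ErrNoActionNeeded` only stands in for `provision.ErrNoActionNeeded`, and `volumeManager` and `engine` are hypothetical types that track state in memory.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoActionNeeded stands in for provision.ErrNoActionNeeded.
var ErrNoActionNeeded = errors.New("no action needed")

// volumeManager is a fake manager backed by an in-memory set of names.
type volumeManager struct{ existing map[string]bool }

// Provision creates the resource unless it already exists.
func (m *volumeManager) Provision(name string) error {
	if m.existing[name] {
		return ErrNoActionNeeded
	}
	m.existing[name] = true
	return nil
}

// engine counts how many state transactions it records.
type engine struct{ transactions int }

// apply provisions name and records a transaction only if work was done.
func (e *engine) apply(m *volumeManager, name string) error {
	err := m.Provision(name)
	if errors.Is(err, ErrNoActionNeeded) {
		return nil // resource already present: skip the write
	}
	if err != nil {
		return err
	}
	e.transactions++
	return nil
}

func main() {
	m := &volumeManager{existing: map[string]bool{}}
	e := &engine{}
	_ = e.apply(m, "7-123-data") // first boot: creates the resource
	_ = e.apply(m, "7-123-data") // replay after reboot: no-op
	fmt.Println(e.transactions)  // prints 1
}
```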
### Workload ID as Resource Name

All resources are named `wl.ID.String()` (format: `<twin>-<contractID>-<name>`), making them globally unique and deterministic across reboots.

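As a small illustration of the naming format, the sketch below builds such a name from its three parts. The struct layout and field types are assumptions made for the example; the actual ID type is defined in `gridtypes` and may be represented differently.

```go
package main

import "fmt"

// WorkloadID is a hypothetical struct holding the three parts of the name.
type WorkloadID struct {
	Twin     uint32
	Contract uint64
	Name     string
}

// String renders the ID as <twin>-<contractID>-<name>.
func (id WorkloadID) String() string {
	return fmt.Sprintf("%d-%d-%s", id.Twin, id.Contract, id.Name)
}

func main() {
	id := WorkloadID{Twin: 7, Contract: 123, Name: "data"}
	fmt.Println(id) // prints 7-123-data
}
```

Because the name is derived purely from the workload's identity, the same workload maps to the same resource on every boot, which is what the existence checks in the managers rely on.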
### Full vs Light Stack

The codebase has two parallel stacks:

| | Full | Light |
|--|------|-------|
| Network | `NetworkType` (WireGuard) | `NetworkLightType` (Mycelium only) |
| VM | `ZMachineType` | `ZMachineLightType` |
| Stubs | `NetworkerStub` | `NetworkerLightStub` |
| Features | WireGuard + Yggdrasil + Mycelium + Public IP | Mycelium only |

ZDB detects which stack to use via `kernel.GetParams().IsLight()`.

### Provision Order

The engine provisions workloads in type order (networks before VMs, storage before VMs) and deprovisions in reverse order. Within the same type, ZMount and Volume workloads are sorted largest-first.
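
The ordering rule can be sketched with a stable sort on two keys. Everything in this example is assumed for illustration: the `Workload` struct, the concrete ranks in `typeOrder`, and the helper names. The engine's actual type ranking is not listed in this document.

```go
package main

import (
	"fmt"
	"sort"
)

// Workload is a hypothetical, simplified workload record.
type Workload struct {
	Type string
	Name string
	Size uint64 // only meaningful for zmount and volume
}

// typeOrder is an assumed ranking: networks and storage come before VMs.
var typeOrder = map[string]int{"network": 0, "zmount": 1, "volume": 2, "zmachine": 3}

// sortForProvision orders by type rank, then largest-first within a type.
func sortForProvision(wls []Workload) {
	sort.SliceStable(wls, func(i, j int) bool {
		a, b := wls[i], wls[j]
		if typeOrder[a.Type] != typeOrder[b.Type] {
			return typeOrder[a.Type] < typeOrder[b.Type]
		}
		return a.Size > b.Size
	})
}

// deprovisionOrder returns the reverse of an already sorted slice.
func deprovisionOrder(wls []Workload) []Workload {
	out := make([]Workload, len(wls))
	for i, wl := range wls {
		out[len(wls)-1-i] = wl
	}
	return out
}

func main() {
	wls := []Workload{
		{Type: "zmachine", Name: "vm"},
		{Type: "zmount", Name: "small", Size: 10},
		{Type: "network", Name: "net"},
		{Type: "zmount", Name: "big", Size: 50},
	}
	sortForProvision(wls)
	for _, wl := range wls {
		fmt.Println(wl.Type, wl.Name)
	}
}
```

Placing the largest disks first is a common way to reduce the chance that small allocations fragment the pools before a large one is attempted; the document does not state the engine's motivation, so treat that reading as a guess.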