|
| 1 | +# Primitives Package |
| 2 | + |
| 3 | +## Introduction |
| 4 | + |
| 5 | +The primitives package implements the workload managers for all deployable workload types in 0-OS. It acts as the glue between the [provision engine](../provision/readme.md) and the underlying system daemons (storage, networking, VM, container, etc.), which it reaches over [zbus](https://github.com/threefoldtech/zbus). |
| 6 | + |
| 7 | +Each workload type has a dedicated manager that knows how to provision, deprovision, and optionally update or pause that specific resource. The top-level `NewPrimitivesProvisioner` function wires all managers together into a single `provision.Provisioner` that the engine dispatches to by workload type. |
| 8 | + |
| 9 | +## Architecture |
| 10 | + |
| 11 | +``` |
| 12 | +                            provision engine |
| 13 | +                                    | |
| 14 | +                             mapProvisioner |
| 15 | +                          (dispatches by type) |
| 16 | +                                    | |
| 17 | +      +-----------+-----------+-----+-----+-----------+-----------+ |
| 18 | +      |           |           |           |           |           | |
| 19 | +   zmount      volume      network       vm          zdb       gateway  ... |
| 20 | +      |           |           |           |           |           | |
| 21 | +      v           v           v           v           v           v |
| 22 | +       StorageStub         NetStub     VMStub  ContainerStub  GatewayStub |
| 23 | +          (zbus)           (zbus)      (zbus)     (zbus)        (zbus) |
| 24 | +``` |
| 25 | + |
| 26 | +## Manager Interface |
| 27 | + |
| 28 | +Every workload manager must implement the `provision.Manager` interface: |
| 29 | + |
| 30 | +```go |
| 31 | +type Manager interface { |
| 32 | + Provision(ctx context.Context, wl *gridtypes.WorkloadWithID) (interface{}, error) |
| 33 | + Deprovision(ctx context.Context, wl *gridtypes.WorkloadWithID) error |
| 34 | +} |
| 35 | +``` |
| 36 | + |
| 37 | +Managers may optionally implement additional interfaces: |
| 38 | + |
| 39 | +| Interface | Methods | Purpose | |
| 40 | +|-----------|---------|---------| |
| 41 | +| `Initializer` | `Initialize(ctx)` | Called once before the engine starts (e.g., GPU VFIO binding, ZDB container recovery) | |
| 42 | +| `Updater` | `Update(ctx, wl)` | Live update of workload parameters (e.g., resize volume, change ZDB password) | |
| 43 | +| `Pauser` | `Pause(ctx, wl)` / `Resume(ctx, wl)` | Suspend/resume a workload without deleting it | |
| 44 | + |
| 45 | +## Workload Type Registration |
| 46 | + |
| 47 | +All managers are registered in `provisioner.go`: |
| 48 | + |
| 49 | +```go |
| 50 | +func NewPrimitivesProvisioner(zbus zbus.Client) provision.Provisioner { |
| 51 | + managers := map[gridtypes.WorkloadType]provision.Manager{ |
| 52 | + zos.ZMountType: zmount.NewManager(zbus), |
| 53 | + zos.ZLogsType: zlogs.NewManager(zbus), |
| 54 | + zos.QuantumSafeFSType: qsfs.NewManager(zbus), |
| 55 | + zos.ZDBType: zdb.NewManager(zbus), |
| 56 | + zos.NetworkType: network.NewManager(zbus), |
| 57 | + zos.PublicIPType: pubip.NewManager(zbus), |
| 58 | + zos.PublicIPv4Type: pubip.NewManager(zbus), |
| 59 | + zos.ZMachineType: vm.NewManager(zbus), |
| 60 | + zos.NetworkLightType: netlight.NewManager(zbus), |
| 61 | + zos.ZMachineLightType: vmlight.NewManager(zbus), |
| 62 | + zos.VolumeType: volume.NewManager(zbus), |
| 63 | + zos.GatewayNameProxyType: gateway.NewNameManager(zbus), |
| 64 | + zos.GatewayFQDNProxyType: gateway.NewFQDNManager(zbus), |
| 65 | + } |
| 66 | + return provision.NewMapProvisioner(managers) |
| 67 | +} |
| 68 | +``` |
| 69 | + |
| 70 | +## Workload Types |
| 71 | + |
| 72 | +### ZMount (`zmount`) |
| 73 | + |
| 74 | +Allocates a raw virtual disk (sparse file with COW disabled) via `StorageModule.DiskCreate`. Used as a boot disk or an additional data disk for VMs. |
| 75 | + |
| 76 | +- **Supports**: Provision, Deprovision, Update (grow only, not while VM is running) |
| 77 | +- **Storage**: SSD pools only |
| 78 | +- **zbus stubs**: `StorageModuleStub`, `VMModuleStub` |
| 79 | + |
| 80 | +### Volume (`volume`) |
| 81 | + |
| 82 | +Creates a btrfs subvolume with quota via `StorageModule.VolumeCreate`. Unlike a zmount (block device), a volume is a filesystem path that can be bind-mounted into VMs as a shared directory (virtio-fs). |
| 83 | + |
| 84 | +- **Supports**: Provision, Deprovision, Update (grow only) |
| 85 | +- **Storage**: SSD pools only |
| 86 | +- **zbus stubs**: `StorageModuleStub` |
| 87 | + |
| 88 | +### Network (`network`) |
| 89 | + |
| 90 | +Creates a WireGuard-based private network resource via `Networker.CreateNR`. Carries the full `zos.Network` config including WireGuard peers and subnet allocations. |
| 91 | + |
| 92 | +- **Supports**: Provision, Deprovision, Update (upsert semantics) |
| 93 | +- **zbus stubs**: `NetworkerStub` |
| 94 | + |
| 95 | +### Network Light (`network-light`) |
| 96 | + |
| 97 | +Creates a lightweight Mycelium-only network via `NetworkerLight.Create`. Used by light VMs that only need overlay networking. |
| 98 | + |
| 99 | +- **Supports**: Provision, Deprovision, Update (upsert semantics) |
| 100 | +- **zbus stubs**: `NetworkerLightStub` |
| 101 | + |
| 102 | +### ZMachine (`vm`) |
| 103 | + |
| 104 | +The most complex workload type. Provisions either: |
| 105 | + |
| 106 | +- **Container VM** (flist without `/image.raw`): Mounts the flist read-write with a btrfs volume overlay, injects the cloud-container kernel and initrd, and boots with virtio-fs. |
| 107 | +- **Full VM** (flist with `/image.raw`): Writes the disk image to the first ZMount and boots directly from disk using `hypervisor-fw`. |
| 108 | + |
| 109 | +Network interfaces are set up as tap devices: private network (WireGuard), optional Yggdrasil (planetary), optional Mycelium, optional public IPv4/IPv6. GPU passthrough is supported on rented nodes via VFIO. |
| 110 | + |
| 111 | +- **Supports**: Provision, Deprovision, Initialize (GPU VFIO binding), Pause/Resume (VM lock) |
| 112 | +- **zbus stubs**: `VMModuleStub`, `FlisterStub`, `StorageModuleStub`, `NetworkerStub` |
| 113 | + |
| 114 | +### ZMachine Light (`vm-light`) |
| 115 | + |
| 116 | +Same as ZMachine but uses `NetworkerLightStub`. Only supports Mycelium and private network interfaces (no Yggdrasil, no public IP). |
| 117 | + |
| 118 | +- **Supports**: Provision, Deprovision, Initialize, Pause/Resume |
| 119 | +- **zbus stubs**: `VMModuleStub`, `FlisterStub`, `StorageModuleStub`, `NetworkerLightStub` |
| 120 | + |
| 121 | +### ZDB (`zdb`) |
| 122 | + |
| 123 | +Manages [0-db](https://github.com/threefoldtech/0-DB) namespaces. Multiple ZDB namespaces share a single container per storage device. On provision: |
| 124 | + |
| 125 | +1. Finds a container with free space, or allocates a new HDD device and starts a new container (via flist mount + `ContainerModule.Run`) |
| 126 | +2. Connects to the ZDB unix socket, creates the namespace with the requested mode/password/size |
| 127 | +3. Returns IPs (public, yggdrasil, mycelium) + port 9900 |
| 128 | + |
| 129 | +- **Supports**: Provision, Deprovision, Initialize (restart crashed containers, upgrade flist), Update (resize, change password/public — not mode), Pause/Resume (namespace lock) |
| 130 | +- **Storage**: HDD pools only |
| 131 | +- **zbus stubs**: `StorageModuleStub`, `FlisterStub`, `ContainerModuleStub`, `NetworkerStub` or `NetworkerLightStub` |
| 132 | + |
| 133 | +### QSFS (`qsfs`) |
| 134 | + |
| 135 | +Mounts a [Quantum Safe Filesystem](https://github.com/threefoldtech/quantum-storage) via `QSFSDModule.Mount`. Returns a mount path and metrics endpoint. |
| 136 | + |
| 137 | +- **Supports**: Provision, Deprovision, Update |
| 138 | +- **zbus stubs**: `QSFSDStub` |
| 139 | + |
| 140 | +### Public IP (`pubip`) |
| 141 | + |
| 142 | +Allocates and configures a public IPv4 and/or IPv6 address for a VM. Selects IPs from the contract's reserved pool, computes the IPv6 SLAAC address from the node's public prefix, and sets up nftables filter rules. |
| 143 | + |
| 144 | +`PublicIPv4Type` is kept for backward compatibility and maps to the same manager. |
| 145 | + |
| 146 | +- **Supports**: Provision, Deprovision |
| 147 | +- **zbus stubs**: `NetworkerStub`, `SubstrateGatewayStub` |
| 148 | + |
| 149 | +### Gateway Name Proxy (`gateway/name`) |
| 150 | + |
| 151 | +Sets up a reverse proxy where a subdomain is allocated by the grid. Calls `Gateway.SetNamedProxy` and returns the assigned FQDN. |
| 152 | + |
| 153 | +- **Supports**: Provision, Deprovision |
| 154 | +- **zbus stubs**: `GatewayStub` |
| 155 | + |
| 156 | +### Gateway FQDN Proxy (`gateway/fqdn`) |
| 157 | + |
| 158 | +Sets up a reverse proxy for a user-owned FQDN. Calls `Gateway.SetFQDNProxy`. |
| 159 | + |
| 160 | +- **Supports**: Provision, Deprovision |
| 161 | +- **zbus stubs**: `GatewayStub` |
| 162 | + |
| 163 | +### ZLogs (`zlogs`) |
| 164 | + |
| 165 | +Attaches a log stream from a running VM to an external destination. Finds the referenced ZMachine workload, extracts its network namespace, then calls `VMModule.StreamCreate`. |
| 166 | + |
| 167 | +- **Supports**: Provision, Deprovision |
| 168 | +- **zbus stubs**: `VMModuleStub`, `NetworkerStub` |
| 169 | + |
| 170 | +## Helper Packages |
| 171 | + |
| 172 | +### vmgpu |
| 173 | + |
| 174 | +Shared GPU utility used by both `vm` and `vm-light`: |
| 175 | + |
| 176 | +- `InitGPUs()`: Loads VFIO kernel modules, unbinds the boot VGA device if needed, and binds all GPUs in each IOMMU group to the `vfio-pci` driver. |
| 177 | +- `ExpandGPUs(gpus)`: For each requested GPU, returns all PCI devices in the same IOMMU group that must be passed through together (excludes PCI bridges and audio controllers). |
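
The group expansion described for `ExpandGPUs` can be sketched over an in-memory device list. This is a simplified model, not the `vmgpu` code: the `pciDevice` type, the string class labels, and identifying GPUs by PCI slot alone are assumptions for the example (the real helper reads device information from the system).

```go
package main

import (
	"fmt"
	"sort"
)

// pciDevice is a simplified, hypothetical description of a PCI device.
type pciDevice struct {
	Slot  string // PCI address, e.g. "0000:01:00.0"
	Group int    // IOMMU group number
	Class string // simplified class label: "vga", "audio", "bridge", ...
}

// expandGPUs returns the slots of every device that shares an IOMMU group
// with one of the requested slots, skipping PCI bridges and audio controllers.
func expandGPUs(requested []string, all []pciDevice) []string {
	want := make(map[string]bool, len(requested))
	for _, r := range requested {
		want[r] = true
	}

	// Collect the IOMMU groups of the requested devices.
	groups := map[int]bool{}
	for _, d := range all {
		if want[d.Slot] {
			groups[d.Group] = true
		}
	}

	var out []string
	for _, d := range all {
		if !groups[d.Group] || d.Class == "bridge" || d.Class == "audio" {
			continue
		}
		out = append(out, d.Slot)
	}
	sort.Strings(out)
	return out
}

func main() {
	all := []pciDevice{
		{Slot: "0000:01:00.0", Group: 1, Class: "vga"},
		{Slot: "0000:01:00.1", Group: 1, Class: "audio"},
		{Slot: "0000:01:00.2", Group: 1, Class: "usb"},
		{Slot: "0000:00:01.0", Group: 1, Class: "bridge"},
		{Slot: "0000:02:00.0", Group: 2, Class: "vga"},
	}
	fmt.Println(expandGPUs([]string{"0000:01:00.0"}, all)) // [0000:01:00.0 0000:01:00.2]
}
```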
| 178 | + |
| 179 | +## Statistics Interceptor |
| 180 | + |
| 181 | +`Statistics` wraps the inner `Provisioner` as middleware. Before provisioning, it checks whether the node has enough capacity (primarily memory) to satisfy the workload's requirements. It computes usable memory as: |
| 182 | + |
| 183 | +``` |
| 184 | +usable = total_ram - max(theoretical_reserved, actual_used) |
| 185 | +``` |
| 186 | + |
| 187 | +Where `theoretical_reserved` is the sum of all active workload MRU claims. This prevents over-commitment even when VMs haven't yet used their full allocation. |
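
A worked example of the formula, as a sketch: the function names are hypothetical, and clamping the result at zero when consumption exceeds total RAM is an assumption added to avoid unsigned underflow.

```go
package main

import "fmt"

// usableMemory applies usable = total - max(reserved, used).
// The result is clamped at zero (an assumption of this sketch).
func usableMemory(total, reserved, used uint64) uint64 {
	consumed := reserved
	if used > consumed {
		consumed = used
	}
	if consumed >= total {
		return 0
	}
	return total - consumed
}

// hasEnough reports whether a workload needing `required` bytes still fits.
func hasEnough(total, reserved, used, required uint64) bool {
	return required <= usableMemory(total, reserved, used)
}

func main() {
	const gib = 1 << 30

	// 16 GiB node, 8 GiB claimed by workloads, only 3 GiB actually in use:
	// the claims win, so 8 GiB remain usable.
	fmt.Println(usableMemory(16*gib, 8*gib, 3*gib) / gib) // 8

	// Same node, but actual use (10 GiB) exceeds the claims: 6 GiB remain.
	fmt.Println(usableMemory(16*gib, 8*gib, 10*gib) / gib) // 6

	// A 7 GiB workload no longer fits in the second case.
	fmt.Println(hasEnough(16*gib, 8*gib, 10*gib, 7*gib)) // false
}
```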
| 188 | + |
| 189 | +It also injects the current consumed capacity into the context so downstream managers can access it via `primitives.GetCapacity(ctx)`. |
| 190 | + |
| 191 | +`NewStatisticsStream` provides a streaming interface (`pkg.Statistics`) that: |
| 192 | +- Streams capacity updates every 2 minutes |
| 193 | +- Reports total/used/system capacity, deployment counts, open TCP connections |
| 194 | +- Lists GPUs with their allocation status (which contract is using each GPU) |
| 195 | + |
| 196 | +## Key Patterns |
| 197 | + |
| 198 | +### Idempotent Provision |
| 199 | + |
| 200 | +Most managers check whether a resource already exists before creating it. If it does, they return `provision.ErrNoActionNeeded`, and the engine skips writing a new transaction. This makes re-provisioning after a reboot safe. |
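
The pattern can be sketched with a sentinel error and `errors.Is`. Everything here is a stand-in: `ErrNoActionNeeded` mimics `provision.ErrNoActionNeeded`, while `store` and `engineStep` are hypothetical and only model the "skip the transaction" behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoActionNeeded stands in for provision.ErrNoActionNeeded.
var ErrNoActionNeeded = errors.New("no action needed")

// store is a hypothetical in-memory resource backend.
type store struct{ disks map[string]uint64 }

// provision creates the disk unless it already exists.
func (s *store) provision(name string, size uint64) error {
	if _, exists := s.disks[name]; exists {
		return ErrNoActionNeeded
	}
	s.disks[name] = size
	return nil
}

// engineStep mimics an engine that records a transaction only for real changes.
func engineStep(s *store, name string, size uint64) (recorded bool, err error) {
	err = s.provision(name, size)
	if errors.Is(err, ErrNoActionNeeded) {
		return false, nil // resource already there: nothing to record
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	s := &store{disks: map[string]uint64{}}

	recorded, _ := engineStep(s, "1-2-disk", 10)
	fmt.Println(recorded) // true

	// Replaying the same workload, e.g. after a reboot, changes nothing.
	recorded, _ = engineStep(s, "1-2-disk", 10)
	fmt.Println(recorded) // false
}
```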
| 201 | + |
| 202 | +### Workload ID as Resource Name |
| 203 | + |
| 204 | +Every resource is named after its workload ID, `wl.ID.String()` (format: `<twin>-<contractID>-<name>`), which makes names globally unique and deterministic across reboots. |
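
To make the naming format concrete, here is a small sketch that builds and parses such an ID. The `workloadID` type and `parseWorkloadID` are hypothetical and only follow the `<twin>-<contractID>-<name>` format stated above; the integer widths are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// workloadID is a hypothetical stand-in for the gridtypes workload ID.
type workloadID struct {
	Twin     uint32
	Contract uint64
	Name     string
}

// String renders the ID as <twin>-<contractID>-<name>.
func (id workloadID) String() string {
	return fmt.Sprintf("%d-%d-%s", id.Twin, id.Contract, id.Name)
}

// parseWorkloadID splits on the first two dashes only, so workload names
// that themselves contain dashes survive a round trip.
func parseWorkloadID(s string) (workloadID, error) {
	parts := strings.SplitN(s, "-", 3)
	if len(parts) != 3 {
		return workloadID{}, fmt.Errorf("invalid workload id %q", s)
	}
	twin, err := strconv.ParseUint(parts[0], 10, 32)
	if err != nil {
		return workloadID{}, fmt.Errorf("invalid twin in %q: %w", s, err)
	}
	contract, err := strconv.ParseUint(parts[1], 10, 64)
	if err != nil {
		return workloadID{}, fmt.Errorf("invalid contract in %q: %w", s, err)
	}
	return workloadID{Twin: uint32(twin), Contract: contract, Name: parts[2]}, nil
}

func main() {
	id := workloadID{Twin: 7, Contract: 1234, Name: "data-disk"}
	fmt.Println(id) // 7-1234-data-disk

	back, err := parseWorkloadID(id.String())
	fmt.Println(back == id, err) // true <nil>
}
```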
| 205 | + |
| 206 | +### Full vs Light Stack |
| 207 | + |
| 208 | +The codebase has two parallel stacks: |
| 209 | + |
| 210 | +| | Full | Light | |
| 211 | +|--|------|-------| |
| 212 | +| Network | `NetworkType` (WireGuard) | `NetworkLightType` (Mycelium only) | |
| 213 | +| VM | `ZMachineType` | `ZMachineLightType` | |
| 214 | +| Stubs | `NetworkerStub` | `NetworkerLightStub` | |
| 215 | +| Features | WireGuard + Yggdrasil + Mycelium + Public IP | Mycelium only | |
| 216 | + |
| 217 | +ZDB detects which stack to use via `kernel.GetParams().IsLight()`. |
| 218 | + |
| 219 | +### Provision Order |
| 220 | + |
| 221 | +The engine provisions workloads in type order (networks before VMs, storage before VMs) and deprovisions in reverse order. Within the same type, ZMount and Volume workloads are sorted largest-first. |
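
The ordering rule can be sketched with a stable sort. The type ranks below are assumed for illustration (the engine defines its own order), and the `Workload` struct is a simplified stand-in.

```go
package main

import (
	"fmt"
	"sort"
)

// Workload is a simplified stand-in carrying only what the ordering needs.
type Workload struct {
	Type string
	Name string
	Size uint64 // only meaningful for zmount and volume
}

// typeOrder is an assumed ranking: networks and storage before VMs.
var typeOrder = map[string]int{"network": 0, "zmount": 1, "volume": 2, "zmachine": 3}

// sortForProvision orders by type rank, then largest-first for disks and volumes.
func sortForProvision(wls []Workload) {
	sort.SliceStable(wls, func(i, j int) bool {
		a, b := wls[i], wls[j]
		if typeOrder[a.Type] != typeOrder[b.Type] {
			return typeOrder[a.Type] < typeOrder[b.Type]
		}
		if a.Type == "zmount" || a.Type == "volume" {
			return a.Size > b.Size
		}
		return false // keep the original order for other types
	})
}

// reversed returns a copy in reverse order, as used for deprovisioning.
func reversed(wls []Workload) []Workload {
	out := make([]Workload, len(wls))
	for i, wl := range wls {
		out[len(wls)-1-i] = wl
	}
	return out
}

func main() {
	wls := []Workload{
		{Type: "zmachine", Name: "vm1"},
		{Type: "zmount", Name: "small", Size: 10},
		{Type: "network", Name: "net"},
		{Type: "zmount", Name: "big", Size: 50},
	}
	sortForProvision(wls)
	for _, wl := range wls {
		fmt.Print(wl.Name, " ") // net big small vm1
	}
	fmt.Println()
	for _, wl := range reversed(wls) {
		fmt.Print(wl.Name, " ") // vm1 small big net
	}
	fmt.Println()
}
```

Sorting disks largest-first presumably helps large allocations succeed before the pool fragments, though the source does not state the motivation.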