Commit 5de0d0c

rcloud: Add experimental private cloud infrastructure for bare metal
This introduces rcloud - an experimental Rust-based REST API server and Terraform provider for managing virtual machines on bare metal servers using libvirt and kdevops guestfs base images. NixOS support can be added later.

Problem Statement:

For a long time one of the most difficult things to support on kdevops is users with different distributions and the different libvirt setting requirements. While one possibility is to build distribution packages (rpms/debs) for a pre-defined kdevops setup, another alternative is to abstract away guest management entirely and let users interact with a private local cloud solution via REST API and Terraform.

Existing options evaluated:

* OpenStack - heavy, complex, dated architecture
* Ubicloud - Ruby-based

rcloud provides a lightweight, Rust-based alternative designed specifically for kernel development and testing workflows.

Administrator Setup:

System administrators install rcloud once per bare metal server:

```bash
make defconfig-rcloud
make
make rcloud
```

This deploys:

- rcloud REST API server (systemd service on port 8765)
- Terraform provider for Infrastructure as Code
- Integration with kdevops guestfs base images
- Prometheus metrics endpoint for monitoring

The admin is responsible for creating base images that users will consume:

```bash
make rcloud-base-images
```

User Testing Workflow:

Once rcloud is installed by an admin, users can provision VMs without requiring root access or libvirt configuration knowledge:

```bash
make defconfig-rcloud-guest-test
make
make bringup
```

This enables:

- REST API access for VM lifecycle management (create, start, stop, destroy)
- Terraform-based VM provisioning
- Base image discovery and selection
- Health monitoring and status reporting

Implementation Details:

The rcloud implementation leverages:

- REST API for VM lifecycle management
- kdevops guestfs base images (no rebuilding required)
- libvirt for underlying VM management
- Rust for performance, safety, and reliability
- Prometheus metrics endpoint for observability
- Systemd integration for service management

All Rust code is formatted using Linux kernel rustfmt standards from .rustfmt.toml (added in install-rust-deps commit) and verified with cargo clippy for code quality.

Documentation:

- Architecture and design: workflows/rcloud/design.md
- Testing guide: workflows/rcloud/docs/testing.md
- Authentication (future): workflows/rcloud/docs/authentication.md

Generated-by: Claude AI
Acked-by: Chuck Lever <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
1 parent e5bbda9 commit 5de0d0c

60 files changed: +8036 -2 lines changed

Makefile

Lines changed: 5 additions & 0 deletions
@@ -179,6 +179,7 @@ endif # WORKFLOW_KOTD_ENABLE
 DEFAULT_DEPS += $(DEFAULT_DEPS_REQS_EXTRA_VARS)
 
 include scripts/install-menuconfig-deps.Makefile
+include scripts/install-rcloud-deps.Makefile
 
 include Makefile.btrfs_progs
 
@@ -197,6 +198,10 @@ endif # CONFIG_HYPERVISOR_TUNING
 include Makefile.linux-mirror
 include Makefile.docker-mirror
 
+ifeq (y,$(CONFIG_RCLOUD))
+include workflows/rcloud/Makefile
+endif
+
 ifeq (y,$(CONFIG_KDEVOPS_DISTRO_REG_METHOD_TWOLINE))
 DEFAULT_DEPS += playbooks/secret.yml
 endif

README.md

Lines changed: 1 addition & 0 deletions
@@ -377,6 +377,7 @@ want to just use the kernel that comes with your Linux distribution.
 * [kdevops CXL docs](docs/cxl.md)
 * [kdevops NFS docs](docs/nfs.md)
 * [kdevops selftests docs](docs/selftests.md)
+* [kdevops rcloud docs](workflows/rcloud/docs/TESTING.md)
 * [kdevops reboot-limit docs](docs/reboot-limit.md)
 * [kdevops AI workflow docs](docs/ai/README.md)
 * [kdevops vLLM workflow docs](workflows/vllm/)

defconfigs/rcloud

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+# rcloud server-only configuration
+#
+# PURPOSE: Production deployment of rcloud REST API server
+#
+# Use this when:
+# - Deploying rcloud to a dedicated server
+# - Base images already exist (created elsewhere or separately)
+# - You only need the rcloud server component
+#
+# For testing/development, use 'defconfig-rcloud-guest-test' instead,
+# which includes both base image creation AND rcloud server setup.
+#
+# This defconfig:
+# - Installs the rcloud REST API server
+# - Enables the Terraform provider
+# - Configures the systemd service
+# - Does NOT create base images (assumes they exist)
+# - Does NOT create test VMs (rcloud creates them on demand via API)
+#
+# After 'make rcloud', the API will be available at:
+# http://localhost:8765/api/v1/health
+#
+# Test with:
+# make rcloud-status # Check health
+# curl http://localhost:8765/api/v1/vms # List VMs
+# terraform apply # Use Terraform provider
+
+# Skip interactive bringup (rcloud server doesn't need test VMs)
+CONFIG_SKIP_BRINGUP=y
+
+# Disable test workflows (this is a cloud infrastructure server)
+CONFIG_WORKFLOWS=n
+
+# Enable guestfs for base image management
+CONFIG_GUESTFS=y
+
+# Use Debian nocloud variant for local VM testing
+# The nocloud variant doesn't have cloud-init and is designed for bare metal/local use
+# The generic variant requires cloud-init which virt-builder removes, breaking networking
+CONFIG_GUESTFS_DEBIAN_TRIXIE_NOCLOUD_AMD64=y
+
+# Don't copy host APT sources to guest base images
+# Base images should use standard Debian repositories for reliability
+# Custom repositories can be added during per-user VM customization
+# Note: Must disable GUESTFS_DEBIAN_COPY_HOST_SOURCES (not COPY_SOURCES directly)
+# because Kconfig 'select' cannot be overridden
+CONFIG_GUESTFS_DEBIAN_COPY_HOST_SOURCES=n
+
+# Enable rcloud REST API server
+CONFIG_RCLOUD=y
+CONFIG_RCLOUD_SERVER_BIND="127.0.0.1:8765"
+CONFIG_RCLOUD_WORKERS=4
+CONFIG_RCLOUD_ENABLE_TERRAFORM_PROVIDER=y
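
The defconfig comments above name two endpoints, `/api/v1/health` and `/api/v1/vms`, served on the configured bind address. A minimal sketch of a smoke check built on those two paths might look like the following. The helper names `rcloud_api_url` and `rcloud_smoke_check` are hypothetical and not part of rcloud or kdevops, and only HTTP success is checked because the response format is not shown in this commit excerpt.

```shell
# Sketch only: helper names are hypothetical, not part of rcloud or kdevops.

# Build an API URL from a bind address (as in CONFIG_RCLOUD_SERVER_BIND)
# and an endpoint name such as "health" or "vms".
rcloud_api_url() {
    local bind="$1" endpoint="$2"
    printf 'http://%s/api/v1/%s\n' "$bind" "$endpoint"
}

# Probe the two endpoints mentioned in the defconfig comments.
# Only success or failure of the HTTP request is checked, because the
# response body format is not described in this commit excerpt.
rcloud_smoke_check() {
    local bind="${1:-127.0.0.1:8765}" endpoint
    for endpoint in health vms; do
        if curl --fail --silent --show-error --max-time 5 \
                "$(rcloud_api_url "$bind" "$endpoint")" >/dev/null; then
            echo "ok: $endpoint"
        else
            echo "FAILED: $endpoint" >&2
            return 1
        fi
    done
}
```

Running `rcloud_smoke_check` with no argument probes the default bind address from this defconfig; passing another `HOST:PORT` value probes that address instead.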

defconfigs/rcloud-guest-test

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+# rcloud user/client configuration
+#
+# PURPOSE: Provision VMs through rcloud REST API using Terraform
+#
+# This defconfig is for USERS who want to provision VMs through an existing
+# rcloud server. The admin must have already set up the rcloud server using
+# defconfig-rcloud.
+#
+# Typical workflow:
+# # Admin setup (one time):
+# make defconfig-rcloud
+# make
+# make rcloud # Sets up rcloud server and base images
+#
+# # User workflow (on same system or remotely):
+# make defconfig-rcloud-guest-test
+# make
+# make bringup # Provisions VMs through rcloud API using Terraform
+#
+# This treats rcloud as a local cloud provider, just like AWS or Azure.
+# Users provision VMs through the standard Terraform workflow.
+
+# Basic system configuration
+CONFIG_KDEVOPS_HOSTS_PREFIX="rcloud-test"
+
+# Number of VMs to create
+CONFIG_KDEVOPS_NODES=1
+
+# VM resources (requested from rcloud)
+CONFIG_LIBVIRT_MACHINE_TYPE_Q35=y
+CONFIG_LIBVIRT_HOST_PASSTHROUGH=y
+CONFIG_LIBVIRT_MEMORY_MB=4096
+CONFIG_LIBVIRT_VCPUS=2
+
+# Disk configuration
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_NVME=y
+CONFIG_LIBVIRT_NVME_DISK_SIZE_GIB=50
+
+# Disable test workflows (rcloud provides the infrastructure)
+CONFIG_WORKFLOWS=n
+
+# Use Terraform to provision VMs through rcloud API
+# This treats rcloud as a local cloud provider, similar to AWS/Azure
+CONFIG_TERRAFORM=y
+CONFIG_TERRAFORM_RCLOUD=y
+CONFIG_TERRAFORM_RCLOUD_API_URL="http://localhost:8765"
+# Use nocloud variant to match the base image created by defconfig-rcloud
+# The nocloud variant doesn't have cloud-init and works better for local testing
+CONFIG_TERRAFORM_RCLOUD_BASE_IMAGE="debian-13-nocloud-amd64-daily.raw"
+
+# IMPORTANT: SSH key configuration for rcloud service
+# The rcloud service cannot access files in ~/.ssh/ due to directory permissions.
+# You MUST configure the SSH key path to a location accessible by the rcloud service.
+# Recommended: Store keys in your kdevops directory, for example:
+# CONFIG_TERRAFORM_SSH_CONFIG_PUBKEY_FILE="/path/to/kdevops/kdevops_terraform.pub"
+# The private key path will automatically derive from the public key path (removes .pub suffix).
+# After loading this defconfig, run 'make menuconfig' and update the SSH paths under:
+# "Bring up methods" -> "Terraform ssh configuration" -> "SSH public key file"
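
The SSH key comments above describe two rules: the private key path is the public key path with the `.pub` suffix removed, and keys under `~/.ssh/` are not readable by the rcloud service. A small sketch of both rules, assuming hypothetical helper names (`derive_privkey_path`, `key_is_under_dot_ssh`) that do not appear in the commit, could be:

```shell
# Sketch only: function names are hypothetical.

# Derive the private key path by removing a trailing ".pub", mirroring the
# rule described in the defconfig comment.
derive_privkey_path() {
    local pubkey="$1"
    printf '%s\n' "${pubkey%.pub}"
}

# Return 0 if the key path lives under the given home directory's .ssh/,
# which the defconfig comment says the rcloud service cannot read.
# The home directory defaults to $HOME when no second argument is given.
key_is_under_dot_ssh() {
    local path="$1" home="${2:-$HOME}"
    case "$path" in
        "$home"/.ssh/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A pre-flight check before `make bringup` could call `key_is_under_dot_ssh` on the configured public key path and stop early with a pointer to the menuconfig entry named in the comment.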

kconfigs/workflows/Kconfig

Lines changed: 2 additions & 0 deletions
@@ -614,3 +614,5 @@ config KDEVOPS_WORKFLOW_NAME
 endif
 
 endif # WORKFLOWS
+
+source "workflows/rcloud/Kconfig"

playbooks/install-rcloud-deps.yml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+- name: Install rcloud build dependencies
+  hosts: localhost
+  roles:
+    - role: install-rcloud-deps

playbooks/rcloud.yml

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
+---
+- name: Install and configure rcloud REST API server
+  hosts: localhost
+  become: yes
+  become_method: sudo
+  tasks:
+    - name: Ensure rcloud binary was built
+      stat:
+        path: "{{ playbook_dir }}/../workflows/rcloud/target/release/rcloud"
+      register: rcloud_binary
+      failed_when: not rcloud_binary.stat.exists
+
+    - name: Install rcloud binary
+      copy:
+        src: "{{ playbook_dir }}/../workflows/rcloud/target/release/rcloud"
+        dest: /usr/local/bin/rcloud
+        mode: '0755'
+        owner: root
+        group: root
+
+    - name: Create rcloud system user
+      user:
+        name: rcloud
+        system: yes
+        shell: /usr/sbin/nologin
+        home: /var/lib/rcloud
+        create_home: yes
+
+    - name: Create rcloud configuration directory
+      file:
+        path: /etc/rcloud
+        state: directory
+        mode: '0755'
+        owner: root
+        group: root
+
+    - name: Create systemd service file
+      copy:
+        dest: /etc/systemd/system/rcloud.service
+        mode: '0644'
+        owner: root
+        group: root
+        content: |
+          [Unit]
+          Description=rcloud REST API server for VM management
+          Documentation=https://github.com/linux-kdevops/kdevops
+          After=network.target libvirtd.service
+          Requires=libvirtd.service
+
+          [Service]
+          Type=simple
+          User=rcloud
+          Group=rcloud
+          WorkingDirectory={{ topdir_path }}
+          Environment="KDEVOPS_ROOT={{ topdir_path }}"
+          Environment="RUST_LOG=info"
+          Environment="RCLOUD_STORAGE_POOL_PATH={{ kdevops_storage_pool_path | default(libvirt_storage_pool_path) }}"
+          Environment="RCLOUD_BASE_IMAGES_DIR={{ guestfs_base_image_dir }}"
+          Environment="RCLOUD_LIBVIRT_URI={{ libvirt_uri | default('qemu:///system') }}"
+          Environment="RCLOUD_NETWORK_BRIDGE={{ libvirt_bridge_name | default('default') }}"
+          ExecStart=/usr/local/bin/rcloud
+          Restart=on-failure
+          RestartSec=5s
+
+          # Security hardening
+          NoNewPrivileges=true
+          PrivateTmp=true
+          ProtectSystem=strict
+          ProtectHome=true
+          ReadWritePaths=/var/lib/rcloud {{ kdevops_storage_pool_path | default(libvirt_storage_pool_path) }}
+          ReadOnlyPaths={{ topdir_path }} {{ kdevops_storage_pool_path | default(libvirt_storage_pool_path) }}/guestfs
+
+          [Install]
+          WantedBy=multi-user.target
+
+    - name: Add rcloud user to libvirt-qemu group
+      user:
+        name: rcloud
+        groups: "{{ libvirt_qemu_group | default('libvirt-qemu') }}"
+        append: yes
+
+    - name: Reload systemd daemon
+      systemd:
+        daemon_reload: yes
+
+    - name: Enable rcloud service
+      systemd:
+        name: rcloud
+        enabled: yes
+
+    - name: Extract port from bind address
+      set_fact:
+        rcloud_port: "{{ (rcloud_server_bind | default('127.0.0.1:8765')).split(':')[1] }}"
+
+    - name: Display service status instructions
+      debug:
+        msg:
+          - "rcloud service installed successfully"
+          - "Start the service with: sudo systemctl start rcloud"
+          - "Check status with: sudo systemctl status rcloud"
+          - "View logs with: sudo journalctl -u rcloud -f"
+          - ""
+          - "API will be available at: http://{{ rcloud_server_bind | default('127.0.0.1:8765') }}"
+          - ""
+          - "Test with:"
+          - " curl http://localhost:{{ rcloud_port }}/api/v1/health"
+          - " curl http://localhost:{{ rcloud_port }}/api/v1/status"
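
The "Extract port from bind address" task takes the second colon-separated field of the bind address, which is the port for an IPv4 `HOST:PORT` value such as the default `127.0.0.1:8765`. For a bracketed IPv6 value such as `[::1]:8765`, the second field would be empty. Whether rcloud accepts IPv6 bind addresses is not stated in this commit, so the following is only a sketch of an alternative that takes the text after the last colon instead. The helper name `port_from_bind` is hypothetical.

```shell
# Sketch only: port_from_bind is a hypothetical helper, not part of the commit.

# Print the port of a HOST:PORT bind address by taking the text after the
# last colon, so a bracketed IPv6 form like "[::1]:8765" also yields 8765.
# Falls back to the default bind address used by the playbook.
port_from_bind() {
    local bind="${1:-127.0.0.1:8765}"
    printf '%s\n' "${bind##*:}"
}
```

The same idea could be expressed in the playbook by indexing the last element of the split list rather than the second, but that change is a suggestion and not something this commit does.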

playbooks/roles/base_image/templates/virt-builder.j2

Lines changed: 6 additions & 2 deletions
@@ -34,12 +34,16 @@ sm-unregister
 {% if guestfs_debian is defined and guestfs_debian %}
 {# Ugh, debian has to be told to bring up the network and regenerate ssh keys #}
 {# Hope we get that interface name right! #}
-install isc-dhcp-client,ifupdown
+install isc-dhcp-client,ifupdown,openssh-server
 mkdir /etc/network/interfaces.d/
 append-line /etc/network/interfaces.d/enp1s0:auto enp1s0
 append-line /etc/network/interfaces.d/enp1s0:allow-hotplug enp1s0
 append-line /etc/network/interfaces.d/enp1s0:iface enp1s0 inet dhcp
-firstboot-command systemctl disable systemd-networkd-wait-online.service
+run-command systemctl disable systemd-networkd.service
+run-command systemctl disable systemd-networkd-wait-online.service
+run-command systemctl mask systemd-networkd.service
+run-command systemctl enable networking.service
+run-command systemctl enable [email protected]
 firstboot-command systemctl stop ssh
 firstboot-command DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true dpkg-reconfigure -p low --force openssh-server
 firstboot-command systemctl start ssh

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# rcloud Terraform provider configuration
+rcloud_api_url = "{{ terraform_rcloud_api_url }}"
+rcloud_base_image = "{{ terraform_rcloud_base_image }}"
+
+# SSH configuration
+ssh_config_pubkey_file = "{{ kdevops_terraform_ssh_config_pubkey_file }}"
+ssh_config_privkey_file = "{{ kdevops_terraform_ssh_config_privkey_file }}"
+ssh_config_user = "{{ kdevops_terraform_ssh_config_user }}"
+ssh_config = "{{ sshconfig }}"
+ssh_config_port = {{ ansible_cfg_ssh_port }}
+# Use unique SSH config file per directory to avoid conflicts
+ssh_config_name = "{{ kdevops_ssh_config_prefix }}{{ topdir_path_sha256sum[:8] }}"
+
+ssh_config_update = {{ kdevops_terraform_ssh_config_update | lower }}
+ssh_config_use_strict_settings = {{ kdevops_terraform_ssh_config_update_strict | lower }}
+ssh_config_backup = {{ kdevops_terraform_ssh_config_update_backup | lower }}
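
The `ssh_config_name` line in the template above appends the first eight characters of `topdir_path_sha256sum` to a prefix so that each kdevops directory gets its own SSH config file. A rough shell equivalent is sketched below. It assumes that `topdir_path_sha256sum` is the SHA-256 hex digest of the top-level directory path string, which this commit excerpt does not show, and the function name `ssh_config_name` is hypothetical.

```shell
# Sketch only: assumes topdir_path_sha256sum is the SHA-256 hex digest of the
# kdevops top-level directory path string; the commit excerpt does not show
# how that variable is actually computed.

# Print PREFIX followed by the first 8 hex characters of the digest of TOPDIR.
ssh_config_name() {
    local prefix="$1" topdir="$2" digest
    digest=$(printf '%s' "$topdir" | sha256sum | cut -c1-8)
    printf '%s%s\n' "$prefix" "$digest"
}
```

Two checkouts in different directories get different suffixes, while repeated runs in the same directory keep producing the same name, which is what avoids the conflicts mentioned in the template comment.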

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+---
+- name: Install rcloud-specific build dependencies
+  become: true
+  become_method: sudo
+  ansible.builtin.apt:
+    name:
+      - libvirt-dev
+    state: present
+    update_cache: false
+  tags: ["rcloud", "deps"]
