Commit f16df34

docs: add boot process documentation
Add documentation of Bottlerocket's boot sequence, including systemd target progression, service dependencies, and synchronization mechanisms. Signed-off-by: Sean P. Kelly <[email protected]>
1 parent 9b7adf3 commit f16df34

1 file changed

+382
-0
lines changed

docs/boot-process.md

Lines changed: 382 additions & 0 deletions
@@ -0,0 +1,382 @@
# Bottlerocket Boot Process

This document describes how Bottlerocket boots, focusing on the systemd target progression and service dependencies.

**Keywords:** boot, systemd, targets, preconfigured, configured, multi-user, fipscheck, drivers, sysinit, services, dependencies, ordering, API system, bootstrap containers, settings, startup, initialization, kernel modules

## Overview

Bottlerocket's boot sequence progresses through six main stages, each represented by a systemd target:

```
sysinit.target
    ↓
fipscheck.target (FIPS mode only)
    ↓
drivers.target (kernel module loading)
    ↓
preconfigured.target (API system initialization)
    ↓
configured.target (bootstrap containers)
    ↓
multi-user.target (workload services)
```

Each stage must complete before the next begins. Services use systemd dependencies (`After=`, `Requires=`, `Wants=`) to coordinate within and between stages.
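
The stage order above can be captured in a small shell helper, for example when a script needs to know which target follows the one that appears stuck. This is only a sketch: `BOOT_STAGES` and `next_stage` are illustrative names, not part of Bottlerocket, and `fipscheck.target` is skipped on non-FIPS boots.

```bash
# Boot stages in the order listed above (fipscheck.target applies in FIPS mode only).
BOOT_STAGES="sysinit.target fipscheck.target drivers.target preconfigured.target configured.target multi-user.target"

# next_stage TARGET: print the stage that follows TARGET.
# Returns non-zero if TARGET is unknown or is the last stage.
# The body runs in a subshell so its variables do not leak to the caller.
next_stage() (
  found=0
  for stage in $BOOT_STAGES; do
    if [ "$found" -eq 1 ]; then
      printf '%s\n' "$stage"
      exit 0
    fi
    if [ "$stage" = "$1" ]; then
      found=1
    fi
  done
  exit 1
)
```

For example, `next_stage drivers.target` prints `preconfigured.target`.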

## Boot Stages

### Stage 0: sysinit.target

Standard systemd initialization. Most services implicitly depend on this through `DefaultDependencies=yes` (the default).

Early services that need to run before normal dependency chains use `DefaultDependencies=no`:

- Filesystem preparation (`prepare-var.service`, `prepare-boot.service`, etc.)
- Data store migration (`migrator.service`)

### Stage 1: fipscheck.target

**Purpose:** Verify cryptographic module integrity when FIPS mode is enabled.

**When it runs:** Only when the kernel command line includes `fips=1`.

**Key services:**

- `check-kernel-integrity.service` - Verifies kernel integrity
- `check-fips-modules.service` - Loads and tests the `tcrypt` module
  - Creates `/etc/.fips-module-check-passed` sentinel file on success
  - Blocks boot if FIPS checks fail

**Transition:** Completes before `drivers.target` begins.
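
Whether this stage applies to a given boot can be checked from the kernel command line. The helper below is a sketch, not Bottlerocket code: `fips_enabled` is a hypothetical name, and it only looks for the `fips=1` token described above.

```bash
# fips_enabled CMDLINE: succeed if the kernel command line contains fips=1.
# Padding with spaces lets the pattern match a whole token only.
fips_enabled() {
  case " $1 " in
    *" fips=1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a running host; /proc/cmdline holds the booted kernel's command line.
if [ -r /proc/cmdline ] && fips_enabled "$(cat /proc/cmdline)"; then
  echo "FIPS mode requested; fipscheck.target is part of this boot"
fi
```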

### Stage 2: drivers.target

**Purpose:** Load kernel modules and hardware drivers.

**When it runs:** On every boot, after `basic.target` and before `preconfigured.target`.

**Key services:**

- `load-neuron-inf1-modules.service` - Loads AWS Neuron Inf1 kernel modules
- `load-neuron-latest-modules.service` - Loads AWS Neuron Latest kernel modules

**Dependencies:**

- Runs after `basic.target`
- Runs before `preconfigured.target`
- Required by `preconfigured.target` and `multi-user.target`

**Note:** Some driver loading services (e.g., NVIDIA GPU drivers) are required by `preconfigured.target` directly rather than through `drivers.target`.

**Transition:** Completes before `preconfigured.target` begins.
74+
### Stage 3: preconfigured.target
75+
76+
**Purpose:** Initialize the API system and apply all boot-time configuration.
77+
78+
**What "preconfigured" means:** The system has:
79+
80+
- A populated data store with default and user-provided settings
81+
- A running API server
82+
- All configuration files generated from settings
83+
84+
This is the most complex boot stage. Services run in a specific order to build up the system configuration:
85+
86+
#### 3.1 Data Store Setup
87+
88+
**migrator** (`migrator.service`)
89+
90+
- **When:** Runs with `DefaultDependencies=no`, before everything else
91+
- **What:** Updates data store schema if the OS version changed
92+
- **Dependencies:** Required by `apiserver.service`, `storewolf.service`, and `preconfigured.target`
93+
94+
**storewolf** (`storewolf.service`)
95+
96+
- **When:** After `migrator.service`
97+
- **What:** Creates data store directories and populates default settings
98+
- **Details:**
99+
- Reads defaults from variant-specific `defaults.d` directories
100+
- Writes settings to _pending_ state in "bottlerocket-launch" transaction
101+
- Settings not available to other services until committed
102+
- **Dependencies:** Required by `preconfigured.target`
103+
104+
#### 3.2 API Server
105+
106+
**apiserver** (`apiserver.service`)
107+
108+
- **When:** After `storewolf.service`
109+
- **What:** Starts the API server on Unix socket `/run/api.sock`
110+
- **Details:** Allows reading/writing settings via API
111+
- **Dependencies:** Wanted by `preconfigured.target`
112+
113+
#### 3.3 Settings Population
114+
115+
**early-boot-config** (`early-boot-config.service`)
116+
117+
- **When:** After `network-online.target`, `apiserver.service`, `storewolf.service`
118+
- **What:** Applies user data settings (cloud-init equivalent)
119+
- **Details:**
120+
- Only runs on first boot (checks for `/var/lib/bottlerocket/early-boot-config.ran`)
121+
- Fetches user data from platform metadata service (e.g., EC2 IMDS)
122+
- PATCHes settings to API in _pending_ state (not committed)
123+
- **Dependencies:** Required by `preconfigured.target`
124+
125+
**sundog** (`sundog.service`)
126+
127+
- **When:** After `network-online.target`, `apiserver.service`, `early-boot-config.service`
128+
- **What:** Generates dynamic settings that can't be determined until runtime
129+
- **Details:**
130+
- Examples: primary IP address, cluster DNS settings
131+
- Runs `settings-committer` first to access user data settings
132+
- PATCHes generated settings to API in _pending_ state
133+
- **Dependencies:** Required by `preconfigured.target`
134+
- **Subcomponent:** `pluto.service` generates Kubernetes-specific settings
135+
136+
#### 3.4 Configuration Application
137+
138+
**settings-applier** (`settings-applier.service`)
139+
140+
- **When:** After `storewolf.service`, `sundog.service`, `early-boot-config.service`, `apiserver.service`
141+
- **What:** Writes all configuration files based on settings
142+
- **Details:**
143+
- Runs `settings-committer` to commit the "bottlerocket-launch" transaction
144+
- Runs `thar-be-settings --all` to generate all config files
145+
- This is when pending settings become live
146+
- **Dependencies:** Required by `preconfigured.target`
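
The "When" entries in sections 3.1 through 3.4 fix a single start order for these six services. As an illustration only, the sketch below feeds those pairs to `tsort` from coreutils, which prints a topological order. The `preconfigured_order` function name is made up, and `network-online.target` is left out to keep the example focused on the API system units.

```bash
# Each input line is "X Y", meaning Y starts after X
# (taken from the "When" entries in sections 3.1 to 3.4).
preconfigured_order() {
  tsort <<'EOF'
migrator.service storewolf.service
storewolf.service apiserver.service
apiserver.service early-boot-config.service
storewolf.service early-boot-config.service
early-boot-config.service sundog.service
apiserver.service sundog.service
sundog.service settings-applier.service
early-boot-config.service settings-applier.service
apiserver.service settings-applier.service
storewolf.service settings-applier.service
EOF
}

preconfigured_order
```

Because the pairs form one chain, the printed order is unique: migrator, storewolf, apiserver, early-boot-config, sundog, then settings-applier.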

#### 3.5 Stage Transition

**activate-configured** (`activate-configured.service`)

- **When:** After `preconfigured.target` completes
- **What:** Transitions to `configured.target`
- **Details:**
  - Sets systemd default target to `configured.target`
  - Starts `configured.target` asynchronously
- **Dependencies:** Wanted by `preconfigured.target`
159+
### Stage 4: configured.target
160+
161+
**Purpose:** Run bootstrap containers that perform additional system configuration.
162+
163+
**What "configured" means:** The system has:
164+
165+
- Completed all API-based configuration
166+
- Run any user-defined bootstrap containers
167+
- Applied any additional configuration from bootstrap containers
168+
169+
**Key services:**
170+
171+
**bootstrap-containers@** (`[email protected]`)
172+
173+
- **When:** After `host-containerd.service`, before `configured.target`
174+
- **What:** Runs bootstrap containers defined in settings
175+
- **Details:**
176+
- Template unit instantiated for each configured bootstrap container
177+
- Only runs once per container (checks for `/run/bootstrap-containers/%i.ran`)
178+
- Containers have access to host filesystem at `/.bottlerocket/rootfs`
179+
- Boot blocks until all bootstrap containers complete
180+
- Useful for: installing software, modifying files, running setup scripts
181+
- **Dependencies:** Runs before `configured.target`
182+
183+
**activate-multi-user** (`activate-multi-user.service`)
184+
185+
- **When:** After `configured.target` and `reboot-if-required.service`
186+
- **What:** Transitions to `multi-user.target`
187+
- **Details:**
188+
- Sets systemd default target to `multi-user.target`
189+
- Starts `multi-user.target` asynchronously
190+
- **Dependencies:** Wanted by `configured.target`
191+
192+
### Stage 5: multi-user.target
193+
194+
**Purpose:** Start workload services (kubelet, ECS agent, etc.).
195+
196+
**What "multi-user" means:** The system is fully configured and ready to run workloads.
197+
198+
**Key services:**
199+
200+
- `kubelet.service` (Kubernetes variants)
201+
- `ecs.service` (ECS variants)
202+
- `[email protected]` (admin container)
203+
- `[email protected]` (control container)
204+
205+
**Dependencies:**
206+
207+
- Requires `basic.target` and `configured.target`
208+
- This ensures all configuration is complete before workloads start
209+
210+
## Service Dependency Patterns
211+
212+
### Ordering Dependencies
213+
214+
- `After=` - This service starts after the specified units
215+
- `Before=` - This service starts before the specified units
216+
217+
### Requirement Dependencies
218+
219+
- `Requires=` - This service requires the specified units (hard dependency)
220+
- `Wants=` - This service wants the specified units (soft dependency)
221+
- `RequiredBy=` - Reverse of `Requires=` (specified in `[Install]` section)
222+
- `WantedBy=` - Reverse of `Wants=` (specified in `[Install]` section)
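
To make the split between ordering and requirement directives concrete, the sketch below labels the dependency lines of a unit file. The `classify_deps` function and the sample unit are made up for illustration; the unit text does not come from Bottlerocket's sources.

```bash
# classify_deps: read unit file text on stdin and label each dependency directive.
# Splitting on '=' puts the directive name in key and the rest of the line in value.
classify_deps() {
  while IFS='=' read -r key value; do
    case "$key" in
      After|Before)
        printf 'ordering     %s=%s\n' "$key" "$value" ;;
      Requires|Wants|RequiredBy|WantedBy)
        printf 'requirement  %s=%s\n' "$key" "$value" ;;
    esac
  done
}

# Hypothetical unit, loosely modeled on the descriptions in this document.
classify_deps <<'EOF'
[Unit]
Description=Example service
After=storewolf.service
Requires=storewolf.service

[Install]
RequiredBy=preconfigured.target
EOF
```

Ordering and requirement are independent in systemd, which is why a unit that needs another unit to be up first typically declares both `Requires=` (or `Wants=`) and `After=`.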

### Early Boot Services

Services that need to run very early use `DefaultDependencies=no` to avoid the standard dependency chain:

- `migrator.service`
- `prepare-*.service` (filesystem preparation)
- `activate-preconfigured.service`
232+
## Synchronization Mechanisms
233+
234+
### Systemd Targets
235+
236+
Targets serve as synchronization points. A target is "reached" when all services required by or wanted by that target have completed.
237+
238+
Target relationships:
239+
240+
```
241+
drivers.target:
242+
Requires: basic.target
243+
RequiredBy: preconfigured.target, multi-user.target
244+
245+
preconfigured.target:
246+
Requires: basic.target
247+
RequiredBy: configured.target, multi-user.target
248+
249+
configured.target:
250+
Requires: preconfigured.target
251+
RequiredBy: multi-user.target
252+
253+
multi-user.target:
254+
Requires: basic.target, configured.target
255+
```

### Sentinel Files

Services use sentinel files to record that a step has already run. Files under `/var` persist across reboots, while files under `/run` last only until the next reboot:

- `/var/lib/bottlerocket/early-boot-config.ran` - Prevents `early-boot-config.service` from running after first boot
- `/run/bootstrap-containers/<name>.ran` - Prevents bootstrap containers from re-running
- `/etc/.fips-module-check-passed` - Marks FIPS check completion

Services check for these files with `ConditionPathExists=<path>` (run only if the file exists) or `ConditionPathExists=!<path>` (run only if it does not).
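
The same pattern is easy to reproduce in a shell script. The sketch below mimics `ConditionPathExists=!` by skipping a step once its marker file exists; `run_once` and the temporary marker path are illustrative, not Bottlerocket code.

```bash
# run_once MARKER COMMAND...: run COMMAND only if MARKER does not exist,
# then create MARKER on success (like a unit with ConditionPathExists=!MARKER).
run_once() {
  marker=$1
  shift
  if [ -e "$marker" ]; then
    echo "skipped: $marker exists"
    return 0
  fi
  "$@" && : > "$marker"
}

# Example with a throwaway marker; real markers live at the paths listed above.
demo_dir=$(mktemp -d)
run_once "$demo_dir/example.ran" echo "first run"
run_once "$demo_dir/example.ran" echo "second run"
rm -rf "$demo_dir"
```

The first call prints `first run` and creates the marker; the second call only reports that it was skipped.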

### Transaction Commits

The API system uses transactions to ensure atomic updates:

1. Services write settings to _pending_ state during boot
2. Settings are grouped in the "bottlerocket-launch" transaction
3. `settings-committer` commits the transaction, making settings live
4. `settings-applier` then generates configuration files from live settings

This ensures all boot-time settings are applied together, preventing partial configuration.
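
The pending-then-commit flow can be illustrated with a toy model. The sketch below is not how Bottlerocket's data store is implemented: it uses two plain directories and hypothetical helper names and setting keys, only to show that values written before the commit stay invisible to readers until the commit happens. It does not attempt to model atomicity.

```bash
# Toy model: settings are files; a transaction is a directory of pending files.
store=$(mktemp -d)
mkdir -p "$store/pending/bottlerocket-launch" "$store/live"

# set_pending KEY VALUE: stage a setting in the bottlerocket-launch transaction.
set_pending() { printf '%s\n' "$2" > "$store/pending/bottlerocket-launch/$1"; }

# get_live KEY: print the committed value, or fail if it is not live yet.
get_live() { cat "$store/live/$1" 2>/dev/null; }

# commit: move every pending setting into the live set.
commit() {
  for f in "$store"/pending/bottlerocket-launch/*; do
    if [ -e "$f" ]; then
      mv -f "$f" "$store/live/"
    fi
  done
}

set_pending motd "hello"         # e.g. staged by one service during boot
set_pending hostname "node-1"    # e.g. staged by another service
get_live motd || echo "motd is not live yet"
commit
get_live motd
```

Before `commit`, the read fails and the script prints `motd is not live yet`; afterwards `get_live motd` prints `hello`.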

## Boot Flow Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│ sysinit.target                                                  │
│ - Standard systemd initialization                               │
│ - prepare-var.service, prepare-boot.service (early filesystem)  │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ fipscheck.target (FIPS mode only)                               │
│ - check-kernel-integrity.service                                │
│ - check-fips-modules.service                                    │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ drivers.target                                                  │
│ - load-neuron-inf1-modules.service (AWS Neuron Inf1)            │
│ - load-neuron-latest-modules.service (AWS Neuron Latest)        │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ preconfigured.target                                            │
│                                                                 │
│ 1. migrator.service (data store migration)                      │
│ 2. storewolf.service (data store creation)                      │
│ 3. apiserver.service (API server)                               │
│ 4. early-boot-config.service (user data, first boot only)       │
│ 5. sundog.service (dynamic settings)                            │
│ 6. settings-applier.service (commit & apply settings)           │
│                                                                 │
│ Result: API system running, all settings applied                │
└────────────────────────────┬────────────────────────────────────┘
                             │
                activate-configured.service
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ configured.target                                               │
│                                                                 │
│ - bootstrap-containers@*.service (user-defined setup)           │
│                                                                 │
│ Result: Additional configuration complete                       │
└────────────────────────────┬────────────────────────────────────┘
                             │
                activate-multi-user.service
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│ multi-user.target                                               │
│                                                                 │
│ - kubelet.service / ecs.service (workload orchestrator)         │
│ - host-containers@admin.service (admin container)               │
│ - host-containers@control.service (control container)           │
│                                                                 │
│ Result: System ready for workloads                              │
└─────────────────────────────────────────────────────────────────┘
```

## Debugging Boot Issues

### Check Target Status

```bash
# Check if a target has been reached
systemctl is-active drivers.target
systemctl is-active preconfigured.target
systemctl is-active configured.target
systemctl is-active multi-user.target

# See what a target is waiting on (its dependencies)
systemctl list-dependencies drivers.target
systemctl list-dependencies preconfigured.target

# See which units depend on a target
systemctl list-dependencies --reverse preconfigured.target
```
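
When several targets are inactive, the earliest one in the boot order is usually the place to start looking. The function below is a sketch that wraps the `is-active` checks above; `first_inactive_stage` is a hypothetical helper, and the `SYSTEMCTL` variable exists only so the function can be exercised with a stand-in command.

```bash
# first_inactive_stage: print the first boot stage that is not active.
# Returns 0 if one was found, 1 if every listed stage is active.
# Covers the same four targets as the is-active checks above.
first_inactive_stage() (
  for target in drivers.target preconfigured.target configured.target multi-user.target; do
    if ! "${SYSTEMCTL:-systemctl}" is-active --quiet "$target"; then
      printf '%s\n' "$target"
      exit 0
    fi
  done
  exit 1
)

# Usage on a Bottlerocket host:
#   first_inactive_stage && echo "boot has not passed the target above"
```

The printed target can then be passed to `systemctl list-dependencies` to see which of its units have not started.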

### Check Service Status

```bash
# See all failed services
systemctl --failed

# Check specific service
systemctl status migrator.service
systemctl status apiserver.service

# View service logs
journalctl -u migrator.service
journalctl -u apiserver.service
```

### Boot Timeline

To see the boot timeline:

```bash
# Total time spent in each boot phase
systemd-analyze
# Units sorted by how long they took to initialize
systemd-analyze blame
# Chain of units that the default target waited on
systemd-analyze critical-chain
```

## Related Documentation

- [API System](../sources/api/README.md) - Detailed API component documentation
- [Bootstrap Containers](../sources/api/bootstrap-containers/README.md) - Bootstrap container usage
- [Early Boot Config](../sources/early-boot-config/README.md) - User data configuration
- [Settings System](../sources/api/thar-be-settings/README.md) - Configuration file generation
