docs: add boot process documentation #754
base: develop
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,382 @@ | ||
| # Bottlerocket Boot Process | ||
|
|
||
| This document describes how Bottlerocket boots, focusing on the systemd target progression and service dependencies. | ||
|
|
||
| **Keywords:** boot, systemd, targets, preconfigured, configured, multi-user, fipscheck, drivers, sysinit, services, dependencies, ordering, API system, bootstrap containers, settings, startup, initialization, kernel modules | ||
|
|
||
| ## Overview | ||
|
|
||
| Bottlerocket's boot sequence progresses through six main stages, each represented by a systemd target: | ||
|
|
||
| ``` | ||
| sysinit.target | ||
| ↓ | ||
| fipscheck.target (FIPS mode only) | ||
|
Comment on lines +12 to +14
Contributor
This isn't the correct order, but the reason why is subtle. There's a Since all the units that are needed by that target are |
||
| ↓ | ||
| drivers.target (kernel module loading) | ||
| ↓ | ||
| preconfigured.target (API system initialization) | ||
| ↓ | ||
| configured.target (bootstrap containers) | ||
| ↓ | ||
| multi-user.target (workload services) | ||
| ``` | ||
|
|
||
| Each stage must complete before the next begins. Services use systemd dependencies (`After=`, `Requires=`, `Wants=`) to coordinate within and between stages. | ||
|
|
||
| ## Boot Stages | ||
|
|
||
| ### Stage 0: sysinit.target | ||
|
|
||
| Standard systemd initialization. Most services implicitly depend on this through `DefaultDependencies=yes` (the default). | ||
|
|
||
| Early services that need to run before normal dependency chains use `DefaultDependencies=no`: | ||
|
|
||
| - Filesystem preparation (`prepare-var.service`, `prepare-boot.service`, etc.) | ||
| - Data store migration (`migrator.service`) | ||
|
|
||
| ### Stage 1: fipscheck.target | ||
|
|
||
| **Purpose:** Verify cryptographic module integrity when FIPS mode is enabled. | ||
|
|
||
| **When it runs:** Only when the kernel command line includes `fips=1`. | ||
|
Contributor
Not correct because of the bootconfig override, which is also where |
||
|
|
||
| **Key services:** | ||
|
|
||
| - `check-kernel-integrity.service` - Verifies kernel integrity | ||
| - `check-fips-modules.service` - Loads and tests the `tcrypt` module | ||
| - Creates `/etc/.fips-module-check-passed` sentinel file on success | ||
| - Blocks boot if FIPS checks fail | ||
|
|
||
| **Transition:** Completes before `drivers.target` begins. | ||
|
Contributor
In terms of coordinated state transitions, we have these:
These are our "runlevels" or discrete stages. Targets don't work like runlevels, they just activate in response to something else causing them to be enqueued. |
||
|
|
||
| ### Stage 2: drivers.target | ||
|
|
||
| **Purpose:** Load kernel modules and hardware drivers. | ||
|
|
||
| **When it runs:** Always runs, after `basic.target` and before `preconfigured.target`. | ||
|
Comment on lines +53 to +57
Contributor
It might be better to write this in terms of what "pulls" on various targets. Many of them are pulled in parallel (sysinit.target, drivers.target, network-online.target) by the units in preconfigured. |
||
|
|
||
| **Key services:** | ||
|
|
||
| - `load-neuron-inf1-modules.service` - Loads AWS Neuron Inf1 kernel modules | ||
| - `load-neuron-latest-modules.service` - Loads AWS Neuron Latest kernel modules | ||
|
|
||
| **Dependencies:** | ||
|
|
||
| - Runs after `basic.target` | ||
| - Runs before `preconfigured.target` | ||
| - Required by `preconfigured.target` and `multi-user.target` | ||
|
|
||
| **Note:** Some driver loading services (e.g., NVIDIA GPU drivers) are required by `preconfigured.target` directly rather than using `drivers.target`. | ||
|
|
||
| **Transition:** Completes before `preconfigured.target` begins. | ||
|
|
||
| ### Stage 3: preconfigured.target | ||
|
|
||
| **Purpose:** Initialize the API system and apply all boot-time configuration. | ||
|
|
||
| **What "preconfigured" means:** The system has: | ||
|
|
||
| - A populated data store with default and user-provided settings | ||
| - A running API server | ||
| - All configuration files generated from settings | ||
|
|
||
| This is the most complex boot stage. Services run in a specific order to build up the system configuration: | ||
|
|
||
| #### 3.1 Data Store Setup | ||
|
|
||
| **migrator** (`migrator.service`) | ||
|
|
||
| - **When:** Runs with `DefaultDependencies=no`, before everything else | ||
| - **What:** Updates data store schema if the OS version changed | ||
| - **Dependencies:** Required by `apiserver.service`, `storewolf.service`, and `preconfigured.target` | ||
|
|
||
| **storewolf** (`storewolf.service`) | ||
|
|
||
| - **When:** After `migrator.service` | ||
| - **What:** Creates data store directories and populates default settings | ||
| - **Details:** | ||
| - Reads defaults from variant-specific `defaults.d` directories | ||
| - Writes settings to _pending_ state in the "bottlerocket-launch" transaction | ||
| - Settings not available to other services until committed | ||
| - **Dependencies:** Required by `preconfigured.target` | ||
|
|
||
| #### 3.2 API Server | ||
|
|
||
| **apiserver** (`apiserver.service`) | ||
|
|
||
| - **When:** After `storewolf.service` | ||
| - **What:** Starts the API server on the Unix socket `/run/api.sock` | ||
| - **Details:** Allows reading and writing settings via the API | ||
| - **Dependencies:** Wanted by `preconfigured.target` | ||
|
|
||
| #### 3.3 Settings Population | ||
|
|
||
| **early-boot-config** (`early-boot-config.service`) | ||
|
|
||
| - **When:** After `network-online.target`, `apiserver.service`, `storewolf.service` | ||
| - **What:** Applies user data settings (cloud-init equivalent) | ||
| - **Details:** | ||
| - Only runs on first boot (checks for `/var/lib/bottlerocket/early-boot-config.ran`) | ||
| - Fetches user data from platform metadata service (e.g., EC2 IMDS) | ||
| - PATCHes settings to the API in _pending_ state (not committed) | ||
| - **Dependencies:** Required by `preconfigured.target` | ||
|
|
||
| **sundog** (`sundog.service`) | ||
|
|
||
| - **When:** After `network-online.target`, `apiserver.service`, `early-boot-config.service` | ||
| - **What:** Generates dynamic settings that can't be determined until runtime | ||
| - **Details:** | ||
| - Examples: primary IP address, cluster DNS settings | ||
| - Runs `settings-committer` first to access user data settings | ||
| - PATCHes generated settings to the API in _pending_ state | ||
| - **Dependencies:** Required by `preconfigured.target` | ||
| - **Subcomponent:** `pluto.service` generates Kubernetes-specific settings | ||
|
Contributor
I wouldn't really call |
||
|
|
||
| #### 3.4 Configuration Application | ||
|
|
||
| **settings-applier** (`settings-applier.service`) | ||
|
|
||
| - **When:** After `storewolf.service`, `sundog.service`, `early-boot-config.service`, `apiserver.service` | ||
| - **What:** Writes all configuration files based on settings | ||
| - **Details:** | ||
| - Runs `settings-committer` to commit the "bottlerocket-launch" transaction | ||
| - Runs `thar-be-settings --all` to generate all config files | ||
| - This is when pending settings become live | ||
| - **Dependencies:** Required by `preconfigured.target` | ||
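The `After=` relationships listed for the services in this stage can be sanity-checked outside of systemd. The sketch below is illustrative only: it feeds the orderings described above to the POSIX `tsort` utility to derive a start order. The edge list is transcribed by hand from the text (it is not read from the real unit files), `network-online.target` is left out for brevity, and the function name `preconfigured_order` is hypothetical.

```bash
# Each line is "A B", meaning B is ordered After=A (A starts first).
# Edges are transcribed from the prose above, not from real unit files.
preconfigured_order() {
    tsort <<'EOF'
migrator.service storewolf.service
storewolf.service apiserver.service
apiserver.service early-boot-config.service
storewolf.service early-boot-config.service
apiserver.service sundog.service
early-boot-config.service sundog.service
storewolf.service settings-applier.service
sundog.service settings-applier.service
early-boot-config.service settings-applier.service
apiserver.service settings-applier.service
EOF
}

preconfigured_order
```

Because these edges contain a single chain through all six services, only one order is possible: `migrator`, `storewolf`, `apiserver`, `early-boot-config`, `sundog`, `settings-applier`.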
|
|
||
| #### 3.5 Stage Transition | ||
|
|
||
| **activate-configured** (`activate-configured.service`) | ||
|
|
||
| - **When:** After `preconfigured.target` completes | ||
| - **What:** Transitions to `configured.target` | ||
| - **Details:** | ||
| - Sets systemd default target to `configured.target` | ||
| - Starts `configured.target` asynchronously | ||
| - **Dependencies:** Wanted by `preconfigured.target` | ||
|
|
||
| ### Stage 4: configured.target | ||
|
|
||
| **Purpose:** Run bootstrap containers that perform additional system configuration. | ||
|
|
||
| **What "configured" means:** The system has: | ||
|
|
||
| - Completed all API-based configuration | ||
| - Run any user-defined bootstrap containers | ||
| - Applied any additional configuration from bootstrap containers | ||
|
|
||
| **Key services:** | ||
|
|
||
| **bootstrap-containers@** (`bootstrap-containers@.service`) | ||
|
|
||
| - **When:** After `host-containerd.service`, before `configured.target` | ||
| - **What:** Runs bootstrap containers defined in settings | ||
| - **Details:** | ||
| - Template unit instantiated for each configured bootstrap container | ||
| - Only runs once per container (checks for `/run/bootstrap-containers/%i.ran`) | ||
| - Containers have access to host filesystem at `/.bottlerocket/rootfs` | ||
| - Boot blocks until all bootstrap containers complete | ||
| - Useful for: installing software, modifying files, running setup scripts | ||
| - **Dependencies:** Runs before `configured.target` | ||
|
|
||
| **activate-multi-user** (`activate-multi-user.service`) | ||
|
|
||
| - **When:** After `configured.target` and `reboot-if-required.service` | ||
| - **What:** Transitions to `multi-user.target` | ||
| - **Details:** | ||
| - Sets systemd default target to `multi-user.target` | ||
| - Starts `multi-user.target` asynchronously | ||
| - **Dependencies:** Wanted by `configured.target` | ||
|
|
||
| ### Stage 5: multi-user.target | ||
|
|
||
| **Purpose:** Start workload services (kubelet, ECS agent, etc.). | ||
|
|
||
| **What "multi-user" means:** The system is fully configured and ready to run workloads. | ||
|
|
||
| **Key services:** | ||
|
|
||
| - `kubelet.service` (Kubernetes variants) | ||
| - `ecs.service` (ECS variants) | ||
| - `host-containers@admin.service` (admin container) | ||
| - `host-containers@control.service` (control container) | ||
|
|
||
| **Dependencies:** | ||
|
|
||
| - Requires `basic.target` and `configured.target` | ||
| - This ensures all configuration is complete before workloads start | ||
|
Comment on lines +205 to +208
Contributor
Kind of, but we never actually enqueue |
||
|
|
||
| ## Service Dependency Patterns | ||
|
|
||
| ### Ordering Dependencies | ||
|
|
||
| - `After=` - This service starts after the specified units | ||
| - `Before=` - This service starts before the specified units | ||
|
|
||
| ### Requirement Dependencies | ||
|
|
||
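| - `Requires=` - Activating this service also activates the specified units; if one of them fails to start and this service is ordered `After=` it, this service is not started | ||
| - `Wants=` - Activating this service also activates the specified units, but their failure does not prevent this service from starting | ||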
| - `RequiredBy=` - Reverse of `Requires=` (specified in `[Install]` section) | ||
| - `WantedBy=` - Reverse of `Wants=` (specified in `[Install]` section) | ||
|
Comment on lines +210 to +222
Contributor
I find the hard / soft dependency language insufficiently precise. The "systemd as job queue" formulation would say:
|
||
|
|
||
| ### Early Boot Services | ||
|
|
||
| Services that need to run very early use `DefaultDependencies=no` to avoid the standard dependency chain: | ||
|
|
||
| - `migrator.service` | ||
| - `prepare-*.service` (filesystem preparation) | ||
| - `activate-preconfigured.service` | ||
|
Comment on lines +224 to +230
Contributor
While this is true, I don't think it's helpful - most of these services are special and have unique reasons for |
||
|
|
||
| ## Synchronization Mechanisms | ||
|
|
||
| ### Systemd Targets | ||
|
|
||
| Targets serve as synchronization points. A target is "reached" when all services required by or wanted by that target have completed. | ||
|
|
||
| Target relationships: | ||
|
|
||
| ``` | ||
| drivers.target: | ||
|
Contributor
In general I don't really see targets as a synchronization point. They are more of an abstraction over a bunch of units - "I need everything to bring the network online or to make the TPM2 device to also go into the queue when you put my job in." They let you synchronize what you enqueue but not when, exactly - "when" is just "the same instant you enqueue some other job". |
||
| Requires: basic.target | ||
| RequiredBy: preconfigured.target, multi-user.target | ||
|
|
||
| preconfigured.target: | ||
| Requires: basic.target | ||
| RequiredBy: configured.target, multi-user.target | ||
|
|
||
| configured.target: | ||
| Requires: preconfigured.target | ||
| RequiredBy: multi-user.target | ||
|
|
||
| multi-user.target: | ||
| Requires: basic.target, configured.target | ||
| ``` | ||
|
|
||
| ### Sentinel Files | ||
|
|
||
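| Services use sentinel files to record that a step has already run. Only files on persistent storage (such as `/var`) survive a reboot; `/run` is a tmpfs, so sentinels there last only for the current boot: | ||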
|
|
||
| - `/var/lib/bottlerocket/early-boot-config.ran` - Prevents `early-boot-config.service` from running after first boot | ||
| - `/run/bootstrap-containers/<name>.ran` - Prevents bootstrap containers from re-running | ||
| - `/etc/.fips-module-check-passed` - Marks FIPS check completion | ||
|
|
||
| Services use `ConditionPathExists=` or `ConditionPathExists=!` to check for these files. | ||
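The run-once pattern can be illustrated in plain shell. This is a sketch of the idea behind `ConditionPathExists=!`, not how systemd or Bottlerocket implements it; the `run_once` helper and the temporary directory are hypothetical stand-ins for a unit and its sentinel path.

```bash
# run_once SENTINEL CMD...: run CMD only if SENTINEL does not exist yet,
# then create SENTINEL on success (similar to ConditionPathExists=!SENTINEL
# combined with a step that creates the file once the work is done).
run_once() {
    sentinel=$1
    shift
    if [ -e "$sentinel" ]; then
        echo "skipped"
        return 0
    fi
    "$@" && : > "$sentinel"
}

# Demo with a throwaway directory instead of /var/lib/bottlerocket.
demo_dir=$(mktemp -d)
first=$(run_once "$demo_dir/example.ran" echo "ran")
second=$(run_once "$demo_dir/example.ran" echo "ran")
echo "$first then $second"
rm -rf "$demo_dir"
```

The first call runs the command and leaves the sentinel behind; the second call sees the sentinel and does nothing.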
|
Comment on lines +257 to +265
Contributor
True, though I dislike this pattern and it's more of a last resort ideally. If we had a |
||
|
|
||
| ### Transaction Commits | ||
|
|
||
| The API system uses transactions to ensure atomic updates: | ||
|
|
||
| 1. Services write settings to _pending_ state during boot | ||
| 2. Settings are grouped in the "bottlerocket-launch" transaction | ||
| 3. `settings-committer` commits the transaction, making settings live | ||
| 4. `settings-applier` then generates configuration files from live settings | ||
|
|
||
| This ensures all boot-time settings are applied together, preventing partial configuration. | ||
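To make the pending/live distinction concrete, here is a toy model in shell. It is not Bottlerocket's data store or API: the directory layout and the `set_pending`, `commit`, and `get_live` helpers are invented for illustration, with one directory standing in for the pending transaction and another for live settings.

```bash
store=$(mktemp -d)
mkdir -p "$store/pending/bottlerocket-launch" "$store/live"

# Stage a setting in the pending transaction; readers of live do not see it.
set_pending() { printf '%s' "$2" > "$store/pending/bottlerocket-launch/$1"; }

# Move every staged setting to live. A real store would need to make this
# atomic; this toy version simply moves the files one by one.
commit() {
    for f in "$store/pending/bottlerocket-launch"/*; do
        if [ -e "$f" ]; then
            mv "$f" "$store/live/"
        fi
    done
}

# Print a live setting, or "unset" if it has not been committed.
get_live() { cat "$store/live/$1" 2>/dev/null || echo "unset"; }

set_pending hostname "node-1"
before=$(get_live hostname)
commit
after=$(get_live hostname)
echo "before commit: $before, after commit: $after"
rm -rf "$store"
```

The staged value is invisible to `get_live` until `commit` runs, which is the property the boot-time transaction relies on.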
|
|
||
| ## Boot Flow Diagram | ||
|
|
||
| ``` | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ sysinit.target │ | ||
| │ - Standard systemd initialization │ | ||
| │ - prepare-var.service, prepare-boot.service (early filesystem) │ | ||
| └────────────────────────────┬────────────────────────────────────┘ | ||
| ↓ | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ fipscheck.target (FIPS mode only) │ | ||
| │ - check-kernel-integrity.service │ | ||
| │ - check-fips-modules.service │ | ||
| └────────────────────────────┬────────────────────────────────────┘ | ||
| ↓ | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ drivers.target │ | ||
| │ - load-neuron-inf1-modules.service (AWS Neuron Inf1) │ | ||
| │ - load-neuron-latest-modules.service (AWS Neuron Latest) │ | ||
| └────────────────────────────┬────────────────────────────────────┘ | ||
| ↓ | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ preconfigured.target │ | ||
| │ │ | ||
| │ 1. migrator.service (data store migration) │ | ||
| │ 2. storewolf.service (data store creation) │ | ||
| │ 3. apiserver.service (API server) │ | ||
| │ 4. early-boot-config.service (user data, first boot only) │ | ||
| │ 5. sundog.service (dynamic settings) │ | ||
| │ 6. settings-applier.service (commit & apply settings) │ | ||
| │ │ | ||
| │ Result: API system running, all settings applied │ | ||
| └────────────────────────────┬────────────────────────────────────┘ | ||
| ↓ | ||
| activate-configured.service | ||
| ↓ | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ configured.target │ | ||
| │ │ | ||
| │ - bootstrap-containers@*.service (user-defined setup) │ | ||
| │ │ | ||
| │ Result: Additional configuration complete │ | ||
| └────────────────────────────┬────────────────────────────────────┘ | ||
| ↓ | ||
| activate-multi-user.service | ||
| ↓ | ||
| ┌─────────────────────────────────────────────────────────────────┐ | ||
| │ multi-user.target │ | ||
| │ │ | ||
| │ - kubelet.service / ecs.service (workload orchestrator) │ | ||
| │ - host-containers@admin.service (admin container) │ | ||
| │ - host-containers@control.service (control container) │ | ||
| │ │ | ||
| │ Result: System ready for workloads │ | ||
| └─────────────────────────────────────────────────────────────────┘ | ||
| ``` | ||
|
|
||
| ## Debugging Boot Issues | ||
|
|
||
| ### Check Target Status | ||
|
|
||
| ```bash | ||
| # Check if a target has been reached | ||
| systemctl is-active drivers.target | ||
| systemctl is-active preconfigured.target | ||
| systemctl is-active configured.target | ||
| systemctl is-active multi-user.target | ||
|
|
||
| # List what a target pulls in, and (with --reverse) what pulls in the target | ||
| systemctl list-dependencies drivers.target | ||
| systemctl list-dependencies preconfigured.target | ||
| systemctl list-dependencies --reverse preconfigured.target | ||
| ``` | ||
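The per-target checks can be folded into a loop. A small sketch, assuming the four target names used in this document; the `report_targets` function and the `SYSTEMCTL` override variable are hypothetical conveniences (the override exists only so the function can be exercised on a machine without systemd).

```bash
# Print "<target>: <state>" for each boot target discussed above.
# SYSTEMCTL can be pointed at another command for testing.
report_targets() {
    for t in drivers preconfigured configured multi-user; do
        # is-active prints the state and exits non-zero when not active,
        # so ignore the exit status and keep the printed state.
        state=$("${SYSTEMCTL:-systemctl}" is-active "$t.target" 2>/dev/null) || true
        echo "$t.target: ${state:-unknown}"
    done
}
```

Running `report_targets` on a host prints one line per target, which makes it easier to spot the first target that has not become active.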
|
|
||
| ### Check Service Status | ||
|
|
||
| ```bash | ||
| # See all failed services | ||
| systemctl --failed | ||
|
|
||
| # Check specific service | ||
| systemctl status migrator.service | ||
| systemctl status apiserver.service | ||
|
|
||
| # View service logs | ||
| journalctl -u migrator.service | ||
| journalctl -u apiserver.service | ||
| ``` | ||
|
|
||
| ### Boot Timeline | ||
|
|
||
| To see the boot timeline: | ||
|
|
||
| ```bash | ||
| systemd-analyze | ||
| systemd-analyze blame | ||
| systemd-analyze critical-chain | ||
| ``` | ||
|
|
||
| ## Related Documentation | ||
|
|
||
| - [API System](../sources/api/README.md) - Detailed API component documentation | ||
| - [Bootstrap Containers](../sources/api/bootstrap-containers/README.md) - Bootstrap container usage | ||
| - [Early Boot Config](../sources/early-boot-config/README.md) - User data configuration | ||
| - [Settings System](../sources/api/thar-be-settings/README.md) - Configuration file generation | ||
I would probably add `local-fs.target` as one of the most important prerequisites to `sysinit.target`. Quite a lot of Bottlerocket-specific work happens for local storage setup 😀 while the `sysinit` phase is rather vanilla.