@@ -124,6 +124,77 @@ is also possible to permanently configure the default partition from
124 124 Infix using the [Bootloader Configuration](#configuration).
125 125
126 126
127+ ## System Boot
128+
129+ After the system firmware (BIOS) and [boot loader](boot.md) start
130+ Linux, the following happens. The various failure modes, e.g., a missing
131+ password in VPD, are detailed later in this section.
132+
133+ ![System boot flowchart](img/fail-secure.svg)
134+
135+ 1. Before mounting the `/cfg` and `/var` partitions, which host
136+ read-writable data like `startup-config` and container images, the
137+ system first checks if the user has requested a factory reset. If so,
138+ it wipes the contents of these partitions
139+ 2. Linux boots with a device tree which is used for detecting the generic
140+ make and model of the device, e.g., number of interfaces. It may
141+ also reference an EEPROM with [Vital Product Data](vpd.md). That is
142+ where the base MAC address and per-device password hash are stored.
143+ (Generic builds all use the same MAC address and password)
144+ 3. On every boot the system's `factory-config` and `failure-config` are
145+ generated from the YANG[^2] models of the current firmware version.
146+ This ensures that a factory-reset device can always boot, and that
147+ there is a working fail-safe, or rather *fail secure*, mode
148+ 4. On first power-on, and after a factory reset, the system does not
149+ have a `startup-config`, in which case `factory-config` is copied
150+ to `startup-config` -- if a per-product specific version exists it
151+ is preferred over the generated one
152+ 5. Provided the integrity of the `startup-config` is OK, a system
153+ service loads and activates the configuration
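
Steps 1 and 4 above can be sketched roughly as follows. This is only an illustrative model, not the actual implementation: the `CfgPartition` class, the `prepare_startup_config` function, and the use of plain strings for configuration contents are all hypothetical names and simplifications introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CfgPartition:
    """Hypothetical in-memory stand-in for the /cfg partition."""
    startup_config: Optional[str] = None
    factory_reset_requested: bool = False


def prepare_startup_config(cfg: CfgPartition,
                           generated_factory: str,
                           product_factory: Optional[str] = None) -> str:
    """Return the startup-config that the system would try to load."""
    # Step 1: a requested factory reset wipes the partition contents.
    if cfg.factory_reset_requested:
        cfg.startup_config = None
        cfg.factory_reset_requested = False

    # Step 4: without a startup-config, copy factory-config into place.
    # A per-product version is preferred over the generated one.
    if cfg.startup_config is None:
        if product_factory is not None:
            cfg.startup_config = product_factory
        else:
            cfg.startup_config = generated_factory

    return cfg.startup_config
```

For example, calling `prepare_startup_config(CfgPartition(), "generated")` on a fresh device yields the generated `factory-config`, while an existing `startup-config` is returned untouched unless a factory reset was requested.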
154+
155+ ### Failure Modes
156+
157+ So, what happens if any of the steps above fail?
158+
159+ **VPD Fail**
160+
161+ The per-device password cannot be read, or is corrupt, so the system
162+ `factory-config` and `failure-config` are not generated:
163+
164+ 1. First boot, or after factory reset: `startup-config` cannot be
165+ created or loaded, and `failure-config` cannot be loaded. The
166+ system ends up in an unrecoverable state, i.e., **RMA[^3] Mode**
167+ 2. The system has booted (at least) once with correct VPD and password
168+ and already has a `startup-config`. Provided the `startup-config`
169+ is OK (see below), it is loaded and the system boots successfully
170+
171+ In both cases, an external factory reset mode or button will not help,
172+ and in the second case it will cause the device to fail on the next boot.
173+
174+ > The second case does not yet have any warning or event that can be
175+ > detected from the outside. This is planned for a later release.
176+
177+ **Broken startup-config**
178+
179+ If loading `startup-config` fails for some reason, e.g., invalid JSON
180+ syntax, failed validation against the system's YANG model, or a bug in
181+ the system's `confd` service, the *Fail Secure Mode* is triggered and
182+ `failure-config` is loaded (unless there is a VPD failure, see above).
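
The two failure modes can be summarized as a small decision function. This is a sketch under stated assumptions, not the system's actual logic: `BootOutcome` and `boot_outcome` are hypothetical names, and the combination of a VPD failure with a broken `startup-config` is not spelled out in the text, so treating it as unrecoverable is an assumption of this example.

```python
from enum import Enum


class BootOutcome(Enum):
    NORMAL = "startup-config loaded"
    FAIL_SECURE = "failure-config loaded"
    RMA = "unrecoverable"


def boot_outcome(vpd_ok: bool,
                 has_startup_config: bool,
                 startup_config_loads: bool) -> BootOutcome:
    """Model which configuration ends up active after boot.

    startup_config_loads says whether loading succeeds once a
    startup-config is in place (valid JSON, passes YANG validation, ...).
    """
    if not vpd_ok:
        # factory-config and failure-config are not generated, so only
        # an already existing, intact startup-config can save the boot.
        if has_startup_config and startup_config_loads:
            return BootOutcome.NORMAL
        # Assumption: with no failure-config to fall back on, a broken
        # startup-config is treated like the first-boot case.
        return BootOutcome.RMA

    # VPD is fine: a missing startup-config is created from
    # factory-config, so only the load result matters from here on.
    if startup_config_loads:
        return BootOutcome.NORMAL
    return BootOutcome.FAIL_SECURE
```

Modeling the outcome as a pure function of three booleans keeps every combination easy to enumerate and compare against the flowchart.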
183+
184+ > [!TIP]
185+ > Please see the [Branding & Releases](branding.md) document for how to
186+ > provide per-product `failure-config`, or `factory-config` to suit your
187+ > product's preferences.
188+
189+ *Fail Secure Mode* is a fail-safe mode provided for debugging the
190+ system. The default[^4] creates a setup of isolated interfaces with
191+ communication only to the management CPU, SSH and console login using
192+ the device's factory reset password, IP connectivity only using IPv6
193+ link-local, and device discovery protocols: LLDP, mDNS-SD. The login
194+ and shell prompt are set to `failure-c0-ff-ee`, where `c0-ff-ee` stands
195+ for the last three octets of the device's base MAC address.
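
As an illustration of how such a name relates to the base MAC address, here is a minimal sketch. The function name `failure_hostname`, the accepted separators, and the example address are assumptions made for this example; the system's own derivation may differ.

```python
import string


def failure_hostname(base_mac: str) -> str:
    """Build 'failure-xx-yy-zz' from the last three octets of a MAC."""
    # Accept both ':' and '-' as separators and normalize to lowercase.
    octets = base_mac.strip().lower().replace("-", ":").split(":")

    # A MAC address has exactly six octets of two hex digits each.
    valid = len(octets) == 6 and all(
        len(octet) == 2 and all(c in string.hexdigits for c in octet)
        for octet in octets
    )
    if not valid:
        raise ValueError(f"not a MAC address: {base_mac!r}")

    return "failure-" + "-".join(octets[-3:])


# Hypothetical base MAC address, chosen to match the example prompt.
print(failure_hostname("00:A0:85:C0:FF:EE"))  # prints "failure-c0-ff-ee"
```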
196+
197+
127 198 System Upgrade
128 199 --------------
129 200
@@ -337,6 +408,18 @@ If `var` is not available, Infix will still persist `/var/lib` using
337 408 `cfg` as the backing storage.
338 409
339 410 [^1]: See [Upgrade & Boot Order](upgrade.md) for more information.
411+ [^2]: YANG is a modeling language from the IETF, replacing the one
412+ used for SNMP (MIB). It describes the subsystems and properties of
413+ the system.
414+ [^3]: Return Merchandise Authorization (RMA), i.e., broken beyond repair
415+ by the end-user and eligible for return to the manufacturer.
416+ [^4]: Customer specific builds can define their own `failure-config`.
417+ It may be the same as `factory-config`, with the hostname set to
418+ `failure`, or a dedicated configuration that isolates interfaces, or
419+ even disables ports, to ensure that the device does not cause any
420+ security problems on the network, e.g., by starting to forward
421+ traffic between previously isolated VLANs.
422+
340 423
341 424 [2]: netboot.md
342 425 [FIT]: https://u-boot.readthedocs.io/en/latest/usage/fit.html