Commit 3505357

Use new -c option on faux-mgs component-active-slot command
Also: Update scripts test plan, scripts todo list, and remove Hubris issue #2093 workaround.
1 parent 9c9225b commit 3505357

5 files changed: 92 additions, 360 deletions

scripts/README.md

Lines changed: 3 additions & 1 deletion
@@ -57,6 +57,8 @@ and personal workflows.
 `faux-mgs` Rust integration:
 * `faux_mgs(["arg0", .., "argN"]) -> map`: Runs any `faux-mgs` command
 internally (using `--json=pretty`) and returns the result as a Rhai map.
+This map will contain either an `Ok` or `Err` field, reflecting the
+command's success or failure.
 *Do not call this directly in test scripts; use wrappers from `util.rhai`.*
 * `new_archive(path) -> ArchiveInspector`: Loads a Hubris archive (.zip).
 * `ArchiveInspector[<zip_path>]`: Access files within the archive (returns
@@ -90,7 +92,7 @@ import `${script_dir}/util` as util;
 - `rot_boot_info()`: Gets formatted RoT Boot Info.
 - `check_update_in_progress(component)`: Checks SP/RoT update status.
 - `update_rot_image_file(slot, path, label)`: Updates RoT image.
-- `set_rot_boot_preference(slot, use_transient, label)`: Sets RoT pref.
+- `rot_boot_preference(slot, action, use_transient, label)`: Sets or clears RoT preference. When clearing (`action = util::PREF_CLEAR`), uses the `component-active-slot -c` command if supported by the firmware, with automatic fallback to reset workaround for version compatibility.
 - `reset_rot_and_get_rbi(desc, label)`: Resets RoT and gets RBI.
 - `update_sp_image(path)`: Updates SP image.
 - `reset_sp()`: Resets SP.

scripts/TEST_PLAN.md

Lines changed: 40 additions & 22 deletions
@@ -1,28 +1,28 @@
 # Validation Test Plan
 
 This document outlines a full set of tests to validate the `upgrade-rollback.rhai` test harness. It is split into two parts:
-1. **Part 1**: Tests the current, real-world scenario where the new `baseline` firmware has features that the older `under-test` firmware lacks.
+1. **Part 1**: Tests the current, real-world scenario where the new `under-test` firmware has features that the older `baseline` firmware lacks.
 2. **Part 2**: Describes tests for a future state where both `baseline` and `under-test` firmware are fully compliant with the transient boot preference feature.
 
 ## Prerequisites and Setup
 
 ### 1. Environment Variables
 
-Before running these tests, for convenience, set two environment variables to point to your local Hubris build repositories:
+Before running these tests, for convenience, set environment variables to point to your local Hubris build repositories. For example:
 ```bash
 export REPO_BL=/path/to/your/baseline/hubris
 export REPO_UT=/path/to/your/under-test/hubris
+export UT_WORKTREE=${REPO_UT}
 ```
 
 These repositories must have SP and RoT Hubris build products in their
 respective `target/` directories.
 
-Examine and edit the `scripts/targets.json` file or make your own if you need
-to use images from other locations.
+**Note for other users**: The `scripts/targets.json` file uses these environment variables (e.g., `UT_WORKTREE`) to locate firmware images. If simple environment variable overrides are not convenient, then you will want your own configuration file like `scripts/targets.json` that reflects your local test environment.
 
 ### 2. The `FMR` Wrapper Script
 
-The test commands use a helper script named `FMR`, which is a wrapper around the main `cargo run --bin faux-mgs` command. Its purpose is to simplify running tests by automatically including common arguments.
+The test commands use a helper script named `FMR` (faux-mgs with Rhai scripting), which is a wrapper around the main `cargo run --bin faux-mgs` command. Its purpose is to simplify running tests by automatically including common arguments.
 
 * **Functionality**: The script automatically adds required arguments like `--features=rhaiscript`, `--json=pretty`, timeouts, and attempts to discover the correct network `--interface` setting.
 * **Log Levels**: The name used to call the script sets the log level for the test run. For example, `FMR-info` sets `--log-level=info`, while `FMR-trace` sets `--log-level=trace`.
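
The name-to-log-level convention described for `FMR` can be sketched in shell. This is only an illustration and not the actual `FMR` script: the helper names `log_level_from_name` and `log_level_arg`, and the `info` default for a name without a suffix, are assumptions.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the real FMR wrapper.
# log_level_from_name and log_level_arg are hypothetical helper names.

# Print the log level implied by the name the wrapper was invoked as,
# e.g. "FMR-info" -> "info". Falls back to "info" (an assumed default)
# when the name has no "-<level>" suffix.
log_level_from_name() {
    local name
    name=$(basename "$1")
    case "$name" in
        *-*) printf '%s\n' "${name##*-}" ;;
        *)   printf '%s\n' "info" ;;
    esac
}

# Build the corresponding --log-level argument.
log_level_arg() {
    printf -- '--log-level=%s\n' "$(log_level_from_name "$1")"
}
```

A wrapper built this way could call `log_level_arg "$0"` and add the result to the arguments it passes along, so that a symlink named `FMR-trace` selects trace logging without any extra flag.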
@@ -37,8 +37,35 @@ Each numbered test case should be run from a known-clean state. Before starting
 ```bash
 FMR-info reset-component rot
 FMR-info reset-component rot
+
+This ensures that any version of RoT firmware being used has no pending Hubris
+image preferences in effect.
+
+### 3. Copy and customize `scripts/targets.json` for your environment
+
+```bash
+TARGETS=targets-$(uname -n).json
+cp scripts/targets.json $TARGETS
+# Edit $TARGETS appropriately
 ```
 
+Note that the `upgrade-rollback.rhai` script has `-b` and `-u` options to
+override the baseline and under-test paths in `scripts/targets.json`, so if that
+is the only thing you want to change you can just use those CLI flags.
+
+
+---
+
+## Version Compatibility and Graceful Degradation
+
+The test scripts include robust version compatibility handling for the `--cancel-pending` feature:
+
+* **Preferred Method**: When supported, the scripts use `faux-mgs component-active-slot -c` to directly clear pending persistent preferences.
+* **Fallback Method**: When the SP firmware doesn't support the command (indicated by a "WrongVersion" error), the scripts automatically fall back to the RoT reset workaround.
+* **Seamless Operation**: This compatibility layer ensures tests work across different firmware versions without manual intervention.
+
+During the transition period where some devices have updated firmware and others don't, the test suite will automatically use the appropriate method for each device.
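
The preferred-then-fallback behaviour described in this section can be sketched in shell. This is a simplified illustration, not the logic in `update-helper.rhai`: the stub functions `try_cancel_pending` and `reset_rot_workaround` and the `SP_SUPPORTS_CANCEL` switch are hypothetical stand-ins for the real `faux-mgs` calls, and the error text is only an example of a "WrongVersion" message.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. The two stub functions stand in for the real
# commands; their names and the SP_SUPPORTS_CANCEL switch are hypothetical.

# Stand-in for `faux-mgs component-active-slot -c`.
try_cancel_pending() {
    if [ "${SP_SUPPORTS_CANCEL:-0}" = "1" ]; then
        return 0
    fi
    # Older SP firmware rejects the request with a version error.
    echo "WrongVersion { sp: 19, request: 20 }"
    return 1
}

# Stand-in for the RoT reset workaround.
reset_rot_workaround() {
    return 0
}

# Clear a pending persistent preference, preferring the direct command.
# Prints which path was taken: "direct", "fallback", or "error".
clear_pending_preference() {
    local out
    if out=$(try_cancel_pending 2>&1); then
        echo "direct"
        return 0
    fi
    case "$out" in
        *WrongVersion*)
            # Only a version mismatch triggers the workaround.
            if reset_rot_workaround; then
                echo "fallback"
                return 0
            fi
            ;;
    esac
    echo "error"
    return 1
}
```

Keying the fallback on the "WrongVersion" text means that any other failure is still reported as an error rather than being hidden by the workaround.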
+
 ---
 
 ## Part 1: Testing Asymmetric Feature Support (Current State)
@@ -50,8 +77,7 @@ FMR-info reset-component rot
 * **Purpose**: To verify that the primary upgrade and rollback functionality works correctly without using any of the new features.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS
 ```
 * **Expected Outcome**: The script should complete successfully with an exit code of 0. It will upgrade to the `under-test` image and then roll back to the `baseline` image using persistent updates.
 
@@ -60,8 +86,7 @@ FMR-info reset-component rot
 * **Purpose**: To verify the script correctly handles the feature asymmetry when the transient update path is requested.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT -t
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS -t
 ```
 * **Expected Outcome**: The script should complete successfully with an exit code of 0. The log should show:
 * **Upgrade**: The active `baseline` firmware does not support the feature. The script will log a warning and use a persistent update.
@@ -72,8 +97,7 @@ FMR-info reset-component rot
 * **Purpose**: To verify the logic that runs (or skips) the `test_and_recover...` negative test based on feature support.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT -N
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS -N
 ```
 * **Expected Outcome**: The script will **fail with exit code 1**. This is the correct behavior.
 * **Upgrade**: The `baseline` firmware is active and does not support the transient feature. The script will detect this and, because the test is for the `ut` branch, it will log a `FATAL` error stating the `under-test` image must support the feature. This check is known to be flawed for this specific asymmetric case but correctly protects against regressions.
@@ -83,20 +107,16 @@ FMR-info reset-component rot
 * **Purpose**: To verify that the test harness can recover from a pre-existing `pending_persistent` preference fault.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT \
---inject-fault=pending --hubris-2093
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS --inject-fault=pending
 ```
-* **Expected Outcome**: The script should run the "pending" fault injection test and exit with code 0. The log will show the sanitizer detecting the fault and using the reset-based workaround to clear it before the main test flow runs successfully.
+* **Expected Outcome**: The script should run the "pending" fault injection test and exit with code 0. The log will show the sanitizer detecting the fault and attempting to use the `faux-mgs component-active-slot -c` command to clear it. If the firmware supports the command, it will clear the fault directly. If there's a version mismatch (e.g., "WrongVersion { sp: 19, request: 20 }"), the system will fall back to the RoT reset workaround and still complete successfully.
 
 ### Test 1.5: Fault Injection - Conflicting `transient` Preference
 
 * **Purpose**: To verify the test harness correctly handles the inability to inject a fault into non-compliant firmware.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT \
---inject-fault=transient
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS --inject-fault=transient
 ```
 * **Expected Outcome**: The script is **expected to fail with exit code 1**. This is the correct outcome. The log will show:
 1. The script first installs the `baseline` (`master`) firmware.
@@ -115,8 +135,7 @@ FMR-info reset-component rot
 * **Purpose**: To verify that when both images are compliant, the script uses the transient update path for both the upgrade and the rollback.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT -t
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS -t
 ```
 * **Expected Outcome**: The script should complete successfully with an exit code of 0. The log should show a transient update is used for **both** the upgrade to `ut` and the subsequent rollback to `base`.
 
@@ -125,7 +144,6 @@ FMR-info reset-component rot
 * **Purpose**: To verify that the negative test runs successfully in both directions when all firmware is compliant.
 * **Command**:
 ```bash
-FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \
--b $REPO_BL -u $REPO_UT -N
+./FMR-info rhai scripts/upgrade-rollback.rhai -c $TARGETS -N
 ```
 * **Expected Outcome**: The script should complete successfully with an exit code of 0. The `test_and_recover_from_preferred_slot_update_failure` function should be executed and pass for the `ut` branch during the upgrade, and then be executed and pass **again** for the `base` branch during the rollback.

scripts/TODO.md

Lines changed: 3 additions & 4 deletions
@@ -4,10 +4,9 @@ This document tracks known issues, planned features, and refactoring opportuniti
 
 ## High Priority / Bugs & Workarounds
 
-* **Remove `--hubris-2093` Workaround**
-* **Issue**: The `lpc55-update-server` firmware has a bug where setting a persistent preference does not correctly clear a pre-existing pending preference. This is tracked as "Hubris issue #2093".
-* **Workaround**: The `sanitize_boot_preferences` function in `update-helper.rhai` uses a reset to reliably clear a pending preference when the `--hubris-2093` flag is active.
-* **Action**: Once the firmware bug is fixed, the workaround logic should be removed from `sanitize_boot_preferences` and the `--hubris-2093` flag should be removed from `upgrade-rollback.rhai`. The "ideal" logic path should become the only path.
+* **Hubris #2093 Workaround Status**
+* **Issue**: The `lpc55-update-server` firmware had a bug where setting a persistent preference did not correctly clear a pre-existing pending preference. This was tracked as "Hubris issue #2093".
+* **Current Status**: The `faux-mgs component-active-slot -c` command is now implemented and provides the preferred solution. However, the workaround logic is retained in `sanitize_boot_preferences` to handle version compatibility - when the SP firmware doesn't support the new `-c` command (version mismatch), the system gracefully falls back to the RoT reset workaround. This ensures compatibility across firmware versions during the transition period.
 
 * **Fix `faux-mgs` Error Reporting for `reset-component`**
 * **Issue**: When the SP debugger is attached, the `reset-component sp` command fails. However, the `faux-mgs` Rust code does not gracefully package the detailed error message (`watchdog: RoT error: the SP programming dongle is connected`) into the JSON passed to Rhai. It returns a generic error.

scripts/update-helper.rhai

Lines changed: 43 additions & 27 deletions
@@ -317,18 +317,18 @@ fn rot_supports_transient_boot_preference() {
 debug(`error|Cannot get active RoT slot. Error: ${r.error}`);
 return false;
 }
-if r.ok.transient != () {
-// Evidence shows that the feature is supported.
-// Don't alter the state
+if r.ok.transient != () || r.ok.pending_persistent != () {
+// Indirect evidence since reporting non-() values
+// was implemented in the same commit as transient image selection.
+// State is not altered.
 return true;
 }
-// For compliant firmware, this command succeeds and ensures the transient
-// preference is cleared. For non-compliant firmware, it will fail.
-let pref_check_result = util::rot_boot_preference(r.ok.active, util::PREF_SET, true, "transient_support_test");
+// Now we know that if the feature is supported, then it isn't being used.
+// If the feature isn't supported, then this call will fail:
+let pref_check_result = util::rot_boot_preference(r.ok.active, util::PREF_SET, true, "transient_support_test_set");
 if pref_check_result?.ok == true {
-// Setting the transient preference to the active slot is a no-op
-// but shows that the feature is supported.
 debug("info|transient boot preference feature is supported.");
+let _r = util::rot_boot_preference(r.ok.active, util::PREF_CLEAR, true, "transient_support_test_clear");
 return true;
 }
 debug("warn|transient boot preference feature is not supported.");
@@ -529,15 +529,30 @@ fn rot_validate_final_persistent_boot_state(
 }
 
 fn rot_validate_direct_persistent_boot_state(
-rbi, target_update_slot, target_label
+rbi, target_update_slot, target_label, images
 ) {
 debug("info|--- rot_validate_direct_persistent_boot_state (update-helper) ---");
-if rbi.active != target_update_slot {
-debug(`error|Validation FAILED: Unexpected active slot for '${target_label}'.`);
+let active_gitc = util::caboose_value("rot", `${rbi.active}`, "GITC");
+let expected_gitc_entries = images.by_gitc?.get(active_gitc);
+
+if active_gitc == () {
+debug(`error|Validation FAILED: Could not get GITC for active slot ${rbi.active}.`);
+return false;
+}
+
+// Check if the active GITC belongs to the target_label (e.g., 'base_rot_a' or 'base_rot_b')
+// The `images.by_gitc` map stores an array of labels for each GITC.
+// We need to check if any of the labels for the active GITC match the target branch.
+if expected_gitc_entries != () && (
+`${target_label}_rot_a` in expected_gitc_entries ||
+`${target_label}_rot_b` in expected_gitc_entries
+) {
+debug(`info|Validation PASSED: RoT active GITC (${active_gitc}) matches expected for '${target_label}'.`);
+return true;
+} else {
+debug(`error|Validation FAILED: Active GITC (${active_gitc}) does not match expected for '${target_label}'. RBI: ${rbi}`);
 return false;
 }
-debug(`info|Validation PASSED: RoT correctly booted for '${target_label}'.`);
-return true;
 }
 
 /// Attempts to power cycle the DUT and log its RoT state afterwards.
@@ -617,7 +632,8 @@ fn update_rot_hubris(
 path_b,
 use_transient,
 target_label,
-conf
+conf,
+images
 ) {
 debug(`info|update_rot_hubris target=${target_label}`);
 debug(`info|transient=${use_transient}`);
@@ -774,10 +790,19 @@ fn update_rot_hubris(
 }
 } else {
 // Direct Persistent Boot Flow
+// After setting persistent preference, the device should boot into the target slot.
+// We perform one reset and check if the preference took effect.
+if (rbi.active != target_slot) {
+debug(`error|Validation FAILED: Persistent preference for slot ${target_slot} did not take effect after first reset. Active slot is ${rbi.active}. This indicates a firmware issue.`);
+if (conf?.rot_hubris_power_cycle_on_failure == true) {
+power_cycle_dut(conf, "RoT persistent preference not applied after reset");
+}
+return false;
+}
+
 if (!rot_validate_direct_persistent_boot_state(
-rbi, target_slot, target_label
+rbi, target_slot, target_label, images
 )) {
-// No specific error type from this local helper yet to condition power cycle
 if (conf?.rot_hubris_power_cycle_on_failure == true) {
 power_cycle_dut(conf, "RoT direct persistent boot validation failed");
 }
@@ -860,7 +885,7 @@ fn ensure_initial_baseline_state(conf, images) {
 debug(`info|Device needs baseline flashing. SP: ${flash_sp}, RoT: ${flash_rot}`);
 if flash_rot {
 debug("info|Updating RoT to baseline (persistent).");
-if !update_rot_hubris(conf.base.rot_a, conf.base.rot_b, false, "baseline_setup", conf) {
+if !update_rot_hubris(conf.base.rot_a, conf.base.rot_b, false, "base", conf, images) {
 debug("error|Failed to update RoT to baseline.");
 return false;
 }
@@ -1087,15 +1112,6 @@ fn sanitize_boot_preferences(conf) {
 } else {
 debug("info|sanitize_boot_preferences: Successfully cleared pending persistent preference.");
 }
-
-// Final verification
-let final_rbi = util::rot_boot_info();
-if final_rbi?.error != () || final_rbi.pending_persistent_boot_preference != () {
-debug(`error|sanitize_boot_preferences: Failed to verify pending pref was cleared. RBI: ${final_rbi}`);
-return false;
-}
-} else {
-debug("info|No pending persistent preference");
 }
 
 debug("info|Boot preferences sanitized successfully.");
@@ -1106,7 +1122,7 @@ fn sanitize_boot_preferences(conf) {
 /// the SP having a pending update.
 /// Returns:
 /// bool: `true` if no debugger and pending update detected, `false` otherwise.
-//
+///
 /// Fixing Hubris issue 2066 will give us more definitive information to use in
 /// testing.
 fn check_for_sp_debugger_and_sp_pending_update() {
