| 1 | +# Validation Test Plan |
| 2 | + |
| 3 | +This document outlines a full set of tests to validate the `upgrade-rollback.rhai` test harness. It is split into two parts: |
| 4 | +1. **Part 1**: Tests the current, real-world scenario where the older `baseline` firmware lacks features that the newer `under-test` firmware has. |
| 5 | +2. **Part 2**: Describes tests for a future state where both `baseline` and `under-test` firmware are fully compliant with the transient boot preference feature. |
| 6 | + |
| 7 | +## Prerequisites and Setup |
| 8 | + |
| 9 | +### 1. Environment Variables |
| 10 | + |
| 11 | +For convenience, set two environment variables that point to your local Hubris build repositories before running these tests: |
| 12 | +```bash |
| 13 | +export REPO_BL=/path/to/your/baseline/hubris |
| 14 | +export REPO_UT=/path/to/your/under-test/hubris |
| 15 | +``` |
| 16 | + |
| 17 | +These repositories must have SP and RoT Hubris build products in their |
| 18 | +respective `target/` directories. |
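
The requirement above can be checked before a test run. The following is a minimal sketch, not part of the repository: `check_repo` is a hypothetical helper name, and it only confirms that the variable is set and that a `target/` directory exists. It does not verify that SP and RoT build products are actually present.

```bash
# Hypothetical pre-flight helper (an assumption, not shipped with the harness).
# Takes the NAME of an environment variable, e.g. REPO_BL.
# Returns 0 if the variable is set and points at a directory that
# contains a target/ subdirectory; otherwise prints a message and returns 1.
check_repo() {
    var_name=$1
    # Indirect lookup of the variable named by $var_name (POSIX-compatible).
    eval "repo_path=\${$var_name:-}"
    if [ -z "$repo_path" ]; then
        echo "$var_name is not set" >&2
        return 1
    fi
    if [ ! -d "$repo_path/target" ]; then
        echo "$var_name=$repo_path has no target/ directory" >&2
        return 1
    fi
    return 0
}
```

A typical call would be `check_repo REPO_BL && check_repo REPO_UT` before starting any test case.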
| 19 | + |
| 20 | +Examine and edit the `scripts/targets.json` file or make your own if you need |
| 21 | +to use images from other locations. |
| 22 | + |
| 23 | +### 2. The `FMR` Wrapper Script |
| 24 | + |
| 25 | +The test commands use a helper script named `FMR`, which is a wrapper around the main `cargo run --bin faux-mgs` command. Its purpose is to simplify running tests by automatically including common arguments. |
| 26 | + |
| 27 | +* **Functionality**: The script automatically adds required arguments such as `--features=rhaiscript`, `--json=pretty`, and timeouts, and it attempts to discover the correct network `--interface` setting. |
| 28 | +* **Log Levels**: The name used to call the script sets the log level for the test run. For example, `FMR-info` sets `--log-level=info`, while `FMR-trace` sets `--log-level=trace`. |
| 29 | +* **Setup**: To create the convenient `FMR-info`, `FMR-debug`, etc. symlinks in your working directory, you can run the following command from the repository root: |
| 30 | + ```bash |
| 31 | + ./scripts/FMR link |
| 32 | + ``` |
| 33 | + |
| 34 | +### 3. Clean State |
| 35 | + |
| 36 | +Run each numbered test case from a known-clean state. Before starting a test, perform two RoT resets: |
| 37 | +```bash |
| 38 | +FMR-info reset-component rot |
| 39 | +FMR-info reset-component rot |
| 40 | +``` |
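
Because this step is repeated before every test case, it may be convenient to wrap it in a function. The sketch below is an assumption, not part of the harness: `reset_rot_twice` and the `FMR_CMD` override are hypothetical names, and the function simply issues the same two `reset-component rot` commands shown above, stopping if either one fails.

```bash
# Hypothetical helper (an assumption): perform the two RoT resets
# that establish the known-clean state.
# FMR_CMD defaults to FMR-info; set it to FMR-debug, FMR-trace, etc.
# to run the resets at a different log level.
reset_rot_twice() {
    fmr=${FMR_CMD:-FMR-info}
    "$fmr" reset-component rot || return 1
    "$fmr" reset-component rot || return 1
}
```

Calling `reset_rot_twice` with no arguments is then equivalent to running the two commands by hand.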
| 41 | + |
| 42 | +--- |
| 43 | + |
| 44 | +## Part 1: Testing Asymmetric Feature Support (Current State) |
| 45 | + |
| 46 | +**Assumption**: For this set of tests, `$REPO_BL` points to an **old** firmware build that **does not** support the transient boot preference feature, and `$REPO_UT` points to a **new** build that does. |
| 47 | + |
| 48 | +### Test 1.1: Standard Workflow (Golden Path) |
| 49 | + |
| 50 | +* **Purpose**: To verify that the primary upgrade and rollback functionality works correctly without using any of the new features. |
| 51 | +* **Command**: |
| 52 | + ```bash |
| 53 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 54 | + -b $REPO_BL -u $REPO_UT |
| 55 | + ``` |
| 56 | +* **Expected Outcome**: The script should complete successfully with an exit code of 0. It will upgrade to the `under-test` image and then roll back to the `baseline` image using persistent updates. |
| 57 | + |
| 58 | +### Test 1.2: Transient Boot Path (`-t` flag) |
| 59 | + |
| 60 | +* **Purpose**: To verify the script correctly handles the feature asymmetry when the transient update path is requested. |
| 61 | +* **Command**: |
| 62 | + ```bash |
| 63 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 64 | + -b $REPO_BL -u $REPO_UT -t |
| 65 | + ``` |
| 66 | +* **Expected Outcome**: The script should complete successfully with an exit code of 0. The log should show: |
| 67 | + * **Upgrade**: The active `baseline` firmware does not support the feature. The script will log a warning and use a persistent update. |
| 68 | + * **Rollback**: The now-active `under-test` firmware supports the feature. The script will correctly use a transient update for the rollback. |
| 69 | + |
| 70 | +### Test 1.3: Negative Test (`-N`) Workflow |
| 71 | + |
| 72 | +* **Purpose**: To verify the logic that runs (or skips) the `test_and_recover...` negative test based on feature support. |
| 73 | +* **Command**: |
| 74 | + ```bash |
| 75 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 76 | + -b $REPO_BL -u $REPO_UT -N |
| 77 | + ``` |
| 78 | +* **Expected Outcome**: The script will **fail with exit code 1**. This is the correct behavior. |
| 79 | + * **Upgrade**: The `baseline` firmware is active and does not support the transient feature. The script will detect this and, because the test is for the `ut` branch, it will log a `FATAL` error stating the `under-test` image must support the feature. This check is known to be flawed for this specific asymmetric case but correctly protects against regressions. |
| 80 | + |
| 81 | +### Test 1.4: Fault Injection - Conflicting `pending` Preference |
| 82 | + |
| 83 | +* **Purpose**: To verify that the test harness can recover from a pre-existing `pending_persistent` preference fault. |
| 84 | +* **Command**: |
| 85 | + ```bash |
| 86 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 87 | + -b $REPO_BL -u $REPO_UT \ |
| 88 | + --inject-fault=pending --hubris-2093 |
| 89 | + ``` |
| 90 | +* **Expected Outcome**: The script should run the "pending" fault injection test and exit with code 0. The log will show the sanitizer detecting the fault and using the reset-based workaround to clear it before the main test flow runs successfully. |
| 91 | + |
| 92 | +### Test 1.5: Fault Injection - Conflicting `transient` Preference |
| 93 | + |
| 94 | +* **Purpose**: To verify the test harness correctly handles the inability to inject a fault into non-compliant firmware. |
| 95 | +* **Command**: |
| 96 | + ```bash |
| 97 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 98 | + -b $REPO_BL -u $REPO_UT \ |
| 99 | + --inject-fault=transient |
| 100 | + ``` |
| 101 | +* **Expected Outcome**: The script is **expected to fail with exit code 1**. This is the correct outcome. The log will show: |
| 102 | + 1. The script first installs the `baseline` (`master`) firmware. |
| 103 | + 2. It then attempts to run the `transient` fault injection test. |
| 104 | + 3. The `helper::inject_conflicting_transient_preference()` function will fail because the active `baseline` firmware does not support the transient preference command. |
| 105 | + 4. The script will log an error like "Failed to inject transient preference fault" and exit. This proves the test harness correctly identifies that the fault cannot be created. |
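
Tests 1.3 and 1.5 pass only when the script exits with code 1, which is easy to misread as a failure when running the commands by hand. One way to make the expectation explicit is a small wrapper that compares the actual exit status with the expected one. This is a sketch under stated assumptions: `expect_exit` is a hypothetical helper, not something the harness provides.

```bash
# Hypothetical helper (an assumption): run a command and check that
# its exit status matches the expected value.
# Usage: expect_exit EXPECTED_CODE command [args...]
expect_exit() {
    expected=$1
    shift
    actual=0
    # Capture the status without tripping `set -e` in the caller.
    "$@" || actual=$?
    if [ "$actual" -ne "$expected" ]; then
        echo "expected exit code $expected, got $actual" >&2
        return 1
    fi
    return 0
}
```

For Test 1.3 this could be invoked as `expect_exit 1 FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json -b "$REPO_BL" -u "$REPO_UT" -N`, and for the passing tests the same command line would be prefixed with `expect_exit 0`. The wrapper checks only the exit status; the log messages described in each expected outcome still need to be inspected separately.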
| 106 | + |
| 107 | +--- |
| 108 | + |
| 109 | +## Part 2: Testing Symmetric Feature Support (Future State) |
| 110 | + |
| 111 | +**Assumption**: For this set of tests, assume **both** `$REPO_BL` and `$REPO_UT` point to firmware builds that support the transient boot preference feature. |
| 112 | + |
| 113 | +### Test 2.1: Transient Boot Path (`-t` flag) |
| 114 | + |
| 115 | +* **Purpose**: To verify that when both images are compliant, the script uses the transient update path for both the upgrade and the rollback. |
| 116 | +* **Command**: |
| 117 | + ```bash |
| 118 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 119 | + -b $REPO_BL -u $REPO_UT -t |
| 120 | + ``` |
| 121 | +* **Expected Outcome**: The script should complete successfully with an exit code of 0. The log should show a transient update is used for **both** the upgrade to `ut` and the subsequent rollback to `base`. |
| 122 | + |
| 123 | +### Test 2.2: Negative Test (`-N`) Workflow |
| 124 | + |
| 125 | +* **Purpose**: To verify that the negative test runs successfully in both directions when all firmware is compliant. |
| 126 | +* **Command**: |
| 127 | + ```bash |
| 128 | + FMR-info rhai scripts/upgrade-rollback.rhai -c scripts/targets.json \ |
| 129 | + -b $REPO_BL -u $REPO_UT -N |
| 130 | + ``` |
| 131 | +* **Expected Outcome**: The script should complete successfully with an exit code of 0. The `test_and_recover_from_preferred_slot_update_failure` function should be executed and pass for the `ut` branch during the upgrade, and then be executed and pass **again** for the `base` branch during the rollback. |