Commit d58c579

TQ: Support upgrade from LRTQ (#9716)
This commit introduces a Nexus lockstep API and omdb support for triggering an upgrade from LRTQ. To do so, we enable the `trust_quorum::NodeTask` to read the LRTQ state from disk if it exists and there is no existing trust quorum state on disk. Notably, the LRTQ upgrade did not require modifying the trust quorum background task, as prepare and commit operations are polled identically between a normal reconfiguration and an LRTQ upgrade.

This was tested on a4x2. However, testing required a slight deviation from what would happen during a real upgrade in production. There are no LRTQ ledgers until RSS runs, and the `trust_quorum::NodeTask` starts before RSS; it therefore doesn't see the shares, because it only loads them on startup. On a real system in the field the shares would exist as soon as the sled-agent was upgraded, and the `NodeTask` would see them. To rectify this in manual testing, I waited until RSS completed and then restarted sled-agent on all nodes. I then issued an lrtq-upgrade via omdb and it worked.

Note that for the above strategy to work, trust quorum RSS must be disabled so that a real trust quorum configuration is not generated during RSS. This is done via the following constant, which remains unset:

```rust
pub const TRUST_QUORUM_INTEGRATION_ENABLED: bool = false;
```

Since that constant doesn't stop trust quorum from running, but only prohibits RSS setup from initializing a trust quorum, we are allowed to upgrade out of LRTQ. At that point we have a real trust quorum configuration and can also proceed to add sleds. I did that successfully as well.

What follows are some logged commands showing all of this working on a4x2. It looks like I lost the scrollback for issuing the lrtq-upgrade, but it was done with the following command in omdb on `g0`:

```
omdb nexus trust-quorum lrtq-upgrade -w
```

Here is the trust quorum configuration after the upgrade committed.
```
root@oxz_switch:~# omdb nexus trust-quorum get-config 62c2f638-c330-421e-8b4a-7f097a22281e latest
note: Nexus URL not specified.  Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
TrustQuorumConfig {
    rack_id: 62c2f638-c330-421e-8b4a-7f097a22281e (rack),
    epoch: Epoch(2),
    last_committed_epoch: None,
    state: Committed,
    threshold: Threshold(2),
    commit_crash_tolerance: 0,
    coordinator: BaseboardId { part_number: "913-0000019", serial_number: "20000000" },
    encrypted_rack_secrets: Some(EncryptedRackSecrets {
        salt: Salt([
            163, 19, 118, 99, 229, 14, 116, 81, 210, 117, 180, 69, 101, 181, 254, 44,
            38, 169, 149, 63, 59, 40, 63, 189, 164, 106, 222, 196, 112, 25, 179, 107,
        ]),
        data: [
            241, 144, 251, 158, 0, 19, 155, 183, 228, 30, 218, 227, 212, 100, 159, 158,
            160, 13, 199, 185, 20, 142, 61, 26, 217, 92, 247, 170, 110, 38, 238, 91,
            75, 78, 71, 65, 54, 93, 208, 90, 44, 2, 185, 10, 62, 167, 222, 57,
            198, 217, 174, 172, 70, 145, 22, 206,
        ],
    }),
    members: {
        BaseboardId { part_number: "913-0000019", serial_number: "20000000" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: 2a25477bc8d7623c81ac24e970f381636e3f84c46e41f6f36ca14e7f6011cf1),
            time_prepared: Some(2026-01-23T21:52:19.347850Z),
            time_committed: Some(2026-01-23T21:54:27.774935Z),
        },
        BaseboardId { part_number: "913-0000019", serial_number: "20000001" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: 5c670dba76c8d5248eac3b1cd6e4bfb88ca12c57bc090ad43ce155b11dcc74),
            time_prepared: Some(2026-01-23T21:52:19.324154Z),
            time_committed: Some(2026-01-23T21:54:27.774935Z),
        },
        BaseboardId { part_number: "913-0000019", serial_number: "20000003" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: d21a25bbd233b5dab31469659e4efa8a79a3ca7face94719f2df3e7565877c36),
            time_prepared: Some(2026-01-23T21:52:19.337036Z),
            time_committed: Some(2026-01-23T21:55:13.633304Z),
        },
    },
    time_created: 2026-01-23T21:51:17.122318Z,
    time_committing: Some(2026-01-23T21:52:19.360465Z),
    time_committed: Some(2026-01-23T21:55:13.679196Z),
    time_aborted: None,
    abort_reason: None,
}
```

And here is the oxide CLI command to add a sled:

```
➜  oxide.rs git:(main) ✗ echo '{"sled_ids": [{"part_number": "913-0000019","serial_number": "20000002"}]}' | target/debug/oxide --profile recovery api /v1/system/hardware/racks/62c2f638-c330-421e-8b4a-7f097a22281e/membership/add --method POST --input -
{
  "members": [
    { "part_number": "913-0000019", "serial_number": "20000000" },
    { "part_number": "913-0000019", "serial_number": "20000001" },
    { "part_number": "913-0000019", "serial_number": "20000002" },
    { "part_number": "913-0000019", "serial_number": "20000003" }
  ],
  "rack_id": "62c2f638-c330-421e-8b4a-7f097a22281e",
  "state": "in_progress",
  "time_aborted": null,
  "time_committed": null,
  "time_created": "2026-01-23T22:15:53.974119Z",
  "unacknowledged_members": [
    { "part_number": "913-0000019", "serial_number": "20000000" },
    { "part_number": "913-0000019", "serial_number": "20000001" },
    { "part_number": "913-0000019", "serial_number": "20000002" },
    { "part_number": "913-0000019", "serial_number": "20000003" }
  ],
  "version": 3
}
```

Polling for a bit gives the committed status of this add-sled in the CLI:

```
➜  oxide.rs git:(main) ✗ target/debug/oxide --profile recovery api '/v1/system/hardware/racks/62c2f638-c330-421e-8b4a-7f097a22281e/membership'
{
  "members": [
    { "part_number": "913-0000019", "serial_number": "20000000" },
    { "part_number": "913-0000019", "serial_number": "20000001" },
    { "part_number": "913-0000019", "serial_number": "20000002" },
    { "part_number": "913-0000019", "serial_number": "20000003" }
  ],
  "rack_id": "62c2f638-c330-421e-8b4a-7f097a22281e",
  "state": "committed",
  "time_aborted": null,
  "time_committed": "2026-01-23T22:17:33.936086Z",
  "time_created": "2026-01-23T22:15:53.974119Z",
  "unacknowledged_members": [],
  "version": 3
}
```

And here is the same thing with more detail in omdb:

```
root@oxz_switch:~# omdb nexus trust-quorum get-config 62c2f638-c330-421e-8b4a-7f097a22281e latest
note: Nexus URL not specified.  Will pick one from DNS.
note: using DNS from system config (typically /etc/resolv.conf)
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:17:1:d01::6]:12232
TrustQuorumConfig {
    rack_id: 62c2f638-c330-421e-8b4a-7f097a22281e (rack),
    epoch: Epoch(3),
    last_committed_epoch: Some(Epoch(2)),
    state: Committed,
    threshold: Threshold(3),
    commit_crash_tolerance: 1,
    coordinator: BaseboardId { part_number: "913-0000019", serial_number: "20000001" },
    encrypted_rack_secrets: Some(EncryptedRackSecrets {
        salt: Salt([
            68, 134, 154, 136, 2, 76, 247, 184, 235, 215, 228, 69, 93, 48, 142, 161,
            133, 127, 137, 173, 52, 16, 184, 194, 114, 38, 73, 215, 80, 207, 255, 114,
        ]),
        data: [
            6, 67, 96, 7, 231, 106, 134, 234, 229, 116, 209, 76, 162, 172, 175, 139,
            200, 74, 202, 28, 127, 55, 44, 61, 166, 60, 135, 156, 53, 42, 66, 189,
            92, 56, 7, 93, 205, 125, 98, 20, 233, 99, 128, 208, 223, 134, 64, 32,
            137, 248, 119, 159, 192, 57, 142, 127, 109, 162, 254, 177, 86, 112, 21, 115,
            251, 94, 51, 24, 135, 242, 113, 127, 71, 241, 50, 32, 185, 218, 240, 1,
            178, 200, 71, 173, 88, 120, 254, 177, 146, 205, 16, 133, 246, 184, 212, 118,
        ],
    }),
    members: {
        BaseboardId { part_number: "913-0000019", serial_number: "20000000" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: 829cb1c74cde3390a3d8e0abd81399357fc4fd2d19e7cc4cb6ca582d5c1792d),
            time_prepared: Some(2026-01-23T22:16:29.159832Z),
            time_committed: Some(2026-01-23T22:17:33.885064Z),
        },
        BaseboardId { part_number: "913-0000019", serial_number: "20000001" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: 97f0b02f4ab2e1fa54d34c939658e5869c97f3ef417e4a7aecf314697cfa),
            time_prepared: Some(2026-01-23T22:16:28.146229Z),
            time_committed: Some(2026-01-23T22:17:32.825838Z),
        },
        BaseboardId { part_number: "913-0000019", serial_number: "20000002" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: 1c33b428dfb3f345f497868a79a4b73e7252c848ab8ae8d56f2a9137120f8d7),
            time_prepared: Some(2026-01-23T22:16:27.764304Z),
            time_committed: Some(2026-01-23T22:17:32.825838Z),
        },
        BaseboardId { part_number: "913-0000019", serial_number: "20000003" }: TrustQuorumMemberData {
            state: Committed,
            share_digest: Some(sha3 digest: ad6dc4ccc51bf2aeef6fd49bc4c7930d43a946fbc781ecb703428f14f66f11d),
            time_prepared: Some(2026-01-23T22:16:28.837407Z),
            time_committed: Some(2026-01-23T22:17:32.825838Z),
        },
    },
    time_created: 2026-01-23T22:15:53.974119Z,
    time_committing: Some(2026-01-23T22:16:29.183462Z),
    time_committed: Some(2026-01-23T22:17:33.936086Z),
    time_aborted: None,
    abort_reason: None,
}
```
1 parent 4abb72f commit d58c579

File tree

17 files changed: +486 −47 lines

Cargo.lock

Lines changed: 2 additions & 0 deletions
(Generated file; diff not rendered by default.)

bootstore/src/schemes/v0/storage.rs

Lines changed: 31 additions & 1 deletion
```diff
@@ -11,12 +11,14 @@
 //! 2. A network config blob required for pre-rack-unlock configuration
 //!
+use crate::schemes::v0::SharePkgCommon;
+
 use super::{Fsm, FsmConfig, State};
 use camino::Utf8PathBuf;
 use omicron_common::ledger::{Ledger, Ledgerable};
 use serde::{Deserialize, Serialize};
 use sled_hardware_types::Baseboard;
-use slog::{Logger, info};
+use slog::{Logger, info, warn};
 
 /// A persistent version of `Fsm::State`
 #[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
@@ -84,6 +86,34 @@ impl PersistentFsmState {
             (Fsm::new_uninitialized(node_id, config), 0)
         }
     }
+
+    /// Load the ledger for use by trust quorum
+    pub async fn load_for_trust_quorum_upgrade(
+        log: &Logger,
+        paths: Vec<Utf8PathBuf>,
+    ) -> Option<SharePkgCommon> {
+        let ledger = Ledger::<PersistentFsmState>::new(&log, paths).await?;
+        let persistent_state = ledger.into_inner();
+        info!(
+            log,
+            "Loaded LRTQ PersistentFsmState from ledger in state {} with generation {}",
+            persistent_state.state.name(),
+            persistent_state.generation
+        );
+
+        match persistent_state.state {
+            State::Uninitialized | State::Learning => {
+                warn!(
+                    log,
+                    "Unexpected LRTQ state: {}. No share available.",
+                    persistent_state.state.name()
+                );
+                None
+            }
+            State::InitialMember { pkg, .. } => Some(pkg.common.clone()),
+            State::Learned { pkg } => Some(pkg.common.clone()),
+        }
+    }
 }
 
 /// Network configuration required to bring up the control plane
```
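The loading function above only yields a key-share package from the two LRTQ states that actually hold one; `Uninitialized` and `Learning` produce `None`. A minimal standalone sketch of that state-gating logic, using simplified stand-in types (the real `State` and `SharePkgCommon` live in the bootstore crate and carry more fields):

```rust
// Hypothetical stand-in for the bootstore's `SharePkgCommon`.
#[derive(Debug, Clone, PartialEq)]
struct SharePkgCommon {
    rack_uuid: String, // stand-in field; the real struct holds shares, etc.
}

// Hypothetical simplified version of the LRTQ `Fsm::State`.
#[derive(Debug)]
enum State {
    Uninitialized,
    Learning,
    InitialMember { pkg: SharePkgCommon },
    Learned { pkg: SharePkgCommon },
}

// Only states that actually hold a key-share package yield one.
fn share_for_upgrade(state: &State) -> Option<SharePkgCommon> {
    match state {
        State::Uninitialized | State::Learning => None,
        State::InitialMember { pkg } | State::Learned { pkg } => Some(pkg.clone()),
    }
}

fn main() {
    assert!(share_for_upgrade(&State::Uninitialized).is_none());
    let pkg = SharePkgCommon { rack_uuid: "rack-1".to_string() };
    let got = share_for_upgrade(&State::Learned { pkg: pkg.clone() });
    assert_eq!(got, Some(pkg));
    println!("ok");
}
```

This mirrors why the commit message notes that restarting sled-agent after RSS was necessary during testing: the share is only loaded at startup, and only when a ledger in one of the member states exists.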

clients/nexus-lockstep-client/src/lib.rs

Lines changed: 1 addition & 0 deletions
```diff
@@ -67,6 +67,7 @@ progenitor::generate_api!(
     ReconfiguratorConfigView = nexus_types::deployment::ReconfiguratorConfigView,
     RecoverySiloConfig = sled_agent_types_versions::latest::rack_init::RecoverySiloConfig,
     SledAgentUpdateStatus = nexus_types::internal_api::views::SledAgentUpdateStatus,
+    TrustQuorumConfig = nexus_types::trust_quorum::TrustQuorumConfig,
     UpdateStatus = nexus_types::internal_api::views::UpdateStatus,
     ZoneStatus = nexus_types::internal_api::views::ZoneStatus,
     ZpoolName = omicron_common::zpool_name::ZpoolName,
```

clients/sled-agent-client/src/lib.rs

Lines changed: 1 addition & 0 deletions
```diff
@@ -69,6 +69,7 @@ progenitor::generate_api!(
     Inventory = sled_agent_types_versions::latest::inventory::Inventory,
     InventoryDisk = sled_agent_types_versions::latest::inventory::InventoryDisk,
     InventoryZpool = sled_agent_types_versions::latest::inventory::InventoryZpool,
+    LrtqUpgradeMsg = trust_quorum_types::messages::LrtqUpgradeMsg,
     MacAddr = omicron_common::api::external::MacAddr,
     MupdateOverrideBootInventory = sled_agent_types_versions::latest::inventory::MupdateOverrideBootInventory,
     Name = omicron_common::api::external::Name,
```

dev-tools/omdb/Cargo.toml

Lines changed: 1 addition & 0 deletions
```diff
@@ -85,6 +85,7 @@ supports-color.workspace = true
 tabled.workspace = true
 textwrap.workspace = true
 tokio = { workspace = true, features = ["full"] }
+trust-quorum-types.workspace = true
 tufaceous-artifact.workspace = true
 unicode-width.workspace = true
 update-engine.workspace = true
```

dev-tools/omdb/src/bin/omdb/nexus.rs

Lines changed: 100 additions & 0 deletions
```diff
@@ -85,6 +85,7 @@ use omicron_uuid_kinds::DemoSagaUuid;
 use omicron_uuid_kinds::GenericUuid;
 use omicron_uuid_kinds::ParseError;
 use omicron_uuid_kinds::PhysicalDiskUuid;
+use omicron_uuid_kinds::RackUuid;
 use omicron_uuid_kinds::SledUuid;
 use omicron_uuid_kinds::SupportBundleUuid;
 use quiesce::QuiesceArgs;
@@ -97,6 +98,7 @@ use slog_error_chain::InlineErrorChain;
 use std::collections::BTreeMap;
 use std::collections::BTreeSet;
 use std::fs::OpenOptions;
+use std::num::ParseIntError;
 use std::os::unix::fs::PermissionsExt;
 use std::str::FromStr;
 use std::sync::Arc;
@@ -108,6 +110,7 @@ use tabled::settings::Padding;
 use tabled::settings::object::Columns;
 use tokio::io::AsyncWriteExt;
 use tokio::sync::OnceCell;
+use trust_quorum_types::types::Epoch;
 use update_engine::EventBuffer;
 use update_engine::ExecutionStatus;
 use update_engine::ExecutionTerminalInfo;
@@ -165,6 +168,8 @@ enum NexusCommands {
     /// interact with support bundles
     #[command(visible_alias = "sb")]
     SupportBundles(SupportBundleArgs),
+    /// interact with the trust quorum
+    TrustQuorum(TrustQuorumArgs),
     /// show running artifact versions
     UpdateStatus(UpdateStatusArgs),
 }
@@ -566,6 +571,43 @@ enum SupportBundleCommands {
     Inspect(SupportBundleInspectArgs),
 }
 
+#[derive(Debug, Args)]
+struct TrustQuorumArgs {
+    #[command(subcommand)]
+    command: TrustQuorumCommands,
+}
+
+#[derive(Debug, Subcommand)]
+enum TrustQuorumCommands {
+    GetConfig(TrustQuorumConfigArgs),
+    LrtqUpgrade,
+}
+
+#[derive(Debug, Clone, Copy, Args)]
+struct TrustQuorumConfigArgs {
+    rack_id: RackUuid,
+    epoch: TrustQuorumEpochOrLatest,
+}
+
+#[derive(Debug, Clone, Copy)]
+pub(crate) enum TrustQuorumEpochOrLatest {
+    Latest,
+    Epoch(Epoch),
+}
+
+impl FromStr for TrustQuorumEpochOrLatest {
+    type Err = ParseIntError;
+
+    fn from_str(s: &str) -> Result<Self, Self::Err> {
+        if matches!(s, "latest" | "current") {
+            Ok(Self::Latest)
+        } else {
+            let i: u64 = s.parse()?;
+            Ok(Self::Epoch(Epoch(i)))
+        }
+    }
+}
+
 #[derive(Debug, Args)]
 struct SupportBundleDeleteArgs {
     id: SupportBundleUuid,
@@ -860,6 +902,15 @@ impl NexusArgs {
             NexusCommands::SupportBundles(SupportBundleArgs {
                 command: SupportBundleCommands::Inspect(args),
             }) => cmd_nexus_support_bundles_inspect(&client, args).await,
+            NexusCommands::TrustQuorum(TrustQuorumArgs {
+                command: TrustQuorumCommands::GetConfig(args),
+            }) => cmd_nexus_trust_quorum_get_config(&client, args).await,
+            NexusCommands::TrustQuorum(TrustQuorumArgs {
+                command: TrustQuorumCommands::LrtqUpgrade,
+            }) => {
+                let token = omdb.check_allow_destructive()?;
+                cmd_nexus_trust_quorum_lrtq_upgrade(&client, token).await
+            }
             NexusCommands::UpdateStatus(args) => {
                 cmd_nexus_update_status(&client, args).await
             }
@@ -4452,6 +4503,55 @@ async fn cmd_nexus_support_bundles_list(
     Ok(())
 }
 
+async fn cmd_nexus_trust_quorum_get_config(
+    client: &nexus_lockstep_client::Client,
+    args: &TrustQuorumConfigArgs,
+) -> Result<(), anyhow::Error> {
+    let config = match args.epoch {
+        TrustQuorumEpochOrLatest::Latest => client
+            .trust_quorum_get_config(&args.rack_id.as_untyped_uuid(), None)
+            .await
+            .with_context(|| {
+                format!(
+                    "getting latest trust quorum config for rack {}",
+                    args.rack_id
+                )
+            })?,
+        TrustQuorumEpochOrLatest::Epoch(epoch) => client
+            .trust_quorum_get_config(
+                &args.rack_id.as_untyped_uuid(),
+                Some(epoch.0),
+            )
+            .await
+            .with_context(|| {
+                format!(
+                    "getting trust quorum config for rack {}, epoch {}",
+                    args.rack_id, epoch
+                )
+            })?,
+    }
+    .into_inner();
+
+    println!("{config:#?}");
+
+    Ok(())
+}
+
+async fn cmd_nexus_trust_quorum_lrtq_upgrade(
+    client: &nexus_lockstep_client::Client,
+    _destruction_token: DestructiveOperationToken,
+) -> Result<(), anyhow::Error> {
+    let epoch = client
+        .trust_quorum_lrtq_upgrade()
+        .await
+        .context("lrtq upgrade")?
+        .into_inner();
+
+    println!("Started LRTQ upgrade at epoch {epoch}");
+
+    Ok(())
+}
+
 /// Runs `omdb nexus support-bundles create`
 async fn cmd_nexus_support_bundles_create(
     client: &nexus_lockstep_client::Client,
```
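The omdb epoch argument above accepts `latest` (or its synonym `current`) as well as a bare epoch number. A self-contained sketch of that `FromStr` pattern, with a hypothetical stand-in for `trust_quorum_types::types::Epoch`:

```rust
use std::num::ParseIntError;
use std::str::FromStr;

// Hypothetical stand-in for `trust_quorum_types::types::Epoch`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Epoch(u64);

#[derive(Debug, Clone, Copy, PartialEq)]
enum EpochOrLatest {
    Latest,
    Epoch(Epoch),
}

impl FromStr for EpochOrLatest {
    type Err = ParseIntError;

    // "latest" and "current" are synonyms; anything else must parse as a u64.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if matches!(s, "latest" | "current") {
            Ok(Self::Latest)
        } else {
            Ok(Self::Epoch(Epoch(s.parse()?)))
        }
    }
}

fn main() {
    assert_eq!("latest".parse::<EpochOrLatest>(), Ok(EpochOrLatest::Latest));
    assert_eq!("3".parse::<EpochOrLatest>(), Ok(EpochOrLatest::Epoch(Epoch(3))));
    assert!("not-a-number".parse::<EpochOrLatest>().is_err());
    println!("ok");
}
```

Because the error type is `ParseIntError`, clap can surface a parse failure for malformed epochs directly, while the `Latest` variant maps to omitting the `epoch` query parameter on the lockstep API call.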

dev-tools/omdb/tests/usage_errors.out

Lines changed: 1 addition & 0 deletions
```diff
@@ -903,6 +903,7 @@ Commands:
   sagas            view sagas, create and complete demo sagas
   sleds            interact with sleds
   support-bundles  interact with support bundles [aliases: sb]
+  trust-quorum     interact with the trust quorum
   update-status    show running artifact versions
   help             Print this message or the help of the given subcommand(s)
```

nexus/db-queries/src/db/datastore/trust_quorum.rs

Lines changed: 56 additions & 3 deletions
```diff
@@ -21,6 +21,7 @@ use nexus_db_model::DbTypedUuid;
 use nexus_db_model::HwBaseboardId;
 use nexus_db_model::TrustQuorumConfiguration as DbTrustQuorumConfiguration;
 use nexus_db_model::TrustQuorumMember as DbTrustQuorumMember;
+use nexus_types::trust_quorum::IsLrtqUpgrade;
 use nexus_types::trust_quorum::ProposedTrustQuorumConfig;
 use nexus_types::trust_quorum::{
     TrustQuorumConfig, TrustQuorumConfigState, TrustQuorumMemberData,
@@ -435,10 +436,17 @@ impl DataStore {
         )
         .await?;
 
-        // Ensure that epochs are sequential
+        // Ensure that epochs are sequential or this is the initial attempt at
+        // an LRTQ upgrade.
+        //
+        // In the latter case the proposed epoch will be 2, as LRTQ has an epoch
+        // of 1 that is encoded as a ZFS dataset property.
         let latest_epoch = latest_config.as_ref().map(|c| c.epoch);
         bail_unless!(
-            latest_epoch == proposed.epoch.previous(),
+            latest_epoch == proposed.epoch.previous()
+                || (latest_epoch.is_none()
+                    && proposed.is_lrtq_upgrade == IsLrtqUpgrade::Yes
+                    && proposed.epoch == Epoch(2)),
             "Epochs for trust quorum configurations must be sequential. \
             Current epoch = {:?}, Proposed Epoch = {:?}",
             latest_epoch,
@@ -1537,7 +1545,7 @@ mod tests {
             .await
             .unwrap();
 
-        // Last committed epoch is incoreect (should be 1)
+        // Last committed epoch is incorrect (should be 1)
         let bad_config = ProposedTrustQuorumConfig {
             rack_id,
             epoch: Epoch(2),
@@ -1595,6 +1603,51 @@ mod tests {
         logctx.cleanup_successful();
     }
 
+    #[tokio::test]
+    async fn test_tq_insert_initial_lrtq_upgrade() {
+        let logctx = test_setup_log("test_tq_insert_initial_lrtq_upgrade");
+        let db = TestDatabase::new_with_datastore(&logctx.log).await;
+        let (opctx, datastore) = (db.opctx(), db.datastore());
+
+        let hw_ids = insert_hw_baseboard_ids(&db).await;
+        let rack_id = RackUuid::new_v4();
+        let members: BTreeSet<_> =
+            hw_ids.iter().cloned().map(BaseboardId::from).collect();
+
+        // Propose an LRTQ upgrade and successfully insert it
+        let config = ProposedTrustQuorumConfig {
+            rack_id,
+            epoch: Epoch(2),
+            is_lrtq_upgrade: IsLrtqUpgrade::Yes,
+            members: members.clone(),
+        };
+
+        // Insert should succeed
+        datastore.tq_insert_latest_config(opctx, config.clone()).await.unwrap();
+
+        // Read the config back and check that it's preparing for LRTQ upgrade
+        // with no acks.
+        let read_config = datastore
+            .tq_get_latest_config(opctx, rack_id)
+            .await
+            .expect("no error")
+            .expect("returned config");
+
+        // The read config should be preparing
+        assert_eq!(read_config.epoch, config.epoch);
+        assert_eq!(
+            read_config.state,
+            TrustQuorumConfigState::PreparingLrtqUpgrade
+        );
+        assert!(read_config.encrypted_rack_secrets.is_none());
+        assert!(read_config.members.iter().all(|(_, info)| {
+            info.state == TrustQuorumMemberState::Unacked
+        }));
+
+        db.terminate().await;
+        logctx.cleanup_successful();
+    }
+
     #[tokio::test]
     async fn test_tq_update_prepare_and_commit() {
         let logctx = test_setup_log("test_tq_update_prepare_and_commit");
```
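The epoch-sequencing check in `tq_insert_latest_config` accepts a proposal only if its epoch immediately follows the latest stored epoch, or if there is no stored config at all and the proposal is the initial LRTQ upgrade at epoch 2 (LRTQ itself occupies epoch 1). A minimal sketch of that invariant as a pure predicate, using simplified stand-in types:

```rust
// Hypothetical stand-ins for the nexus-types `Epoch` and `IsLrtqUpgrade`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Epoch(u64);

impl Epoch {
    // Previous epoch, or None when there is no predecessor.
    fn previous(&self) -> Option<Epoch> {
        self.0.checked_sub(1).map(Epoch)
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum IsLrtqUpgrade {
    Yes,
    No,
}

// Epochs must be sequential, except that the very first LRTQ upgrade
// proposes epoch 2 with no prior config in the database.
fn epoch_is_valid(
    latest: Option<Epoch>,
    proposed: Epoch,
    is_lrtq_upgrade: IsLrtqUpgrade,
) -> bool {
    latest == proposed.previous()
        || (latest.is_none()
            && is_lrtq_upgrade == IsLrtqUpgrade::Yes
            && proposed == Epoch(2))
}

fn main() {
    // Normal reconfiguration: proposed epoch must be exactly latest + 1.
    assert!(epoch_is_valid(Some(Epoch(2)), Epoch(3), IsLrtqUpgrade::No));
    assert!(!epoch_is_valid(Some(Epoch(2)), Epoch(4), IsLrtqUpgrade::No));
    // Initial LRTQ upgrade: no prior config, proposed epoch is 2.
    assert!(epoch_is_valid(None, Epoch(2), IsLrtqUpgrade::Yes));
    // Without the upgrade flag, an out-of-nowhere epoch 2 is rejected.
    assert!(!epoch_is_valid(None, Epoch(2), IsLrtqUpgrade::No));
    println!("ok");
}
```

This is the same boolean the datastore feeds to `bail_unless!`; the test added in this commit exercises the LRTQ-upgrade branch end to end against a real database.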

nexus/lockstep-api/Cargo.toml

Lines changed: 1 addition & 0 deletions
```diff
@@ -16,4 +16,5 @@ omicron-uuid-kinds.workspace = true
 omicron-workspace-hack.workspace = true
 schemars.workspace = true
 serde.workspace = true
+trust-quorum-types.workspace = true
 uuid.workspace = true
```

nexus/lockstep-api/src/lib.rs

Lines changed: 25 additions & 5 deletions
```diff
@@ -52,6 +52,7 @@ use omicron_uuid_kinds::*;
 use schemars::JsonSchema;
 use serde::Deserialize;
 use serde::Serialize;
+use trust_quorum_types::types::Epoch;
 use uuid::Uuid;
 
 const RACK_INITIALIZATION_REQUEST_MAX_BYTES: usize = 10 * 1024 * 1024;
@@ -564,15 +565,29 @@ pub trait NexusLockstepApi {
         rqctx: RequestContext<Self::Context>,
     ) -> Result<HttpResponseOk<QuiesceStatus>, HttpError>;
 
-    /// Retrieve the latest ongoing rack cluster membership change
+    /// Retrieve the trust quorum configuration for the given epoch, or latest
+    /// if no epoch is given
     #[endpoint {
         method = GET,
-        path = "/trust-quorum/{rack_id}/config/latest",
+        path = "/trust-quorum/config/{rack_id}",
     }]
-    async fn trust_quorum_get_latest_config(
+    async fn trust_quorum_get_config(
         rqctx: RequestContext<Self::Context>,
-        path_params: Path<RackPathParam>,
-    ) -> Result<HttpResponseOk<Option<TrustQuorumConfig>>, HttpError>;
+        path_params: Path<params::RackMembershipConfigPathParams>,
+        query_params: Query<TrustQuorumEpochQueryParam>,
+    ) -> Result<HttpResponseOk<TrustQuorumConfig>, HttpError>;
+
+    /// Initiate an LRTQ upgrade
+    ///
+    /// Return the epoch of the proposed configuration, so it can be polled
+    /// asynchronously.
+    #[endpoint {
+        method = POST,
+        path = "/trust-quorum/lrtq-upgrade"
+    }]
+    async fn trust_quorum_lrtq_upgrade(
+        rqctx: RequestContext<Self::Context>,
+    ) -> Result<HttpResponseOk<Epoch>, HttpError>;
 }
 
 /// Path parameters for Rack requests.
@@ -621,3 +636,8 @@ pub struct SledId {
 pub struct VersionPathParam {
     pub version: u32,
 }
+
+#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
+pub struct TrustQuorumEpochQueryParam {
+    pub epoch: Option<Epoch>,
+}
```
