Commit 7eec6c5

[feat] Multicast groups
Introduces end-to-end multicast group support across control plane and sled-agent, integrated with IP pool extensions required for supporting multicast workflows. This work enables project-scoped multicast groups with lifecycle-driven dataplane programming and exposes an API for operating multicast groups over instances.

Highlights:
- DB: new multicast_group tables; member lifecycle management
- API: multicast group/member CRUD; source IP validation; VPC/project hierarchy integration with default VNI fallback
- Control plane: RPW reconcilers for groups/members; sagas for dataplane updates atomically at the group level; instance lifecycle hooks and piggybacking
- Dataplane: Dendrite DPD switch programming via trait abstraction; DPD client used in tests
- Sled agent: multicast-aware instance management; network interface configuration for multicast traffic; cross-version testing; OPTE stubs present
- Tests: comprehensive integration suites under nexus/tests/integration_tests/multicast/

Components:
- Database schema: external and underlay multicast groups; member/instance association tables
- Control plane modules: multicast group management, member lifecycle, dataplane abstraction; RPW reconcilers to ensure convergence
- API layer: endpoints and validation; default-VNI semantics when VPC not provided
- Sled agent: OPTE stubs and compatibility shims for older agents

Workflows Implemented:
1. Instance lifecycle integration:
   - "Create" -> resolve VPC/VNI (or default), validate source IPs, create memberships, enqueue group ensure RPW
   - "Start" -> program dataplane via ensure/update sagas; activate member flows after switch ack
   - "Stop" -> deactivate dataplane membership; retain DB membership for fast restart
   - "Delete" -> remove instance memberships; group deletion is explicit
   - "Migrate" -> deactivate on source sled; activate on target; idempotent with ordering guarantees
   - Restart/recovery -> RPWs reconcile desired state; compensations clean up partial programming
2. RPW reconciliation:
   - ensure dataplane switches match database state
   - handle sled migrations and state transitions
   - Eventual consistency with retry logic

Migrations:
- Apply schema changes in schema/crdb/multicast-support/up01.sql (and update dbinit.sql)
- Bump schema versions accordingly

API/Compatibility:
- OpenAPI updated: openapi/nexus.json, openapi/sled-agent/sled-agent-5.0.0-89f1f7.json
- Regenerate clients where applicable

References:
- RFD 488: https://rfd.shared.oxide.computer/rfd/488
- IP Pool extensions: #9084
- Dendrite PRs (based on recency):
  * oxidecomputer/dendrite#132
  * oxidecomputer/dendrite#109
  * oxidecomputer/dendrite#14

Follow-ups include:
- OPTE integration
- commtest extension
- omdb commands are tracked in issues
- pool and group stats
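
The RPW reconciliation workflow in the commit message (make the dataplane switches match database state, retrying until they converge) can be pictured as a set difference between desired and observed state. The following is a minimal sketch under stated assumptions, not the code in this commit: `ReconcilePlan` and `plan_reconcile` are hypothetical names, and a group is reduced to its IP address.

```rust
use std::collections::BTreeSet;
use std::net::IpAddr;

/// Hypothetical outcome of one reconciliation pass (illustration only).
#[derive(Debug, PartialEq)]
struct ReconcilePlan {
    /// Groups recorded in the database but missing on the switch.
    to_create: Vec<IpAddr>,
    /// Groups programmed on the switch with no database record.
    to_delete: Vec<IpAddr>,
    /// Number of groups present in both places.
    verified: usize,
}

/// Hypothetical helper: diff desired (database) state against observed
/// (switch) state. It has no side effects, so re-running it after a
/// failed or partial apply is safe.
fn plan_reconcile(
    desired: &BTreeSet<IpAddr>,
    observed: &BTreeSet<IpAddr>,
) -> ReconcilePlan {
    ReconcilePlan {
        to_create: desired.difference(observed).copied().collect(),
        to_delete: observed.difference(desired).copied().collect(),
        verified: desired.intersection(observed).count(),
    }
}

fn main() {
    let desired: BTreeSet<IpAddr> = ["224.1.1.1", "224.1.1.2"]
        .iter()
        .map(|s| s.parse().unwrap())
        .collect();
    let observed: BTreeSet<IpAddr> = ["224.1.1.2", "224.9.9.9"]
        .iter()
        .map(|s| s.parse().unwrap())
        .collect();

    let plan = plan_reconcile(&desired, &observed);
    println!("{plan:?}");
}
```

Because the plan is computed from scratch on every activation, a pass that fails midway leaves nothing to undo in the planner itself; the next activation simply sees a smaller difference.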
1 parent 98bf71e commit 7eec6c5

112 files changed, with 31598 additions and 207 deletions

Cargo.lock

Lines changed: 6 additions & 3 deletions

common/src/api/external/mod.rs

Lines changed: 8 additions & 0 deletions
@@ -952,6 +952,8 @@ pub enum ResourceType {
     LldpLinkConfig,
     LoopbackAddress,
     MetricProducer,
+    MulticastGroup,
+    MulticastGroupMember,
     NatEntry,
     Oximeter,
     PhysicalDisk,
@@ -2510,6 +2512,12 @@ impl Vni {
     /// The VNI for the builtin services VPC.
     pub const SERVICES_VNI: Self = Self(100);
 
+    /// VNI default if no VPC is provided for a multicast group.
+    ///
+    /// This is a low-numbered VNI, to avoid colliding with user VNIs.
+    /// However, it is not in the Oxide-reserved yet.
+    pub const DEFAULT_MULTICAST_VNI: Self = Self(77);
+
     /// Oxide reserves a slice of initial VNIs for its own use.
     pub const MIN_GUEST_VNI: u32 = 1024;
 
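
The default-VNI fallback mentioned in the commit message ("default-VNI semantics when VPC not provided") might look roughly like the sketch below. This is an illustration under assumptions: `Vni` here is a stand-in newtype rather than the real type, and `resolve_multicast_vni` is a hypothetical helper that does not appear in the diff. Only the two constant values are taken from the hunk above.

```rust
/// Stand-in for the `Vni` newtype touched by this diff (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Vni(u32);

impl Vni {
    /// Value copied from the diff above.
    const DEFAULT_MULTICAST_VNI: Self = Self(77);
    /// Value copied from the diff above.
    const MIN_GUEST_VNI: u32 = 1024;
}

/// Hypothetical helper: use the VPC's VNI when one is supplied,
/// otherwise fall back to the multicast default.
fn resolve_multicast_vni(vpc_vni: Option<Vni>) -> Vni {
    vpc_vni.unwrap_or(Vni::DEFAULT_MULTICAST_VNI)
}

fn main() {
    // No VPC given: the default applies.
    assert_eq!(resolve_multicast_vni(None), Vni(77));
    // VPC given: its VNI wins.
    assert_eq!(resolve_multicast_vni(Some(Vni(2048))), Vni(2048));
    // The default sits below the first guest VNI, so it cannot collide
    // with a VNI handed out to a guest VPC.
    assert!(Vni::DEFAULT_MULTICAST_VNI.0 < Vni::MIN_GUEST_VNI);
    println!("default multicast VNI: {:?}", Vni::DEFAULT_MULTICAST_VNI);
}
```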

dev-tools/omdb/tests/env.out

Lines changed: 12 additions & 0 deletions
@@ -124,6 +124,10 @@ task: "metrics_producer_gc"
     unregisters Oximeter metrics producers that have not renewed their lease
 
 
+task: "multicast_group_reconciler"
+    reconciles multicast group state with dendrite switch configuration
+
+
 task: "nat_garbage_collector"
     prunes soft-deleted NAT entries from nat_entry table based on a
     predetermined retention policy
@@ -332,6 +336,10 @@ task: "metrics_producer_gc"
     unregisters Oximeter metrics producers that have not renewed their lease
 
 
+task: "multicast_group_reconciler"
+    reconciles multicast group state with dendrite switch configuration
+
+
 task: "nat_garbage_collector"
     prunes soft-deleted NAT entries from nat_entry table based on a
     predetermined retention policy
@@ -527,6 +535,10 @@ task: "metrics_producer_gc"
     unregisters Oximeter metrics producers that have not renewed their lease
 
 
+task: "multicast_group_reconciler"
+    reconciles multicast group state with dendrite switch configuration
+
+
 task: "nat_garbage_collector"
     prunes soft-deleted NAT entries from nat_entry table based on a
     predetermined retention policy

dev-tools/omdb/tests/successes.out

Lines changed: 16 additions & 0 deletions
@@ -349,6 +349,10 @@ task: "metrics_producer_gc"
     unregisters Oximeter metrics producers that have not renewed their lease
 
 
+task: "multicast_group_reconciler"
+    reconciles multicast group state with dendrite switch configuration
+
+
 task: "nat_garbage_collector"
     prunes soft-deleted NAT entries from nat_entry table based on a
     predetermined retention policy
@@ -648,6 +652,12 @@ task: "metrics_producer_gc"
     started at <REDACTED_TIMESTAMP> (<REDACTED DURATION>s ago) and ran for <REDACTED DURATION>ms
 warning: unknown background task: "metrics_producer_gc" (don't know how to interpret details: Object {"expiration": String("<REDACTED_TIMESTAMP>"), "pruned": Array []})
 
+task: "multicast_group_reconciler"
+  configured period: every <REDACTED_DURATION>m
+  last completed activation: <REDACTED ITERATIONS>, triggered by <TRIGGERED_BY_REDACTED>
+    started at <REDACTED_TIMESTAMP> (<REDACTED DURATION>s ago) and ran for <REDACTED DURATION>ms
+warning: unknown background task: "multicast_group_reconciler" (don't know how to interpret details: Object {"errors": Array [String("failed to create multicast dataplane client: Internal Error: failed to build DPD clients")], "groups_created": Number(0), "groups_deleted": Number(0), "groups_verified": Number(0), "members_deleted": Number(0), "members_processed": Number(0)})
+
 task: "phantom_disks"
   configured period: every <REDACTED_DURATION>s
   last completed activation: <REDACTED ITERATIONS>, triggered by <TRIGGERED_BY_REDACTED>
@@ -1166,6 +1176,12 @@ task: "metrics_producer_gc"
     started at <REDACTED_TIMESTAMP> (<REDACTED DURATION>s ago) and ran for <REDACTED DURATION>ms
 warning: unknown background task: "metrics_producer_gc" (don't know how to interpret details: Object {"expiration": String("<REDACTED_TIMESTAMP>"), "pruned": Array []})
 
+task: "multicast_group_reconciler"
+  configured period: every <REDACTED_DURATION>m
+  last completed activation: <REDACTED ITERATIONS>, triggered by <TRIGGERED_BY_REDACTED>
+    started at <REDACTED_TIMESTAMP> (<REDACTED DURATION>s ago) and ran for <REDACTED DURATION>ms
+warning: unknown background task: "multicast_group_reconciler" (don't know how to interpret details: Object {"errors": Array [String("failed to create multicast dataplane client: Internal Error: failed to build DPD clients")], "groups_created": Number(0), "groups_deleted": Number(0), "groups_verified": Number(0), "members_deleted": Number(0), "members_processed": Number(0)})
+
 task: "phantom_disks"
   configured period: every <REDACTED_DURATION>s
   last completed activation: <REDACTED ITERATIONS>, triggered by <TRIGGERED_BY_REDACTED>
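
The "unknown background task" warning in the expected output shows that omdb prints the reconciler's status as an uninterpreted JSON object; the commit message lists omdb commands among the follow-ups. As a rough sketch of what a typed view of that object could look like, the struct below mirrors the six keys visible in the warning. The names `MulticastReconcilerStatus` and `render` are hypothetical and not part of this commit or of omdb.

```rust
/// Hypothetical typed mirror of the `details` object shown in the warning.
/// Field names are copied from the JSON keys in the expected output.
#[derive(Debug, Default)]
struct MulticastReconcilerStatus {
    groups_created: u64,
    groups_deleted: u64,
    groups_verified: u64,
    members_processed: u64,
    members_deleted: u64,
    errors: Vec<String>,
}

/// Hypothetical renderer: one summary line per counter family, then one
/// line per error message.
fn render(status: &MulticastReconcilerStatus) -> String {
    let mut out = format!(
        "    groups: {} created, {} deleted, {} verified\n",
        status.groups_created, status.groups_deleted, status.groups_verified,
    );
    out.push_str(&format!(
        "    members: {} processed, {} deleted\n",
        status.members_processed, status.members_deleted,
    ));
    for e in &status.errors {
        out.push_str(&format!("    error: {e}\n"));
    }
    out
}

fn main() {
    // Same shape as the test fixture: all counters zero, one error.
    let status = MulticastReconcilerStatus {
        errors: vec![
            "failed to create multicast dataplane client: \
             Internal Error: failed to build DPD clients"
                .to_string(),
        ],
        ..Default::default()
    };
    print!("{}", render(&status));
}
```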

end-to-end-tests/src/instance_launch.rs

Lines changed: 1 addition & 0 deletions
@@ -80,6 +80,7 @@ async fn instance_launch() -> Result<()> {
             auto_restart_policy: Default::default(),
             anti_affinity_groups: Vec::new(),
             cpu_platform: None,
+            multicast_groups: Vec::new(),
         })
         .send()
         .await?;

illumos-utils/src/opte/illumos.rs

Lines changed: 5 additions & 0 deletions
@@ -52,6 +52,11 @@ pub enum Error {
     #[error("Tried to update external IPs on non-existent port ({0}, {1:?})")]
     ExternalIpUpdateMissingPort(uuid::Uuid, NetworkInterfaceKind),
 
+    #[error(
+        "Tried to update multicast groups on non-existent port ({0}, {1:?})"
+    )]
+    MulticastUpdateMissingPort(uuid::Uuid, NetworkInterfaceKind),
+
     #[error("Could not find Primary NIC")]
     NoPrimaryNic,
 

illumos-utils/src/opte/mod.rs

Lines changed: 5 additions & 3 deletions
@@ -31,6 +31,7 @@ use oxide_vpc::api::RouterTarget;
 pub use oxide_vpc::api::Vni;
 use oxnet::IpNet;
 pub use port::Port;
+pub use port_manager::MulticastGroupCfg;
 pub use port_manager::PortCreateParams;
 pub use port_manager::PortManager;
 pub use port_manager::PortTicket;
@@ -71,7 +72,7 @@ impl Gateway {
     }
 }
 
-/// Convert a nexus `IpNet` to an OPTE `IpCidr`.
+/// Convert a nexus [IpNet] to an OPTE [IpCidr].
 fn net_to_cidr(net: IpNet) -> IpCidr {
     match net {
         IpNet::V4(net) => IpCidr::Ip4(Ipv4Cidr::new(
@@ -85,9 +86,10 @@ fn net_to_cidr(net: IpNet) -> IpCidr {
     }
 }
 
-/// Convert a nexus `RouterTarget` to an OPTE `RouterTarget`.
+/// Convert a nexus [shared::RouterTarget] to an OPTE [RouterTarget].
 ///
-/// This is effectively a `From` impl, but defined for two out-of-crate types.
+/// This is effectively a [`From`] impl, but defined for two
+/// out-of-crate types.
 /// We map internet gateways that target the (single) "system" VPC IG to
 /// `InternetGateway(None)`. Everything else is mapped directly, translating IP
 /// address types as needed.

illumos-utils/src/opte/non_illumos.rs

Lines changed: 5 additions & 0 deletions
@@ -46,6 +46,11 @@ pub enum Error {
     #[error("Tried to update external IPs on non-existent port ({0}, {1:?})")]
     ExternalIpUpdateMissingPort(uuid::Uuid, NetworkInterfaceKind),
 
+    #[error(
+        "Tried to update multicast groups on non-existent port ({0}, {1:?})"
+    )]
+    MulticastUpdateMissingPort(uuid::Uuid, NetworkInterfaceKind),
+
     #[error("Could not find Primary NIC")]
     NoPrimaryNic,
 

illumos-utils/src/opte/port_manager.rs

Lines changed: 90 additions & 1 deletion
@@ -62,6 +62,18 @@ use std::sync::atomic::AtomicU64;
 use std::sync::atomic::Ordering;
 use uuid::Uuid;
 
+/// IPv4 multicast address range (224.0.0.0/4).
+/// See RFC 5771 (IPv4 Multicast Address Assignments):
+/// <https://www.rfc-editor.org/rfc/rfc5771>
+#[allow(dead_code)]
+const IPV4_MULTICAST_RANGE: &str = "224.0.0.0/4";
+
+/// IPv6 multicast address range (ff00::/8).
+/// See RFC 4291 (IPv6 Addressing Architecture):
+/// <https://www.rfc-editor.org/rfc/rfc4291>
+#[allow(dead_code)]
+const IPV6_MULTICAST_RANGE: &str = "ff00::/8";
+
 /// Stored routes (and usage count) for a given VPC/subnet.
 #[derive(Debug, Default, Clone)]
 struct RouteSet {
@@ -70,6 +82,21 @@ struct RouteSet {
     active_ports: usize,
 }
 
+/// Configuration for multicast groups on an OPTE port.
+///
+/// TODO: This type should be moved to [oxide_vpc::api] when OPTE dependencies
+/// are updated, following the same pattern as other VPC configuration types
+/// like [ExternalIpCfg], [IpCfg], etc.
+///
+/// TODO: Eventually remove.
+#[derive(Debug, Clone, PartialEq)]
+pub struct MulticastGroupCfg {
+    /// The multicast group IP address (IPv4 or IPv6).
+    pub group_ip: IpAddr,
+    /// For Source-Specific Multicast (SSM), list of source addresses.
+    pub sources: Vec<IpAddr>,
+}
+
 #[derive(Debug)]
 struct PortManagerInner {
     log: Logger,
@@ -595,7 +622,7 @@ impl PortManager {
     }
 
     /// Set Internet Gateway mappings for all external IPs in use
-    /// by attached `NetworkInterface`s.
+    /// by attached [NetworkInterface]s.
     ///
     /// Returns whether the internal mappings were changed.
     pub fn set_eip_gateways(&self, mappings: ExternalIpGatewayMap) -> bool {
@@ -751,6 +778,68 @@ impl PortManager {
         Ok(())
     }
 
+    /// Validate multicast group memberships for an OPTE port.
+    ///
+    /// This method validates multicast group configurations but does not yet
+    /// configure OPTE port-level multicast group membership. The actual
+    /// multicast forwarding is currently handled by the reconciler + DPD
+    /// at the dataplane switch level.
+    ///
+    /// TODO: Once OPTE kernel module supports multicast group APIs, this method
+    /// should be updated accordingly to configure the port for specific
+    /// multicast group memberships.
+    pub fn multicast_groups_ensure(
+        &self,
+        nic_id: Uuid,
+        nic_kind: NetworkInterfaceKind,
+        multicast_groups: &[MulticastGroupCfg],
+    ) -> Result<(), Error> {
+        let ports = self.inner.ports.lock().unwrap();
+        let port = ports.get(&(nic_id, nic_kind)).ok_or_else(|| {
+            Error::MulticastUpdateMissingPort(nic_id, nic_kind)
+        })?;
+
+        debug!(
+            self.inner.log,
+            "Validating multicast group configuration for OPTE port";
+            "port_name" => port.name(),
+            "nic_id" => ?nic_id,
+            "groups" => ?multicast_groups,
+        );
+
+        // Validate multicast group configurations
+        for group in multicast_groups {
+            if !group.group_ip.is_multicast() {
+                error!(
+                    self.inner.log,
+                    "Invalid multicast IP address";
+                    "group_ip" => %group.group_ip,
+                    "port_name" => port.name(),
+                );
+                return Err(Error::InvalidPortIpConfig);
+            }
+        }
+
+        // TODO: Configure firewall rules to allow multicast traffic.
+        // Add exceptions in source/dest MAC/L3 addr checking for multicast
+        // addresses matching known groups, only doing cidr-checking on the
+        // multicast destination side.
+
+        info!(
+            self.inner.log,
+            "OPTE port configured for multicast traffic";
+            "port_name" => port.name(),
+            "ipv4_range" => IPV4_MULTICAST_RANGE,
+            "ipv6_range" => IPV6_MULTICAST_RANGE,
+            "multicast_groups" => multicast_groups.len(),
+        );
+
+        // TODO: Configure OPTE port for specific multicast group membership
+        // once APIs are available.
+
+        Ok(())
+    }
+
     pub fn firewall_rules_ensure(
         &self,
         vni: external::Vni,
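
The validation loop in `multicast_groups_ensure` relies on the standard library's `IpAddr::is_multicast`, which covers both 224.0.0.0/4 and ff00::/8. The sketch below isolates that check so it can be run on its own. It is an illustration only: the struct copies the two fields of `MulticastGroupCfg` from the diff, while `first_invalid_group` is a hypothetical helper standing in for the loop, with logging and the `Error` type left out.

```rust
use std::net::IpAddr;

/// Copy of the two fields of `MulticastGroupCfg` from the diff
/// (illustration only; the real type lives in `port_manager.rs`).
#[derive(Debug, Clone, PartialEq)]
struct MulticastGroupCfg {
    group_ip: IpAddr,
    sources: Vec<IpAddr>,
}

/// Hypothetical helper: return the first group whose address is not a
/// multicast address, or `None` when every group passes.
fn first_invalid_group(
    groups: &[MulticastGroupCfg],
) -> Option<&MulticastGroupCfg> {
    groups.iter().find(|g| !g.group_ip.is_multicast())
}

fn main() {
    let groups = vec![
        // IPv4 multicast (inside 224.0.0.0/4), no source filter.
        MulticastGroupCfg {
            group_ip: "224.1.2.3".parse().unwrap(),
            sources: vec![],
        },
        // IPv6 multicast (inside ff00::/8) with one SSM source.
        MulticastGroupCfg {
            group_ip: "ff0e::1".parse().unwrap(),
            sources: vec!["10.0.0.5".parse().unwrap()],
        },
        // Unicast address: this one should be rejected.
        MulticastGroupCfg {
            group_ip: "10.0.0.1".parse().unwrap(),
            sources: vec![],
        },
    ];

    match first_invalid_group(&groups) {
        Some(bad) => println!(
            "rejected {} ({} sources)",
            bad.group_ip,
            bad.sources.len()
        ),
        None => println!("all groups valid"),
    }
}
```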

nexus-config/src/nexus_config.rs

Lines changed: 23 additions & 0 deletions
@@ -439,6 +439,8 @@ pub struct BackgroundTaskConfig {
     pub webhook_deliverator: WebhookDeliveratorConfig,
     /// configuration for SP ereport ingester task
     pub sp_ereport_ingester: SpEreportIngesterConfig,
+    /// configuration for multicast group reconciler task
+    pub multicast_group_reconciler: MulticastGroupReconcilerConfig,
 }
 
 #[serde_as]
@@ -836,6 +838,21 @@ impl Default for SpEreportIngesterConfig {
     }
 }
 
+#[serde_as]
+#[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)]
+pub struct MulticastGroupReconcilerConfig {
+    /// period (in seconds) for periodic activations of the background task that
+    /// reconciles multicast group state with dendrite switch configuration
+    #[serde_as(as = "DurationSeconds<u64>")]
+    pub period_secs: Duration,
+}
+
+impl Default for MulticastGroupReconcilerConfig {
+    fn default() -> Self {
+        Self { period_secs: Duration::from_secs(60) }
+    }
+}
+
 /// Configuration for a nexus server
 #[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
 pub struct PackageConfig {
@@ -1126,6 +1143,7 @@ mod test {
             webhook_deliverator.first_retry_backoff_secs = 45
             webhook_deliverator.second_retry_backoff_secs = 46
             sp_ereport_ingester.period_secs = 47
+            multicast_group_reconciler.period_secs = 60
             [default_region_allocation_strategy]
             type = "random"
             seed = 0
@@ -1359,6 +1377,10 @@ mod test {
                         period_secs: Duration::from_secs(47),
                         disable: false,
                     },
+                    multicast_group_reconciler:
+                        MulticastGroupReconcilerConfig {
+                            period_secs: Duration::from_secs(60),
+                        },
                 },
                 default_region_allocation_strategy:
                     crate::nexus_config::RegionAllocationStrategy::Random {
@@ -1453,6 +1475,7 @@ mod test {
             alert_dispatcher.period_secs = 42
             webhook_deliverator.period_secs = 43
             sp_ereport_ingester.period_secs = 44
+            multicast_group_reconciler.period_secs = 60
 
             [default_region_allocation_strategy]
             type = "random"