@zeeshanlakhani zeeshanlakhani commented Nov 30, 2025

Egress multicast (instances sending to external receivers) is not in MVP scope, so the MVLAN field for VLAN-tagged upstream traffic is unnecessary. It probably won't be attached to a specific group anyway, and the design is worth revisiting.

Changes:

  • Drop mvlan column from multicast_group table (schema v213)
  • Remove mvlan from Rust structs, SQL queries, and API views
  • Add TODO scope documentation

When egress support lands, VLAN tagging will be reintroduced with proper uplink port configuration.

Notes:

This PR also addresses questions about permission models, object deletion, and error handling for
reserved addresses raised in @askfongjojo's testing Google Doc (default IP pools are covered
in a follow-up, stacked PR).

In thinking through the *Groups* API, permission scopes, and flexibility, @rcgoodfellow mentioned this consideration:

> Do we need an explicit notion of a group object at all? Or can
> instances simply allocate/deallocate group IPs from pools, and there is
> no explicit management of group objects.

With Fleet admins having access control to create pools and link silos to a pool, we arrived at the idea
of replacing the current explicit multicast group CRUD with an implicit lifecycle, where groups are created
upon the first member join and deleted when the last member leaves.
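
A minimal sketch of that implicit lifecycle, assuming an in-memory map in place of the real database tables. The `Groups` type and its methods are hypothetical names for illustration, and the sketch deletes the group inline rather than marking it for removal and letting a reconciler clean up.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical in-memory stand-in for the group and member tables.
#[derive(Default)]
struct Groups {
    // group name -> set of member instance names
    members: BTreeMap<String, BTreeSet<String>>,
}

impl Groups {
    /// Join is idempotent: the group is created on first join and
    /// re-joining is a no-op. Returns true if the group was created.
    fn join(&mut self, group: &str, instance: &str) -> bool {
        let created = !self.members.contains_key(group);
        self.members
            .entry(group.to_string())
            .or_default()
            .insert(instance.to_string());
        created
    }

    /// Leave removes the member; the group is dropped once it is empty.
    /// Returns true if the group was deleted.
    fn leave(&mut self, group: &str, instance: &str) -> bool {
        let Some(set) = self.members.get_mut(group) else {
            return false;
        };
        set.remove(instance);
        if set.is_empty() {
            self.members.remove(group);
            return true;
        }
        false
    }

    fn exists(&self, group: &str) -> bool {
        self.members.contains_key(group)
    }
}
```

With this shape there is no separate create or delete call for callers to get wrong: a group exists exactly as long as it has members.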

**Note**: Most of the PR's changes are test-related due to moving away from the explicit multicast group(s) lifecycle.

Auth Model:
  - Discovery (fleet-scoped):
    - Read/list groups and list members: any authenticated user in the same fleet.
  - Membership (project-scoped):
    - Join/leave requires Instance::Modify on the specific instance.
  - Creation control:
    - Implicit group creation only when the silo is linked to a suitable multicast pool (by name or by explicit IP in that pool).

Behavior:
  - Implicit lifecycle:
    - Create on first join (idempotent); delete when last member leaves (atomic mark-for-removal, reconciler schedules cleanup).
  - Addressing and validation:
    - Implicit allocation from the silo's linked multicast pools.
    - SSM/ASM semantics enforced:
      - IPv4 SSM 232/8 and IPv6 ff3x::/32
  - Error handling:
    - Reserved/invalid multicast ranges rejected at pool/range add time.
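
The SSM ranges listed above can be checked with the standard library alone. This is a sketch, not the validation code in this PR; `is_ssm` is a hypothetical helper name.

```rust
use std::net::IpAddr;

/// True if `addr` falls in the SSM ranges named above:
/// IPv4 232.0.0.0/8 or IPv6 ff3x::/32.
fn is_ssm(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => v4.octets()[0] == 232,
        IpAddr::V6(v6) => {
            let seg = v6.segments();
            // ff3x: first byte 0xff, flags nibble 0x3, any scope nibble.
            // The /32 prefix also requires the second segment to be zero.
            (seg[0] & 0xfff0) == 0xff30 && seg[1] == 0
        }
    }
}
```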

API:
  - Primary flows:
    - Group-centric member management: POST/DELETE /v1/multicast-groups/{group}/members
    - Instance-centric join/leave: PUT/DELETE /v1/instances/{instance}/multicast-groups/{group}
  - Discovery endpoints remain for list/view; there is no explicit group create/update/delete.
  - This is a *breaking* change, but multicast is not yet enabled or available in production

Key changes:
  - Implicit group model; groups exist while they have members.
  - IP pool integration for multicast allocation with silo link gating.
  - Simplified API centered on join/leave flows.
  - Add multicast_ip to the member table for responses.
  - For consistency, move to `Instant` type over `SystemTime` for mcast-related caches
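
To illustrate the `Instant` choice for caches: `Instant` is monotonic, so age checks are not affected by wall-clock adjustments the way `SystemTime` arithmetic can be. The `TtlCache` type below is a hypothetical sketch, not the cache used in this PR.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical TTL cache; names are illustrative only.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (V, Instant)>,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: V) {
        self.entries.insert(key.to_string(), (value, Instant::now()));
    }

    /// Returns the value only if it is younger than the TTL.
    fn get(&self, key: &str) -> Option<V> {
        let (value, stored_at) = self.entries.get(key)?;
        // `elapsed` is measured on the monotonic clock.
        (stored_at.elapsed() < self.ttl).then(|| value.clone())
    }
}
```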

Follow-ups (stacked PRs)
  - [ ] Remove MVLAN from group data model.
  - [ ] Default IP pool support (IPv4/IPv6, unicast/multicast).
  - [ ] Dendrite: use omicron-common constants for validation.
@zeeshanlakhani zeeshanlakhani force-pushed the zl/drop-mvlan-from-group branch from 05ed654 to 68a0df3 Compare December 1, 2025 07:17
Introduce API version `VERSION_MULTICAST_IMPLICIT_LIFECYCLE_UPDATES`
(v2025120500) to support the transition from explicit to implicit
multicast group lifecycle management.

Changes in new API version:
  - Groups are created implicitly when first member joins
  - Groups are deleted implicitly when last member leaves
  - Instance create/update accept `MulticastGroupIdentifier` (name, UUID,
    or multicast IP address) instead of just `NameOrId`
  - MulticastGroupMemberAdd now has optional `source_ips` for SSM

Backward compatibility (v20251120):
  - Add `v20251120` module with compatibility types using `NameOrId`
  - Explicit group create/update/delete endpoints marked deprecated
  - Proper base64 validation for user_data via shared UserData serde helper

Also includes:
  - Add version_policy to techport server for omdb compatibility
Includes:
- Remove GLOP (233/8), admin-scoped (239/8), and specific reserved
  address (NTP, Cisco Auto-RP, PTP) restrictions from IP pool validation
- Only link-local multicast (224.0.0.0/24) is now rejected (not routable)
- Add ASM pool fallback when join-by-name with source_ips finds no SSM
  pool linked
- Allow source filtering on ASM addresses (IGMPv3/MLDv2 supports this)
- SSM addresses still require sources per RFC 4607
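
The relaxed rules above could look roughly like the following. This is a sketch under stated assumptions: the function names are hypothetical, only the IPv4 link-local range named above is rejected, and the pool-fallback logic is not modeled.

```rust
use std::net::IpAddr;

/// SSM ranges: IPv4 232.0.0.0/8 or IPv6 ff3x::/32.
fn is_ssm(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => v4.octets()[0] == 232,
        IpAddr::V6(v6) => {
            let seg = v6.segments();
            (seg[0] & 0xfff0) == 0xff30 && seg[1] == 0
        }
    }
}

/// Pool-range check: GLOP, admin-scoped, and protocol-specific addresses
/// pass; only IPv4 link-local multicast (224.0.0.0/24) is refused.
fn pool_addr_allowed(addr: IpAddr) -> Result<(), String> {
    if !addr.is_multicast() {
        return Err(format!("{addr} is not multicast"));
    }
    if let IpAddr::V4(v4) = addr {
        let o = v4.octets();
        if o[0] == 224 && o[1] == 0 && o[2] == 0 {
            return Err(format!("{addr} is link-local multicast"));
        }
    }
    Ok(())
}

/// Join check: SSM needs at least one source; ASM may have none or some.
fn sources_ok(group: IpAddr, sources: &[IpAddr]) -> Result<(), String> {
    if is_ssm(group) && sources.is_empty() {
        return Err(format!("{group} is SSM and requires sources"));
    }
    Ok(())
}
```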

The previous restrictions were overly conservative. Customers may have
legitimate use cases for GLOP (AS-based allocations), admin-scoped
(organization-local multicast), and protocol-specific addresses.
@zeeshanlakhani zeeshanlakhani self-assigned this Dec 10, 2025
This update moves source IPs from group to member for per-member source filtering.
Each member can now subscribe to different sources within the same
multicast group, i.e., [(S, G)]. The group's `source_ips` API field now shows the union of
all member source IPs.
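
As an illustration of the group-level view, the union over per-member sources can be computed as below. `Member` and `group_source_ips` are hypothetical names, and the sorted output order is a choice of this sketch, not something the PR specifies.

```rust
use std::collections::BTreeSet;
use std::net::IpAddr;

/// Hypothetical member record carrying its own source filter.
struct Member {
    source_ips: Vec<IpAddr>,
}

/// Group-level `source_ips` as the de-duplicated, sorted union over members.
fn group_source_ips(members: &[Member]) -> Vec<IpAddr> {
    members
        .iter()
        .flat_map(|m| m.source_ips.iter().copied())
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}
```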

Includes:
  - Add source_ips column to multicast_group_member table
  - Add underlay_salt for XOR-fold collision avoidance when mapping
    external multicast IPs to admin-local IPv6 underlay addresses
    - Document the mapping algorithm and add more tests
  - Schema migration rename: multicast-implicit-lifecycle (v213)
  - Update instance-centric join API to accept source_ips
  - Remove deprecated group-centric member add/remove endpoints
  - Clean up redundant comments and fix typos
…rce TODO wrt Dendrite

Includes:
  - Add shared `put_upsert` helper for idempotent PUT requests that return 201 Created
  - Add pool_selection.rs tests for SSM/ASM fallback behavior
  - ASM sources TODO/workaround:
    - Only send sources to DPD for SSM groups (232/8 IPv4, ff3x:: IPv6)
    - ASM groups get `None` for sources, meaning "any source allowed"
    - Temporary fix until dendrite accepts ASM source filtering (upcoming PR)
  - Schema
    - Bump version 213.0.0 to 214.0.0 (post-merge)
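
The ASM workaround described above amounts to a small mapping from (group, member sources) to what the dataplane is given. A sketch, assuming a hypothetical `dataplane_sources` helper rather than the actual DPD client call:

```rust
use std::net::IpAddr;

/// SSM ranges: IPv4 232.0.0.0/8 or IPv6 ff3x::/32.
fn is_ssm(addr: IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => v4.octets()[0] == 232,
        IpAddr::V6(v6) => {
            let seg = v6.segments();
            (seg[0] & 0xfff0) == 0xff30 && seg[1] == 0
        }
    }
}

/// Sources to hand to the dataplane: `Some(list)` for SSM groups,
/// `None` ("any source allowed") for ASM groups.
fn dataplane_sources(group: IpAddr, sources: &[IpAddr]) -> Option<Vec<IpAddr>> {
    is_ssm(group).then(|| sources.to_vec())
}
```
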
@zeeshanlakhani zeeshanlakhani force-pushed the zl/drop-mvlan-from-group branch from 31c9a69 to bc20284 Compare December 17, 2025 03:59
@zeeshanlakhani zeeshanlakhani force-pushed the zl/drop-mvlan-from-group branch from f86fc3f to 5ca441d Compare December 17, 2025 10:43

zeeshanlakhani commented Dec 17, 2025

This PR should be good now after rummaging through conflicts on sources post-merge of #9450.

zeeshanlakhani added a commit that referenced this pull request Dec 23, 2025
Previously, each silo could only have one default IP pool. This change
allows one default pool per (pool_type, ip_version) combination, enabling
silos to have separate defaults for:

  - Unicast IPv4
  - Unicast IPv6
  - Multicast IPv4
  - Multicast IPv6

This work previously branched off #9451, but is now off `main`,
involving changes that have to do with the mcast lifecycle changes.

Includes:

  - Each default can now be set or unset and demoted independently.
    Unsetting the unicast IPv4 default does not affect the multicast IPv4
    default, for example.
  - Add `pool_type` and `ip_version` columns to `ip_pool_resource`
    (denormalized from parent `ip_pool` for unique index)
  - Replace unique index with partial index on (resource_id, pool_type,
    ip_version) WHERE is_default = true
  - Rename `IpPoolResourceLink` to `IncompleteIpPoolResource` to reflect
    that pool_type/ip_version are actually populated by the linking query
  - Add `ip_version` field to API params for default pool disambiguation
  - API versioning for backwards compatibility with older clients
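
The "one default per (pool_type, ip_version)" rule can be pictured as a map keyed on that pair, which is roughly what the partial unique index enforces per silo. The types below are hypothetical, and rejecting a second default (rather than demoting the existing one) is an assumption of this sketch.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum PoolType { Unicast, Multicast }

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum IpVersion { V4, V6 }

/// Hypothetical per-silo view of default pools: at most one pool name
/// per (pool_type, ip_version) slot.
#[derive(Default)]
struct SiloDefaults {
    slots: HashMap<(PoolType, IpVersion), String>,
}

impl SiloDefaults {
    /// Fails if the slot already has a default, as a unique index would.
    fn set_default(&mut self, t: PoolType, v: IpVersion, pool: &str) -> Result<(), String> {
        if let Some(existing) = self.slots.get(&(t, v)) {
            return Err(format!("{existing} is already the default for {t:?}/{v:?}"));
        }
        self.slots.insert((t, v), pool.to_string());
        Ok(())
    }

    /// Unsetting one slot leaves the other three untouched.
    fn unset_default(&mut self, t: PoolType, v: IpVersion) -> Option<String> {
        self.slots.remove(&(t, v))
    }

    fn default_for(&self, t: PoolType, v: IpVersion) -> Option<&str> {
        self.slots.get(&(t, v)).map(String::as_str)
    }
}
```
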
Base automatically changed from zl/mcast-implicit-lifecycle to main January 10, 2026 06:52