
sync_switch_configuration complains about multiple address lot blocks in the initial-infra address block #10182

@jgallagher

Description

Poking around the Nexus logs on dogfood, I noticed this warning from the sync_switch_configuration background task:

02:48:10.775Z WARN 37090b68-aa4c-456d-bf2a-c8631fcff50c (ServerContext): more than one block assigned to infra lot
    background_task = switch_port_config_manager
    blocks = [
        AddressLotBlock { id: 2ada3a23-8cc0-4dad-a5d5-ba030b3a581a, address_lot_id: 59dac3c8-af15-4468-b4c6-f573db150e2e, first_address: V4(Ipv4Network { addr: 172.20.15.21, prefix: 32 }), last_address: V4(Ipv4Network { addr: 172.20.15.22, prefix: 32 }) },
        AddressLotBlock { id: e30373de-88b5-427c-acb7-65f896695e40, address_lot_id: 59dac3c8-af15-4468-b4c6-f573db150e2e, first_address: V6(Ipv6Network { addr: fd00:99::1, prefix: 128 }), last_address: V6(Ipv6Network { addr: fd00:99::ffff, prefix: 128 }) }
    ]
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:1252
    rack_id = de608e01-b8e4-4d93-b972-a7dbed36dd22

This comes from the following bit of code, where we ask the datastore for all the address lot blocks in the initial-infra address lot but expect to find only one (any blocks after the first are ignored):

    let blocks = match self.datastore.address_lot_blocks_by_name(opctx, INFRA_LOT.into()).await {
        Ok(blocks) => blocks,
        Err(e) => {
            error!(log, "error while fetching address lot blocks from db"; "error" => %e);
            continue;
        },
    };
    // currently there should only be one block assigned. If there is more than one
    // block, grab the first one and emit a warning.
    if blocks.len() > 1 {
        warn!(log, "more than one block assigned to infra lot"; "blocks" => ?blocks);
    }
    let (infra_ip_first, infra_ip_last) = match blocks.get(0) {
        Some(AddressLotBlock { first_address, last_address, .. }) => {
            (first_address.ip(), last_address.ip())
        },
        None => {
            error!(log, "no blocks assigned to infra lot");
            continue;
        },
    };

I haven't dug into this, but some casual observations / questions:

  1. There's no order_by in the address_lot_blocks_by_name query, so we're (accidentally?) relying on CRDB to consistently return the same row first.
  2. We only use this to populate the infra_ip_first and infra_ip_last fields in the RackNetworkConfig we push to the bootstore. Does sled-agent act on these fields? If so, what would it do if we happened to pick the IPv6 address lot block instead of the IPv4 one?
  3. Do we expect to have multiple address lot blocks in the initial-infra address lot? (It seems like either the configuration is wrong or the bg task is incorrect and needs to account for multiple blocks.)
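
To illustrate point 1, here is a minimal sketch of picking a block deterministically on the caller side, regardless of the order in which rows come back. It is not the actual fix: `Block` is a hypothetical stand-in for `AddressLotBlock` that uses plain `std::net::IpAddr` fields, `pick_infra_block` is a made-up helper name, and the "prefer IPv4, then lowest address" policy is only an assumption, since whether that is the right choice is exactly what question 2 asks.

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for `AddressLotBlock`, reduced to the two fields
/// this sketch needs. The real type stores network values, not `IpAddr`.
#[derive(Debug, Clone, PartialEq)]
struct Block {
    first_address: IpAddr,
    last_address: IpAddr,
}

/// Pick one block in a way that does not depend on the input order.
///
/// Assumed policy (not confirmed by the existing code): prefer IPv4 blocks,
/// then the lowest first address, then the lowest last address.
fn pick_infra_block(blocks: &[Block]) -> Option<&Block> {
    blocks.iter().min_by_key(|b| {
        // `false < true`, so IPv4 blocks sort ahead of IPv6 blocks.
        (b.first_address.is_ipv6(), b.first_address, b.last_address)
    })
}
```

With the two blocks from the log above, this would return the 172.20.15.21 to 172.20.15.22 block whichever row the database happens to return first. Adding an explicit ordering to the query itself would be the other way to get the same stability.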
