Open
Description
When an SR-SIM CPM container starts up, the following happens (simplified):
1. A Python script reads the mgmt IP from the Linux eth0 interface.
2. The BOF config is populated with the mgmt IP.
3. The SR OS process is started, and the IP is deleted from the Linux eth0 interface.
The problem can occur when a node includes two CPM containers (Dual-CPM). Both CPMs share the same eth0 in the same network namespace, so if one CPM container reaches step 3 before the other CPM finishes step 1, the slower CPM can't fetch the mgmt IP address, which results in a broken node.
The issue is more likely to happen on a loaded server, where many containers start concurrently.
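
To make the ordering problem concrete, here is a toy Python simulation of the three steps. It does not touch real interfaces and is not the actual startup script: `SharedEth0`, `start_cpm`, the CPM names and the IP address are all made up for illustration. The barrier variant is only one possible mitigation under the assumption that both CPMs can synchronize before step 3; it is a sketch, not a proposed patch.

```python
import threading


class SharedEth0:
    """Toy stand-in for the eth0 interface shared by both CPM containers."""

    def __init__(self, ip):
        self._ip = ip
        self._lock = threading.Lock()

    def read_ip(self):
        with self._lock:
            return self._ip

    def delete_ip(self):
        with self._lock:
            self._ip = None


def start_cpm(eth0, results, name, all_read=None):
    # Step 1: read the mgmt IP from eth0.
    ip = eth0.read_ip()
    # Step 2: populate the BOF config (here we only record the value).
    results[name] = ip
    # Hypothetical mitigation: wait until every CPM has finished step 1.
    if all_read is not None:
        all_read.wait(timeout=5)
    # Step 3: SR OS starts and the IP is removed from eth0.
    eth0.delete_ip()


def run_unsynchronized():
    """Worst case: the first CPM reaches step 3 before the second starts step 1."""
    eth0 = SharedEth0("172.20.20.10/24")  # made-up example address
    results = {}
    start_cpm(eth0, results, "cpm-a")
    start_cpm(eth0, results, "cpm-b")
    return results


def run_with_barrier():
    """Both CPMs must complete step 1 before either may run step 3."""
    eth0 = SharedEth0("172.20.20.10/24")  # made-up example address
    results = {}
    barrier = threading.Barrier(2)
    threads = [
        threading.Thread(target=start_cpm, args=(eth0, results, name, barrier))
        for name in ("cpm-a", "cpm-b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("unsynchronized:", run_unsynchronized())
    print("with barrier:  ", run_with_barrier())
```

In the unsynchronized run the second CPM reads no address, which mirrors the broken node described above. With the barrier, the address is deleted only after both CPMs have read it. In the real containers the same ordering guarantee would have to come from something both containers can see, which is outside what this sketch models.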