Description
Is it platform specific
generic
Importance or Severity
Critical
Description of the bug
CPU consumption is higher on Trixie, which already appears to affect fast-reboot performance on some systems, as described in the bug "[Fast-reboot] Control plane time disruption is exceeds 90 seconds with Trixie".
A further consequence is that test_advanced_reboot.py::test_fast_reboot fails because the number of flooded packets exceeds 250.
Packets are flooded during a short time window, from the moment the first Vlan member is added until the Vlan's RIF is ready. Here is a comparison of the timing of these operations before and after moving to Trixie:
Before Trixie:
- First Vlan1000 member added - 15:30:55.260180 sonic NOTICE swss#orchagent: :- addVlanMember: Add member Ethernet0 to VLAN Vlan1000 vid:1000 pid1000000000017
- Vlan1000 RIF created - 15:30:55.695734 sonic NOTICE syncd#SDK: [SAI_RIF.NOTICE] ... Created ROUTER_INTERFACE [OID:0x24100000006] [Type:DEFAULT, ID:9]
After Trixie:
- First Vlan1000 member added - 14:07:11.603139 sonic NOTICE swss#orchagent: :- addVlanMember: Add member Ethernet114 to VLAN Vlan1000 vid:1000 pid1000000000048
- Vlan1000 RIF created - 14:07:12.402659 sonic NOTICE syncd#SDK: [SAI_RIF.NOTICE] ... Created ROUTER_INTERFACE [OID:0x24100000006] [Type:DEFAULT, ID:9]
The window grows from ~0.44 seconds before Trixie to ~0.80 seconds after, a difference of ~0.4 seconds, which results in ~550 flooded packets.
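For reference, the flooding window can be recomputed from the four timestamps quoted above. This is only a minimal sketch: the helper name `window_seconds` is made up for illustration, and it assumes both log lines of a pair fall on the same day, since the syslog excerpts carry no date.

```python
from datetime import datetime

# Time-of-day format used in the syslog excerpts above (no date component).
FMT = "%H:%M:%S.%f"


def window_seconds(member_added: str, rif_created: str) -> float:
    """Seconds between the first Vlan member add and the Vlan RIF creation.

    Hypothetical helper; assumes both timestamps are from the same day.
    """
    start = datetime.strptime(member_added, FMT)
    end = datetime.strptime(rif_created, FMT)
    return (end - start).total_seconds()


# Timestamps copied from the log lines in this report.
before = window_seconds("15:30:55.260180", "15:30:55.695734")
after = window_seconds("14:07:11.603139", "14:07:12.402659")

print(f"before Trixie: {before:.3f} s")
print(f"after Trixie:  {after:.3f} s")
print(f"increase:      {after - before:.3f} s")
```

With these values the window is about 0.436 s before Trixie and about 0.800 s after, an increase of roughly 0.36 s, consistent with the ~0.4 seconds quoted above.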
Steps to Reproduce
Run platform_tests/test_advanced_reboot.py::test_fast_reboot on Trixie-based SONiC.
Actual Behavior and Expected Behavior
Actual:
FAILED:dut:Unexpected count of sent packets available in pcap file. Could be issue with DUT flooding for original packets which was sent to DUT, flooded count is: 566
Expected: Test should pass without errors
Relevant log output
Output of show version, show techsupport
Attach files (if any)
No response