Bug: [CMIS][Port-Breakout] Subport speed change is affecting the other subport in the same breakout group. #23006

@Keshavg-marvell

Is it platform specific

generic

Importance or Severity

Critical

Description of the bug

We have an 8-lane port broken out into 2x400G (R4), as shown below. When both subports run at the same speed, CMIS
programming happens properly and the interfaces come up:

#show interface status Ethernet288-292
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin                           Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  -----------------------------  ----------
Ethernet288  393,394,395,396     400G   9100     rs   etp37a  routed      up       up  OSFP 8X Pluggable Transceiver         N/A
Ethernet292  397,398,399,400     400G   9100     rs   etp37b  routed      up       up  OSFP 8X Pluggable Transceiver         N/A

However, when we issue a speed change command on Ethernet288, it brings down the other subports belonging to the same
port breakout group.

The issue seems to be caused by decommission_all_datapaths, which is called in the CMIS state machine and
deinitializes all datapath instances instead of deinitializing only the lanes associated with the subport for which
the speed (CMIS AppSel) change is configured.

As per the CMIS 5.2 specification, section 8.9.1:

Multiple Data Paths are mutually independent. They may be initialized or deinitialized at the same time or at different times.
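
To make the independence of the two data paths concrete, here is a minimal sketch of how a per-subport host lane mask could be derived for the 2x400G (R4) breakout described above. The helper name `subport_host_lanemask` and its 1-based subport index are assumptions made for illustration, not part of xcvrd. The mask for Ethernet288 matches the `lanemask=0xf` seen in the log; the mask for the second subport is computed, not taken from a log.

```python
def subport_host_lanemask(subport, lanes_per_subport):
    """Return the bitmask of host lanes owned by a 1-based subport index.

    Hypothetical helper: bit N of the result corresponds to host lane N+1.
    """
    if subport < 1 or lanes_per_subport < 1:
        raise ValueError("subport and lanes_per_subport must be >= 1")
    base = (1 << lanes_per_subport) - 1  # e.g. 0b1111 for 4 lanes
    return base << ((subport - 1) * lanes_per_subport)


print(hex(subport_host_lanemask(1, 4)))  # 0xf  (first subport, lanes 1-4)
print(hex(subport_host_lanemask(2, 4)))  # 0xf0 (second subport, lanes 5-8)
```

Any deinit or AppSel write that uses only the first mask leaves lanes 5-8 untouched, which is what the spec text implies should be possible.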

Relevant code:

CMIS state machine:

    if True == self.is_appl_reconfigure_required(api, appl):
        self.log_notice("{}: Decommissioning all lanes/datapaths to default AppSel=0".format(lport))
        if True != api.decommission_all_datapaths():
            self.log_notice("{}: Failed to default to AppSel=0".format(lport))
            self.force_cmis_reinit(lport, retries + 1)
            continue

def decommission_all_datapaths(self):
     '''
         Return True if all datapaths are successfully de-commissioned, False otherwise
     '''
     # De-init all datpaths
     self.set_datapath_deinit((1 << self.NUM_CHANNELS) - 1)
     # Decommision all lanes by apply AppSel=0
     self.set_application(((1 << self.NUM_CHANNELS) - 1), 0, 0)
     # Start with AppSel=0 i.e undo any default AppSel
     self.scs_apply_datapath_init((1 << self.NUM_CHANNELS) - 1)

     dp_state = self.get_datapath_state()
     config_state = self.get_config_datapath_hostlane_status()

     for lane in range(self.NUM_CHANNELS):
         name = "DP{}State".format(lane + 1)
         if dp_state[name] != 'DataPathDeactivated':
             return False

         name = "ConfigStatusLane{}".format(lane + 1)
         if config_state[name] != 'ConfigSuccess':
             return False

     return True
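
One possible direction, sketched under assumptions: a lane-scoped variant that takes the subport's host lane mask and only touches and verifies those lanes. The function name `decommission_datapaths`, its `host_lanemask` parameter, and the `FakeCmisApi` stand-in are hypothetical and exist only to make the sketch runnable; the API method names are the ones used in decommission_all_datapaths above. This is not the upstream fix.

```python
class FakeCmisApi:
    """Stand-in for the real CMIS API, used only to exercise the sketch."""

    def __init__(self, num_channels=8):
        self.dp = ['DataPathActivated'] * num_channels
        self.cfg = ['ConfigSuccess'] * num_channels

    def set_datapath_deinit(self, mask):
        # Deactivate only the lanes selected by the mask.
        for lane in range(len(self.dp)):
            if (mask >> lane) & 1:
                self.dp[lane] = 'DataPathDeactivated'

    def set_application(self, mask, appl, ec):
        pass  # no-op in this stand-in

    def scs_apply_datapath_init(self, mask):
        pass  # no-op in this stand-in

    def get_datapath_state(self):
        return {"DP{}State".format(i + 1): s for i, s in enumerate(self.dp)}

    def get_config_datapath_hostlane_status(self):
        return {"ConfigStatusLane{}".format(i + 1): s
                for i, s in enumerate(self.cfg)}


def decommission_datapaths(api, host_lanemask, num_channels=8):
    """Hypothetical lane-scoped variant of decommission_all_datapaths()."""
    api.set_datapath_deinit(host_lanemask)
    api.set_application(host_lanemask, 0, 0)
    api.scs_apply_datapath_init(host_lanemask)

    dp_state = api.get_datapath_state()
    config_state = api.get_config_datapath_hostlane_status()
    for lane in range(num_channels):
        if not (host_lanemask >> lane) & 1:
            continue  # lane belongs to another subport; do not check it
        if dp_state["DP{}State".format(lane + 1)] != 'DataPathDeactivated':
            return False
        if config_state["ConfigStatusLane{}".format(lane + 1)] != 'ConfigSuccess':
            return False
    return True


api = FakeCmisApi()
print(decommission_datapaths(api, 0x0F))       # True
print(api.get_datapath_state()["DP5State"])    # DataPathActivated
```

With the stand-in, lanes 5-8 stay in DataPathActivated after the first subport is decommissioned, which is the behavior the issue asks for.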

The other issue is that is_appl_reconfigure_required should be called with the host lanes associated with the subport
being reconfigured, and should not check against datapath lanes belonging to other subports:

def is_appl_reconfigure_required(self, api, app_new): 
        for lane in range(self.CMIS_MAX_HOST_LANES):
            app_cur = api.get_application(lane)
            if app_cur != 0 and app_cur != app_new:
                return True
        return False
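
A lane-aware version of this check could look roughly like the sketch below. The extra `host_lanemask` parameter and the `FakeApi` stand-in are assumptions for illustration; only `get_application(lane)` and the comparison logic come from the snippet above.

```python
CMIS_MAX_HOST_LANES = 8


class FakeApi:
    """Stand-in exposing only get_application(); not the real CMIS API."""

    def __init__(self, active_apps):
        self.active_apps = active_apps  # one AppSel code per host lane

    def get_application(self, lane):
        return self.active_apps[lane]


def is_appl_reconfigure_required(api, app_new, host_lanemask):
    """Hypothetical variant that only inspects lanes set in host_lanemask."""
    for lane in range(CMIS_MAX_HOST_LANES):
        if not (host_lanemask >> lane) & 1:
            continue  # skip lanes owned by other subports
        app_cur = api.get_application(lane)
        if app_cur != 0 and app_cur != app_new:
            return True
    return False


# Lanes 1-4 already run AppSel 4; lanes 5-8 (other subport) run AppSel 8.
api = FakeApi([4, 4, 4, 4, 8, 8, 8, 8])
print(is_appl_reconfigure_required(api, 4, 0x0F))  # False
print(is_appl_reconfigure_required(api, 4, 0xFF))  # True
```

The second call mimics the current all-lanes behavior: the neighbouring subport's AppSel 8 triggers a reconfigure even though the lanes of the subport being changed already match.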

Steps to Reproduce

  1. The port is broken out into two subports with aliases etp37a and etp37b:
# show interface status Ethernet288-292
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin                           Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  -----------------------------  ----------
Ethernet288  393,394,395,396     400G   9100     rs   etp37a  routed      up       up  OSFP 8X Pluggable Transceiver         N/A
Ethernet292  397,398,399,400     400G   9100     rs   etp37b  routed      up       up  OSFP 8X Pluggable Transceiver         N/A
$ sudo sfputil show eeprom -p Ethernet288
Ethernet288: SFP EEPROM detected
        Active App Selection Host Lane 1: 8
        Active App Selection Host Lane 2: 8
        Active App Selection Host Lane 3: 8
        Active App Selection Host Lane 4: 8
        Active App Selection Host Lane 5: 8
        Active App Selection Host Lane 6: 8
        Active App Selection Host Lane 7: 8
        Active App Selection Host Lane 8: 8
        Application Advertisement: 100GAUI-4 C2M (Annex 135E) - Host Assign (0x11) - Active Cable assembly with BER < 10^-6 - Media Assign (0x11)
                                   100GAUI-2 C2M (Annex 135G) - Host Assign (0x55) - Active Cable assembly with BER < 10^-6 - Media Assign (0x55)
                                   200GAUI-8 C2M (Annex 120C) - Host Assign (0x1) - Active Cable assembly with BER < 10^-6 - Media Assign (0x1)
                                   200GAUI-4 C2M (Annex 120E) - Host Assign (0x11) - Active Cable assembly with BER < 10^-6 - Media Assign (0x11)
                                   200GAUI-2-S C2M (Annex 120G) - Host Assign (0x55) - Active Cable assembly with BER < 10^-6 - Media Assign (0x55)
                                   200GAUI-2-L C2M (Annex 120G) - Host Assign (0x55) - Active Cable assembly with BER < 10^-6 - Media Assign (0x55)
                                   400GAUI-8 C2M (Annex 120E) - Host Assign (0x1) - Active Cable assembly with BER < 10^-6 - Media Assign (0x1)
                                   400GAUI-4-S C2M (Annex 120G) - Host Assign (0x11) - Active Cable assembly with BER < 10^-6 - Media Assign (0x11)
                                   800G S C2M (placeholder) - Host Assign (0x1) - Active Cable assembly with BER < 10^-6 - Media Assign (0x1)
                                   800G L C2M (placeholder) - Host Assign (0x1) - Active Cable assembly with BER < 10^-6 - Media Assign (0x1)
                                   50GAUI-2 C2M (Annex 135E) - Host Assign (0x55) - Active Cable assembly with BER < 10^-6 - Media Assign (0x55)
  2. Change the speed of one subport (Ethernet288); this brings down the other subport as well

# config interface speed Ethernet288 200000

# show interface status Ethernet288-292
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin                           Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  -----------------------------  ----------
Ethernet288  393,394,395,396     200G   9100     rs   etp37a  routed    down       up  OSFP 8X Pluggable Transceiver         N/A
Ethernet292  397,398,399,400     400G   9100     rs   etp37b  routed    down       up  OSFP 8X Pluggable Transceiver         N/A <<<<< this port should not be affected
#

Actual Behavior and Expected Behavior

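Actual: changing the speed of Ethernet288 also brings down Ethernet292, the other subport in the same breakout group.

Expected: other subports of the port-breakout group should not be affected during a CMIS operation on one subport.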

Relevant log output

2025 Jun 17 09:29:20.389655 sonic NOTICE pmon#xcvrd[29]: CMIS: Ethernet288: 200G, lanemask=0xf, CMIS state=INSERTED, Module state=ModuleReady, DP state={'DP1State': 'DataPathActivated', 'DP2State': 'DataPathActivated', 'DP3State': 'DataPathActivated', 'DP4State': 'DataPathActivated', 'DP5State': 'DataPathActivated', 'DP6State': 'DataPathActivated', 'DP7State': 'DataPathActivated', 'DP8State': 'DataPathActivated'}, appl 8 host_lane_count 4 retries=0
2025 Jun 17 09:29:20.416884 sonic NOTICE pmon#xcvrd[29]: CMIS: Ethernet288: Setting appl=4
2025 Jun 17 09:29:20.447987 sonic NOTICE pmon#xcvrd[29]: CMIS: Ethernet288: Setting host_lanemask=0xf
2025 Jun 17 09:29:20.502287 sonic NOTICE pmon#xcvrd[29]: CMIS: Ethernet288: Setting media_lanemask=0xf
2025 Jun 17 09:29:20.509685 sonic NOTICE pmon#xcvrd[29]: CMIS: Ethernet288: Decommissioning all lanes/datapaths to default AppSel=0
2025 Jun 17 09:29:20.567196 sonic NOTICE pmon#xcvrd[29]: CMIS: Ethernet288: Failed to default to AppSel=0

Output of show version, show techsupport

$ show version

SONiC Software Version: SONiC.202411.268-dirty-20250613.173406
SONiC OS Version: 12
Distribution: Debian 12.9
Kernel: 6.1.0-29-2-amd64
Build commit: 22f6c9a63
Build date: Fri Jun 13 18:35:52 UTC 2025
Built by: marvell@cpss-testvm1

Attach files (if any)

No response
