-
Notifications
You must be signed in to change notification settings - Fork 195
Fix port SI setting per speed during speed change #649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix port SI setting per speed during speed change #649
Conversation
This change fixes the issue of port media settings not getting applied when speed change command is issued in CLI. Signed-off-by: Pavan Naregundi <[email protected]>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Pavan Naregundi <[email protected]>
4335943 to
60f975f
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Pavan Naregundi <[email protected]>
60f975f to
4ada69a
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@mihirpat1 @prgeor @tshalvi Can you please review this PR. |
|
@bobby-nexthop would you also review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavannaregundi How are we ensuring NPU_SI_SETTINGS_SYNC_STATUS_KEY is set properly? @mihirpat1 FYI
Also, does the CmisManager reinitialize the port on speed change only after ensuring the following? Do you have the log to confirm?
- The host serdes has set the right media settings
- A valid host signal as per the new speed change (with newly notified media settings) is being sent to the module - This is ensured via
host_tx_readybetween swss and xcvrd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prgeor Please find my answers inline. Also please refer to the logs
media_settings_with_speed_change_log.txt.txt
These logs are taken with following command,
# config interface speed Ethernet256 400000
# config interface speed Ethernet280 400000
@pavannaregundi How are we ensuring
NPU_SI_SETTINGS_SYNC_STATUS_KEYis set properly? @mihirpat1 FYI
NPU_SI_SETTINGS_SYNC_STATUS_KEY is always set to NPU_SI_SETTINGS_DEFAULT_VALUE in on_update_logical_port() before calling notify_media_setting(), to make sure serdes values are newly calculated as per the current port speed. Finally it is set to NPU_SI_SETTINGS_NOTIFIED_VALUE by notify_media_setting() which indicates the serdes values are notified. State db dump of the ports under tests are attached in logs file to show value is NPU_SI_SETTINGS_NOTIFIED_VALUE.
Also, does the CmisManager reinitialize the port on speed change only after ensuring the following? Do you have the log to confirm?
- The host serdes has set the right media settings
In attached log file, sairedis.logs shows that old serdes object is removed and new values are applied to syncd SAI SDK. Media setting values are same as the one shown in the issue sonic-net/sonic-buildimage#23480
- A valid host signal as per the new speed change (with newly notified media settings) is being sent to the module - This is ensured via
host_tx_readybetween swss and xcvrd
In attached log file, sairedis.log/syslog show 'SAI_PORT_HOST_TX_READY_STATUS_READY' and corresponding 'host_tx_ready' in swss#orchagent. Also indicates CMIS reinit after host_tx_ready set to 'true'.
|
|
||
| fvp['port_name'] = key | ||
| fvp['asic_id'] = asic_context[port_tbl] | ||
| filter = ['speed'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern here:
Consider the following settings:
COPPER100 {
...
}
Copper50 {
}
If the breakout goes from Ethernet0 400G-8 (using Copper50 tuning values) to 2x400G-4 (Copper100 values)
I think that event will be missed based on speed alone, the number of lanes should also be a factor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bobby-nexthop Thank you for review. Number of lane change only happens due to port breakout and breakout triggers a port remove and add. This scenario already works as, PORT ADD and PORT REMOVE is taken care with existing code base.
We do see a error w.r.t NPU_SI_SETTINGS_SYNC_STATUS and this change is already there for review with PR #622
|
@prgeor @bobby-nexthop Hope I have answered your queries. Let me know if you have any more comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavannaregundi I would suggest to use a port re-creation approach so that Xcvrd can delete the OLD port and create the new PORT even if only port speed attribute is changed. There are other attributes like lanes, fec etc which may be different with new speed. So a cleaner approach is to delete the old port from config DB. This way, when the new PORT is created, Xcvrd should publish the SI settings accordingly as here:-
| media_settings_parser.notify_media_setting(port_change_event.port_name, transceiver_dict, self.xcvr_table_helper, self.port_mapping) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prgeor This change is only about speed change - "config interface speed interface speed". Where current code is treating speed change as update to config DB. This is not a delete and create operation. Port delete and create in config db for speed change will be much bigger change.
https://github.com/sonic-net/SONiC/blob/master/doc/port-si/Port_SI_Per_Speed.md HLD has this statement in this regard - "After applying the port SI per speed enhancements, it will also be carried out upon port speed change events: Whenever a port speed change is detected by listening to CONFIG_DB, Notify-Media-Settings-Process will be called to send the most applicable SI values in the JSON to SAI. Port speed changes require invoking the Notify-Media-Settings-Process becuase after such a change, the lane_speed_key used for lookup in the JSON changes accordingly, and the previously configured SI values in the ASIC are no longer relevant."
However, change required to achieve this got missed in code.
Number of lane change does happen only if there is a breakout and it will automatically trigger a port delete and port add even for CONFIG_DB. So it will end up with delete and add for xcvrd too. (SAI_PORT_ATTR_HW_LANE_LIST in sai is a create only attribute.)
FEC changes is not a parameter for media settings. It should not invoke any change in media settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavannaregundi The HLD can be updated to what I mentioned. Dont want to invent the wheel for each of the following scenarios. I would like to fix it one for all.
- speed is changed but not lanes
- speed is changed along with fec and other port attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavannaregundi I don't know what you mean by bigger change. During breakout of the port speed, we delete the old port and create the new port anyways.
This change fixes the issue of port media settings not getting applied when speed change command is issued in CLI.
Description
Fix the issue of port SI settings not getting applied during the speed change with a CLI executions.
Motivation and Context
fixes sonic-net/sonic-buildimage#23480
This change will also bring the code inline with HLD https://github.com/sonic-net/SONiC/blob/master/doc/port-si/Port_SI_Per_Speed.md
HOW:
How Has This Been Tested?
Verify the syslog and redis db to check applied SI settings after a speed change. Steps and log of verification is attached
media_settings_with_speed_change.txt.txt
Also verified xcvrd kill and pmon docker restart.
Additional Information (Optional)
Which release branch to backport (provide reason below if selected)