BROADCOM_LEGACY_SAI_COMPAT: Fix sai_get_stats_ext crash on TH1 legacy image #26014
lipxu wants to merge 2 commits into sonic-net:master
…s_st_capability at runtime Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 to sai.profile for Arista 7060cx (BCM56960/Tomahawk-1) to disable sai_query_stats_st_capability at runtime. This prevents a SIGSEGV in brcm_sai_st_pd_ctr_cap_list_get when running the legacy SAI binary which does not initialize p_pdapi_st->vtable for TH1. The runtime guard is implemented in sonic-sairedis PR sonic-net#1788. Signed-off-by: Liping Xu <xuliping@microsoft.com>
… image Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to sai.profile for Arista 7060cx (BCM56960/Tomahawk-1) to disable sai_get_stats_ext for switch objects. The legacy SAI binary crashes when FlexCounter calls sai_get_stats_ext on switch objects during polling. The runtime guard is implemented in sonic-sairedis PR sonic-net#1789. Signed-off-by: Liping Xu <xuliping@microsoft.com>
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
Pull request overview
This PR updates Arista 7060CX (Tomahawk-1 / broadcom-legacy) HWSKU sai.profile files to add runtime feature flags that allow syncd/FlexCounter to avoid calling unsupported legacy-SAI stats APIs which can crash syncd.
Changes:
- Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to disable sai_get_stats_ext usage for switch-object counters on TH1 legacy SAI.
- Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 to disable streaming-telemetry stats capability querying on TH1 legacy SAI.
- Add inline comments documenting the platform limitations and why the flags are set.
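The intended effect of the two flags can be sketched as follows. This is a minimal illustration only, not the actual guard from the sonic-sairedis PRs: `parse_sai_profile` and `feature_enabled` are hypothetical helper names, and treating a missing key as "supported" is an assumption about the default behavior.

```python
def parse_sai_profile(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    profile = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            profile[key.strip()] = value.strip()
    return profile


def feature_enabled(profile, key, default=True):
    """Return False only when the key is explicitly set to "0".

    Assumption: an absent key leaves the feature enabled, so platforms
    that do not set the flag keep their current behavior.
    """
    value = profile.get(key)
    if value is None:
        return default
    return value != "0"


# Sample profile content mirroring the lines added by this PR.
SAMPLE = """\
# BROADCOM_LEGACY_SAI_COMPAT: TH1 (BCM56960) legacy SAI limitations
SAI_STATS_EXT_SWITCH_SUPPORTED=0
SAI_STATS_ST_CAPABILITY_SUPPORTED=0
"""

profile = parse_sai_profile(SAMPLE)

# Hypothetical decisions a guard could derive from the profile.
use_sai_stats_ext = feature_enabled(profile, "SAI_STATS_EXT_SWITCH_SUPPORTED")
query_st_capability = feature_enabled(profile, "SAI_STATS_ST_CAPABILITY_SUPPORTED")
```

With the sample profile both decisions come out false, so a guard built this way would skip the two crashing calls on TH1 while leaving other platforms untouched.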
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profile | Add TH1 legacy runtime-disable flags for stats-ext (switch) and ST capability |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profile | Same flags for T96C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profile | Same flags for Q24C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profile | Same flags for D48C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profile | Same flags for C32 variant |
| # BROADCOM_LEGACY_SAI_COMPAT: sai_get_stats_ext is not supported for switch objects on TH1 (BCM56960). | ||
| # Setting to 0 disables use_sai_stats_ext in FlexCounter for COUNTER_TYPE_SWITCH. | ||
| SAI_STATS_EXT_SWITCH_SUPPORTED=0 |
The description says this key is added for “all Arista 7060cx HWSKUs”, but device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q32/ uses sai.profile.j2 and currently doesn’t include SAI_STATS_EXT_SWITCH_SUPPORTED (or SAI_STATS_ST_CAPABILITY_SUPPORTED). If Q32 is also affected, please update its template as well; otherwise, consider clarifying in the PR description which SKUs are intentionally excluded.
| # BROADCOM_LEGACY_SAI_COMPAT: TH1 (BCM56960) has no streaming telemetry platform driver; | ||
| # sai_query_stats_st_capability crashes in brcm_sai_st_pd_ctr_cap_list_get. | ||
| SAI_STATS_ST_CAPABILITY_SUPPORTED=0 | ||
The PR description focuses on disabling sai_get_stats_ext for switch objects, but this change also introduces SAI_STATS_ST_CAPABILITY_SUPPORTED=0 (streaming telemetry stats capability) in the same profile. Please update the PR description (and/or title) to explicitly mention this additional behavior, or split it into a separate PR if it’s intended to land independently.
Why I did it
On Arista 7060cx (BCM56960/Tomahawk-1) running the broadcom-legacy image, syncd crashes inside sai_get_stats_ext during FlexCounter polling for switch objects. The legacy SAI binary does not support sai_get_stats_ext for switch objects on TH1.
Root cause: sonic-sairedis PR #1775 set use_sai_stats_ext = true for COUNTER_TYPE_SWITCH in FlexCounter::createCounterContext(). This is needed for TH5 but causes a crash on TH1 with the legacy SAI binary.
Work item tracking
How I did it
Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to sai.profile for all Arista 7060cx HWSKUs (BCM56960/Tomahawk-1). The runtime guard in syncd (sonic-sairedis PR #1789) reads this key and sets use_sai_stats_ext = false for switch counter contexts on this platform.
Files changed:
- device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profile
- device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profile
- device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profile
- device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profile
- device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profile
How to verify it
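One possible static check, offered as a sketch rather than the author's test procedure: scan every HWSKU directory under the platform folder and report which of the two keys are missing or not set to 0. The helper names `missing_flags` and `check_platform` are hypothetical, and the glob pattern `*/sai.profile*` is an assumption chosen so that templated profiles such as sai.profile.j2 are also inspected.

```python
from pathlib import Path

# Keys this PR adds, with the value expected on TH1 legacy SAI.
REQUIRED = {
    "SAI_STATS_EXT_SWITCH_SUPPORTED": "0",
    "SAI_STATS_ST_CAPABILITY_SUPPORTED": "0",
}


def missing_flags(profile_text, required=REQUIRED):
    """Return the sorted list of required keys absent or set to another value."""
    found = {}
    for raw in profile_text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            found[key.strip()] = value.strip()
    return sorted(k for k, v in required.items() if found.get(k) != v)


def check_platform(platform_dir):
    """Map each HWSKU profile path under platform_dir to its missing keys."""
    results = {}
    for path in sorted(Path(platform_dir).glob("*/sai.profile*")):
        results[str(path)] = missing_flags(path.read_text())
    return results
```

Run from a sonic-buildimage checkout, `check_platform("device/arista/x86_64-arista_7060_cx32s")` would return an empty list for each profile that carries both flags. This confirms only the file contents; whether syncd stays up during FlexCounter polling still has to be checked on a device running the broadcom-legacy image.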
Which release branch to backport (provide reason below if selected)
These are bug fixes for broadcom-legacy platform (TH1). The crashes are present in 202511.
Tested branch (Please provide the tested image version)
Description for the changelog
BROADCOM_LEGACY_SAI_COMPAT: Add sai.profile key to disable sai_get_stats_ext for switch objects on Arista 7060cx (TH1) to prevent syncd crash during FlexCounter polling on broadcom-legacy image.
Link to config_db schema for YANG module changes
N/A — sai.profile change only, no config_db schema impact.
A picture of a cute animal (not mandatory but encouraged)
🐧