
BROADCOM_LEGACY_SAI_COMPAT: Fix sai_get_stats_ext crash on TH1 legacy image#26014

Open
lipxu wants to merge 2 commits into sonic-net:master from lipxu:fix/brcm-legacy-compat-buildimage-issue2
Conversation

@lipxu lipxu commented Mar 11, 2026

Why I did it

On Arista 7060cx (BCM56960/Tomahawk-1) running the broadcom-legacy image, syncd crashes inside sai_get_stats_ext during FlexCounter polling for switch objects. The legacy SAI binary does not support sai_get_stats_ext for switch objects on TH1.

Root cause: sonic-sairedis PR #1775 set use_sai_stats_ext = true for COUNTER_TYPE_SWITCH in FlexCounter::createCounterContext(). This is needed for TH5 but causes a crash on TH1 with the legacy SAI binary.

Work item tracking
  • Microsoft ADO (number only):

How I did it

Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to sai.profile for all Arista 7060cx HWSKUs (BCM56960/Tomahawk-1). The runtime guard in syncd (sonic-sairedis PR #1789) reads this key and sets use_sai_stats_ext = false for switch counter contexts on this platform.
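The guard described above can be modeled roughly as follows. This is only an illustrative Python sketch, not the syncd code from sonic-sairedis PR #1789 (which is C++ and is not shown here). The helper names `parse_sai_profile` and `use_sai_stats_ext_for_switch` are hypothetical, and treating a missing key as "supported" is an assumption made so that platforms without the key keep the ext path.

```python
def parse_sai_profile(text):
    """Parse KEY=VALUE lines from sai.profile text, skipping blanks and '#' comments."""
    profile = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        profile[key.strip()] = value.strip()
    return profile


def use_sai_stats_ext_for_switch(profile):
    """Decide whether switch counters should use sai_get_stats_ext.

    Assumption: an absent key means the ext path stays enabled; only an
    explicit "0" turns it off.
    """
    return profile.get("SAI_STATS_EXT_SWITCH_SUPPORTED", "1") != "0"


# Profile fragment as added by this PR for the TH1 HWSKUs.
TH1_PROFILE = """\
# BROADCOM_LEGACY_SAI_COMPAT: sai_get_stats_ext is not supported for switch objects on TH1 (BCM56960).
# Setting to 0 disables use_sai_stats_ext in FlexCounter for COUNTER_TYPE_SWITCH.
SAI_STATS_EXT_SWITCH_SUPPORTED=0
"""

th1_uses_ext = use_sai_stats_ext_for_switch(parse_sai_profile(TH1_PROFILE))
other_uses_ext = use_sai_stats_ext_for_switch(parse_sai_profile(""))
```

With the TH1 fragment the sketch selects the non-ext path, while an empty profile leaves the ext path on, which matches the intent of keeping the TH5 behavior unchanged.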

Files changed:

  • device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profile
  • device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profile
  • device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profile
  • device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profile
  • device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profile

How to verify it

  1. Build a broadcom-legacy SONiC image for Arista 7060cx
  2. Boot the device and enable FlexCounter for switch objects
  3. Confirm syncd does not crash during FlexCounter polling
  4. Confirm switch counter stats are still collected (via non-ext path)
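Before building an image, a quick source-tree check can confirm that each HWSKU profile carries the flag. The sketch below is a hypothetical helper, not part of this PR: the function name `profiles_missing_flags` and the choice to match both `sai.profile` and `sai.profile.j2` are assumptions, and template files are read as plain text without rendering.

```python
from pathlib import Path

# Keys and values expected in every profile (taken from this PR's description).
EXPECTED = {"SAI_STATS_EXT_SWITCH_SUPPORTED": "0"}


def profiles_missing_flags(platform_dir, expected=EXPECTED):
    """Return HWSKU directory names whose sai.profile* lacks an expected KEY=VALUE."""
    missing = []
    for path in sorted(Path(platform_dir).glob("*/sai.profile*")):
        settings = {}
        for raw in path.read_text().splitlines():
            line = raw.strip()
            # Skip blanks, comments, and anything that is not KEY=VALUE.
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                settings[key.strip()] = value.strip()
        if any(settings.get(k) != v for k, v in expected.items()):
            missing.append(path.parent.name)
    return missing
```

Run against `device/arista/x86_64-arista_7060_cx32s`, an empty result would mean every HWSKU directory under that platform sets the key. This only checks the source tree; the on-device steps above are still needed to confirm syncd behavior.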

Which release branch to backport (provide reason below if selected)

These are bug fixes for the broadcom-legacy platform (TH1). The crashes are present in 202511.

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

  • 20251110.15 (broadcom-legacy, Arista 7060cx)

Description for the changelog

BROADCOM_LEGACY_SAI_COMPAT: Add sai.profile key to disable sai_get_stats_ext for switch objects on Arista 7060cx (TH1) to prevent syncd crash during FlexCounter polling on broadcom-legacy image.

Link to config_db schema for YANG module changes

N/A — sai.profile change only, no config_db schema impact.

A picture of a cute animal (not mandatory but encouraged)

🐧

lipxu added 2 commits March 11, 2026 03:00
…s_st_capability at runtime

Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 to sai.profile for Arista 7060cx
(BCM56960/Tomahawk-1) to disable sai_query_stats_st_capability at runtime.
This prevents a SIGSEGV in brcm_sai_st_pd_ctr_cap_list_get when running
the legacy SAI binary which does not initialize p_pdapi_st->vtable for TH1.

The runtime guard is implemented in sonic-sairedis PR sonic-net#1788.

Signed-off-by: Liping Xu <xuliping@microsoft.com>

… image

Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to sai.profile for Arista 7060cx
(BCM56960/Tomahawk-1) to disable sai_get_stats_ext for switch objects.
The legacy SAI binary crashes when FlexCounter calls sai_get_stats_ext
on switch objects during polling.

The runtime guard is implemented in sonic-sairedis PR sonic-net#1789.

Signed-off-by: Liping Xu <xuliping@microsoft.com>

Copilot AI review requested due to automatic review settings March 11, 2026 03:03

@mssonicbld

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI left a comment

Pull request overview

This PR updates Arista 7060CX (Tomahawk-1 / broadcom-legacy) HWSKU sai.profile files to add runtime feature flags that allow syncd/FlexCounter to avoid calling unsupported legacy-SAI stats APIs which can crash syncd.

Changes:

  • Add SAI_STATS_EXT_SWITCH_SUPPORTED=0 to disable sai_get_stats_ext usage for switch-object counters on TH1 legacy SAI.
  • Add SAI_STATS_ST_CAPABILITY_SUPPORTED=0 to disable streaming-telemetry stats capability querying on TH1 legacy SAI.
  • Add inline comments documenting the platform limitations and why the flags are set.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S/sai.profile | Add TH1 legacy runtime-disable flags for stats-ext (switch) and ST capability |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-T96C8/sai.profile | Same flags for T96C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q24C8/sai.profile | Same flags for Q24C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-D48C8/sai.profile | Same flags for D48C8 variant |
| device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-C32/sai.profile | Same flags for C32 variant |

Comment on lines +6 to +8
# BROADCOM_LEGACY_SAI_COMPAT: sai_get_stats_ext is not supported for switch objects on TH1 (BCM56960).
# Setting to 0 disables use_sai_stats_ext in FlexCounter for COUNTER_TYPE_SWITCH.
SAI_STATS_EXT_SWITCH_SUPPORTED=0

Copilot AI Mar 11, 2026

The description says this key is added for “all Arista 7060cx HWSKUs”, but device/arista/x86_64-arista_7060_cx32s/Arista-7060CX-32S-Q32/ uses sai.profile.j2 and currently doesn’t include SAI_STATS_EXT_SWITCH_SUPPORTED (or SAI_STATS_ST_CAPABILITY_SUPPORTED). If Q32 is also affected, please update its template as well; otherwise, consider clarifying in the PR description which SKUs are intentionally excluded.

Comment on lines +2 to +5
# BROADCOM_LEGACY_SAI_COMPAT: TH1 (BCM56960) has no streaming telemetry platform driver;
# sai_query_stats_st_capability crashes in brcm_sai_st_pd_ctr_cap_list_get.
SAI_STATS_ST_CAPABILITY_SUPPORTED=0

Copilot AI Mar 11, 2026

The PR description focuses on disabling sai_get_stats_ext for switch objects, but this change also introduces SAI_STATS_ST_CAPABILITY_SUPPORTED=0 (streaming telemetry stats capability) in the same profile. Please update the PR description (and/or title) to explicitly mention this additional behavior, or split it into a separate PR if it’s intended to land independently.

@Gfrom2016 Gfrom2016 left a comment

LGTM
