@rminnikanti
Contributor

@rminnikanti rminnikanti commented Nov 9, 2025

Fixes #3983

What I did
Added cleanup of COUNTERS_*_NAME_MAP entries for a port during its deinit phase, and regenerated the NAME_MAP tables with fresh OIDs during port init when queue flex counters are already enabled.
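As a rough illustration of the change described above, the cleanup and regeneration can be modeled with plain dictionaries standing in for the COUNTERS_DB tables. This is only a sketch, not the orchagent code: the helper names `new_counters_db`, `generate_port_queue_maps`, and `remove_port_queue_maps` are hypothetical, and clearing the PORT, INDEX, and TYPE maps together with the NAME_MAP is an assumption made here for the sake of a consistent model.

```python
# Hypothetical in-memory model of the queue map tables in COUNTERS_DB.
# Table names match the redis-dump output in this description;
# the helper names are illustrative only and do not exist in sonic-swss.
QUEUE_TABLES = (
    "COUNTERS_QUEUE_NAME_MAP",
    "COUNTERS_QUEUE_PORT_MAP",
    "COUNTERS_QUEUE_INDEX_MAP",
    "COUNTERS_QUEUE_TYPE_MAP",
)


def new_counters_db():
    """Create an empty model with one dict per queue map table."""
    return {table: {} for table in QUEUE_TABLES}


def generate_port_queue_maps(db, port, port_oid, queue_oids,
                             queue_type="SAI_QUEUE_TYPE_UNICAST"):
    """Port init: write map entries for every queue OID of the port."""
    for index, qoid in enumerate(queue_oids):
        db["COUNTERS_QUEUE_NAME_MAP"][f"{port}:{index}"] = qoid
        db["COUNTERS_QUEUE_PORT_MAP"][qoid] = port_oid
        db["COUNTERS_QUEUE_INDEX_MAP"][qoid] = str(index)
        db["COUNTERS_QUEUE_TYPE_MAP"][qoid] = queue_type


def remove_port_queue_maps(db, port):
    """Port deinit: drop every entry that belongs to the port's queues.

    Assumption: the OID-keyed maps are cleaned alongside the NAME_MAP.
    """
    name_map = db["COUNTERS_QUEUE_NAME_MAP"]
    # Compare the full port name so "Ethernet4" never matches "Ethernet44".
    keys = [k for k in name_map if k.rsplit(":", 1)[0] == port]
    for key in keys:
        qoid = name_map.pop(key)
        for table in QUEUE_TABLES[1:]:
            db[table].pop(qoid, None)


# Simulated breakout: deinit the port, then init it again with fresh OIDs.
db = new_counters_db()
generate_port_queue_maps(db, "Ethernet0", "oid:0x1", ["oid:0xa", "oid:0xb"])
remove_port_queue_maps(db, "Ethernet0")
generate_port_queue_maps(db, "Ethernet0", "oid:0x2", ["oid:0xc", "oid:0xd"])
print(db["COUNTERS_QUEUE_NAME_MAP"])
```

The OIDs in the usage lines are made up; the point is only that after the deinit/init pair no entry refers to an OID from before the breakout.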

Why I did it
After a dynamic port breakout (DPB), the queue name map tables in COUNTERS_DB are not regenerated, leaving stale entries that crash the CLI.
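A minimal sketch of what "stale" means here, assuming that a name map entry is stale when its OID is no longer among the queue OIDs that currently exist. The function name `find_stale_entries` and the sample OIDs are hypothetical, and the exact way the CLI fails on such an entry is not shown in this description.

```python
def find_stale_entries(name_map, live_oids):
    """Return NAME_MAP keys whose OID is not among the currently live OIDs."""
    live = set(live_oids)
    return sorted(key for key, oid in name_map.items() if oid not in live)


# Hypothetical state after a breakout without the fix: the name map still
# holds the pre-breakout OIDs, while the queues were recreated with new ones.
stale_name_map = {"Ethernet0:0": "oid:0xa", "Ethernet0:1": "oid:0xb"}
recreated_queue_oids = ["oid:0xc", "oid:0xd"]
print(find_stale_entries(stale_name_map, recreated_queue_oids))
```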

How I verified it

# show queue counters Ethernet0
For namespace :
     Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
---------  -----  --------------  ---------------  -----------  ------------
Ethernet0    UC0               0                0            0             0
Ethernet0    UC1               0                0            0             0
Ethernet0    UC2               0                0            0             0
Ethernet0    UC3               0                0            0             0
Ethernet0    UC4               0                0            0             0
Ethernet0    UC5               0                0            0             0
Ethernet0    UC6               0                0            0             0
Ethernet0    UC7               0                0            0             0

root@sonic:~# redis-dump -d 2 -k "COUNTERS_PORT_NAME_MAP*" | jq | grep Ethernet448
      "Ethernet448": "oid:0x100000000085c",
root@sonic:~#  redis-dump -d 2 -k "COUNTERS_QUEUE_PORT_MAP*" | jq | grep 0x100000000085c
      "oid:0x15000000000860": "oid:0x100000000085c",
      "oid:0x15000000000861": "oid:0x100000000085c",
      "oid:0x15000000000862": "oid:0x100000000085c",
      "oid:0x15000000000863": "oid:0x100000000085c",
      "oid:0x15000000000864": "oid:0x100000000085c",
      "oid:0x1500000000085d": "oid:0x100000000085c",
      "oid:0x1500000000085f": "oid:0x100000000085c",
      "oid:0x1500000000085e": "oid:0x100000000085c",
root@sonic:~# redis-dump -d 2 -k "COUNTERS_QUEUE_NAME_MAP*" | jq | grep -E "0x15000000000860|0x15000000000861|0x15000000000862|0x15000000000863|0x15000000000864|0x1500000000085d|0x1500000000085f|0x1500000000085e"
      "Ethernet448:6": "oid:0x15000000000863",
      "Ethernet448:2": "oid:0x1500000000085f",
      "Ethernet448:4": "oid:0x15000000000861",
      "Ethernet448:7": "oid:0x15000000000864",
      "Ethernet448:3": "oid:0x15000000000860",
      "Ethernet448:0": "oid:0x1500000000085d",
      "Ethernet448:5": "oid:0x15000000000862",
      "Ethernet448:1": "oid:0x1500000000085e",
root@sonic:~# redis-dump -d 2 -k "COUNTERS_QUEUE_INDEX_MAP*" | jq | grep -E "0x15000000000860|0x15000000000861|0x15000000000862|0x15000000000863|0x15000000000864|0x1500000000085d|0x1500000000085f|0x1500000000085e"
      "oid:0x15000000000860": "3",
      "oid:0x15000000000861": "4",
      "oid:0x15000000000862": "5",
      "oid:0x15000000000863": "6",
      "oid:0x15000000000864": "7",
      "oid:0x1500000000085d": "0",
      "oid:0x1500000000085f": "2",
      "oid:0x1500000000085e": "1",
root@sonic:~#  redis-dump -d 2 -k "COUNTERS_QUEUE_TYPE_MAP*" | jq | grep -E "0x15000000000860|0x15000000000861|0x15000000000862|0x15000000000863|0x15000000000864|0x1500000000085d|0x1500000000085f|0x1500000000085e"
      "oid:0x15000000000860": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x15000000000861": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x15000000000862": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x15000000000863": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x15000000000864": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x1500000000085d": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x1500000000085f": "SAI_QUEUE_TYPE_UNICAST",
      "oid:0x1500000000085e": "SAI_QUEUE_TYPE_UNICAST",
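The manual cross-check above (port OID, then queue OIDs, then name, index, and type) can also be expressed as a small consistency check. The sketch below is an assumption-laden illustration rather than part of the PR: `check_port_queue_maps` is a hypothetical helper, and the dictionaries are built by hand from the values in the dumps above instead of being read from Redis.

```python
PORT = "Ethernet448"
PORT_OID = "oid:0x100000000085c"

# Queue index -> queue OID, copied from the COUNTERS_QUEUE_NAME_MAP dump above.
QUEUE_OIDS = {
    0: "oid:0x1500000000085d",
    1: "oid:0x1500000000085e",
    2: "oid:0x1500000000085f",
    3: "oid:0x15000000000860",
    4: "oid:0x15000000000861",
    5: "oid:0x15000000000862",
    6: "oid:0x15000000000863",
    7: "oid:0x15000000000864",
}

name_map = {f"{PORT}:{i}": oid for i, oid in QUEUE_OIDS.items()}
port_map = {oid: PORT_OID for oid in QUEUE_OIDS.values()}
index_map = {oid: str(i) for i, oid in QUEUE_OIDS.items()}
type_map = {oid: "SAI_QUEUE_TYPE_UNICAST" for oid in QUEUE_OIDS.values()}


def check_port_queue_maps(port, port_oid, port_map, name_map, index_map, type_map):
    """Return a list of inconsistencies; an empty list means the maps agree."""
    problems = []
    # Queue OIDs that the PORT_MAP attributes to this port.
    queue_oids = {q for q, p in port_map.items() if p == port_oid}
    # NAME_MAP entries whose key is "<port>:<index>" for this port.
    named = {n: o for n, o in name_map.items() if n.rsplit(":", 1)[0] == port}
    if set(named.values()) != queue_oids:
        problems.append("NAME_MAP and QUEUE_PORT_MAP disagree on the queue OIDs")
    for name, oid in named.items():
        index = name.rsplit(":", 1)[1]
        if index_map.get(oid) != index:
            problems.append(f"{name}: INDEX_MAP has {index_map.get(oid)!r}")
        if oid not in type_map:
            problems.append(f"{name}: missing from TYPE_MAP")
    return problems


problems = check_port_queue_maps(PORT, PORT_OID, port_map, name_map,
                                 index_map, type_map)
print(problems or "maps are consistent")
```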

Details if related

…kout

The fix cleans up the port's COUNTERS_*_NAME_MAP entries during port deinit and
regenerates the NAME_MAPs with new OIDs during port init if queue flex counters
are already enabled.

Signed-off-by: Ravi Minnikanti <[email protected]>
@rminnikanti rminnikanti requested a review from prsunny as a code owner November 9, 2025 10:15
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

{
SWSS_LOG_ENTER();

if (gHFTOrch)
Contributor Author

@Pterosaur can you please review?

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

@Pterosaur Pterosaur left a comment

@kperumalbfn Could you please help review this PR?
If we merge this PR, I will close my duplicate PR #3967.

bool queueFcEnabled = flex_counters_orch->getQueueCountersState() ||
                      flex_counters_orch->getQueueWatermarkCountersState() ||
                      flex_counters_orch->getWredQueueCountersState();
if (queueFcEnabled && !p.m_queue_ids.empty())
Contributor

@rminnikanti could you please check the DPB sonic-mgmt tests? If this is not covered, could you please add them?

Contributor Author

@kperumalbfn I don't see a DPB test component in PTF. As far as I understand, DPB testing in sonic-mgmt can be performed on ports that are not part of the topology.

The mock_tests included in this PR verify the regeneration of the NAME_MAPs. I’ve also validated this behavior on a device and shared the redis-dump output in the PR description for reference.

Contributor

@kperumalbfn kperumalbfn left a comment

@Pterosaur could you close PR #3967?

kperumalbfn previously approved these changes Nov 11, 2025
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@StormLiangMS
Contributor

/azp run Azure.sonic-swss

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdszhang
Contributor

@rminnikanti can you create a cherry-pick PR to msft-202412?

@rminnikanti
Contributor Author

> @rminnikanti can you create a cherry-pick PR to msft-202412

@sdszhang created PR - Azure/sonic-swss.msft#170

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@rminnikanti
Contributor Author

@r12f @ZhaohuiS Fixed. PR checks passed.

@r12f I have propagated the fix to Azure/sonic-swss.msft#170

@prsunny prsunny merged commit 4d39712 into sonic-net:master Nov 28, 2025
15 checks passed
@mssonicbld
Collaborator

Cherry-pick PR to msft-202412:

@r12f

r12f commented Dec 2, 2025

Hi @rminnikanti, we are hitting a test failure on 202412 because the PR was ported incorrectly. Would you mind helping us look into it?

@kperumalbfn for viz.

@rminnikanti
Contributor Author

> Hi @rminnikanti , we are hitting test failure on 202412 due to the PR being incorrectly ported, do you mind to help us look into it?
>
> @kperumalbfn for viz.

@r12f I identified the fix. The failure is caused not by the cherry-picked PR itself but by a mock_table function that is missing in 202412. I will fix it soon.

@r12f

r12f commented Dec 2, 2025

Thanks a lot Ravi!

@rminnikanti
Contributor Author

> Thanks a lot Ravi!

Created Azure/sonic-swss.msft#175

kalash-nexthop pushed a commit to kalash-nexthop/sonic-swss that referenced this pull request Dec 16, 2025
sonic-net#3982)

Signed-off-by: Kalash Nainwal <[email protected]>
@govi-nokia

@prsunny Could this fix be taken to 202511 please? Please add the appropriate labels for the same.

@Pavan-Nokia viz
@balanokia viz
@dgodwin-nokia viz

Pterosaur pushed a commit to Janetxxx/sonic-swss that referenced this pull request Jan 6, 2026
sonic-net#3982)
@mssonicbld
Collaborator

Cherry-pick PR to 202511: #4109

dakotac-arista added a commit to dakotac-arista/sonic-swss.msft that referenced this pull request Jan 20, 2026
PR:sonic-net/sonic-swss#3982

Also, added missing mock test function from master identified in:
Azure#175

Signed-off-by: Dakota Crozier <[email protected]>
@dakotac-arista

Hi @prsunny, can the fix be backported to msft-202503 as well? PR: Azure/sonic-swss.msft#196

arlakshm pushed a commit to Azure/sonic-swss.msft that referenced this pull request Jan 29, 2026
…ost port breakout (#196)

PR: sonic-net/sonic-swss#3982

Also, included missing mock test function from master. As identified in
msft-202412: #175

Signed-off-by: Dakota Crozier <[email protected]>

Development

Successfully merging this pull request may close these issues.

[DPB]: Stale queue counter mappings in COUNTERS_DB post port breakout