Skip to content

Commit c3eecb8

Browse files
authored
Fixing state_db not having delete_field attribute causing a crash when DPUs in bad state (#246)
### What I did Fixed the AttributeError caused by missing [delete_field](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) method in the [StateDBHelper](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) class when managing DPU state transitions. The code was attempting to call [state_db.delete_field()](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) to remove the 'transition_start_time' field from the database, but this method didn't exist, causing crashes when DPUs were in bad state. ### How I did it Added the missing [delete_field](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) method to the [StateDBHelper](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) class that properly removes fields from Redis using the [hdel](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) command Maintained the existing logic in [set_state_transition_in_progress](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) that removes 'transition_start_time' from both local state and database when transitioning to 'False' Ensured consistency between local dictionary state and database state by properly implementing field deletion The fix addresses the root cause of the AttributeError while preserving the intended behavior of cleaning up transition timestamps when state transitions complete. Code Changes: Added [delete_field(self, table, key, field)](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) method to [StateDBHelper](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) class Method uses [client.hdel(redis_key, field)](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) to properly delete fields from Redis database Existing call to [state_db.delete_field()](vscode-file://vscode-app/Applications/Visual%20Studio%20Code%205.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/workbench.html) on line 85 now works correctly ### How to verify it Make the DPU midplane unreachable for one of the DPUs Toggle the DPU ON/OFF state a couple of times using config chassis modules startup/shutdown DPUx Verify that the commands complete without AttributeError crashes Confirm that 'transition_start_time' field is properly removed from STATE_DB when transitions complete Check that both local state and database state remain synchronized Why this approach vs. removing the line: This fix maintains the original intended behavior of cleaning up transition timestamps, ensuring database consistency and preventing stale data accumulation, while properly implementing the missing functionality that was causing the crash. Which release branch to backport (provide reason below if selected) [x] 202505 [x] 202506
1 parent 5f2360d commit c3eecb8

File tree

2 files changed

+42
-8
lines changed

2 files changed

+42
-8
lines changed

config/chassis_modules.py

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
import subprocess
77
import utilities_common.cli as clicommon
88
from utilities_common.chassis import is_smartswitch, get_all_dpus
9-
from datetime import datetime, timedelta
9+
from datetime import datetime, timedelta, timezone
1010

1111
TIMEOUT_SECS = 10
1212
TRANSITION_TIMEOUT = timedelta(seconds=240) # 4 minutes
@@ -27,6 +27,12 @@ def set_entry(self, table, key, entry):
2727
for field, value in entry.items():
2828
self.db.set("STATE_DB", redis_key, field, value)
2929

30+
def delete_field(self, table, key, field):
31+
"""Delete a specific field from table|key."""
32+
redis_key = f"{table}|{key}"
33+
client = self.db.get_redis_client("STATE_DB")
34+
return client.hdel(redis_key, field)
35+
3036
#
3137
# 'chassis_modules' group ('config chassis_modules ...')
3238
#
@@ -72,8 +78,9 @@ def set_state_transition_in_progress(db, chassis_module_name, value):
7278
entry = state_db.get_entry('CHASSIS_MODULE_TABLE', chassis_module_name) or {}
7379
entry['state_transition_in_progress'] = value
7480
if value == 'True':
75-
entry['transition_start_time'] = datetime.utcnow().isoformat()
81+
entry['transition_start_time'] = datetime.now(timezone.utc).isoformat()
7682
else:
83+
# Remove transition_start_time from both local entry and database
7784
entry.pop('transition_start_time', None)
7885
state_db.delete_field('CHASSIS_MODULE_TABLE', chassis_module_name, 'transition_start_time')
7986
state_db.set_entry('CHASSIS_MODULE_TABLE', chassis_module_name, entry)
@@ -92,7 +99,14 @@ def is_transition_timed_out(db, chassis_module_name):
9299
start_time = datetime.fromisoformat(start_time_str)
93100
except ValueError:
94101
return False
95-
return datetime.utcnow() - start_time > TRANSITION_TIMEOUT
102+
103+
# Use UTC everywhere for consistent comparison
104+
current_time = datetime.now(timezone.utc)
105+
if start_time.tzinfo is None:
106+
# If stored time is naive, assume it's UTC
107+
start_time = start_time.replace(tzinfo=timezone.utc)
108+
109+
return current_time - start_time > TRANSITION_TIMEOUT
96110

97111
#
98112
# Name: check_config_module_state_with_timeout

tests/chassis_modules_test.py

Lines changed: 25 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
import sys
22
import os
33
from click.testing import CliRunner
4-
from datetime import datetime, timedelta
4+
from datetime import datetime, timedelta, timezone
55
from config.chassis_modules import (
66
set_state_transition_in_progress,
77
is_transition_timed_out,
@@ -489,7 +489,7 @@ def test_shutdown_triggers_transition_in_progress(self):
489489
fvs = {
490490
'admin_status': 'up',
491491
'state_transition_in_progress': 'True',
492-
'transition_start_time': datetime.utcnow().isoformat()
492+
'transition_start_time': datetime.now(timezone.utc).isoformat()
493493
}
494494
db.cfgdb.set_entry('CHASSIS_MODULE', "DPU0", fvs)
495495

@@ -516,7 +516,7 @@ def test_shutdown_triggers_transition_timeout(self):
516516
fvs = {
517517
'admin_status': 'up',
518518
'state_transition_in_progress': 'True',
519-
'transition_start_time': (datetime.utcnow() - timedelta(minutes=30)).isoformat()
519+
'transition_start_time': (datetime.now(timezone.utc) - timedelta(minutes=30)).isoformat()
520520
}
521521
db.cfgdb.set_entry('CHASSIS_MODULE', "DPU0", fvs)
522522

@@ -592,15 +592,35 @@ def test_is_transition_timed_out_all_paths(self):
592592
assert is_transition_timed_out(db, "DPU0") is False
593593

594594
# Case 4: Timed out
595-
old_time = (datetime.utcnow() - TRANSITION_TIMEOUT - timedelta(seconds=1)).isoformat()
595+
old_time = (datetime.now(timezone.utc) - TRANSITION_TIMEOUT - timedelta(seconds=1)).isoformat()
596596
db.statedb.get_entry.return_value = {"transition_start_time": old_time}
597597
assert is_transition_timed_out(db, "DPU0") is True
598598

599599
# Case 5: Not timed out yet
600-
now = datetime.utcnow().isoformat()
600+
now = datetime.now(timezone.utc).isoformat()
601601
db.statedb.get_entry.return_value = {"transition_start_time": now}
602602
assert is_transition_timed_out(db, "DPU0") is False
603603

604+
def test_delete_field(self):
605+
"""Single test to cover missing delete_field and timezone handling lines"""
606+
from config.chassis_modules import StateDBHelper
607+
from datetime import timezone
608+
609+
# Test delete_field method (covers lines 32-34)
610+
mock_sonic_db = mock.MagicMock()
611+
mock_redis_client = mock.MagicMock()
612+
mock_sonic_db.get_redis_client.return_value = mock_redis_client
613+
helper = StateDBHelper(mock_sonic_db)
614+
helper.delete_field('TEST_TABLE', 'test_key', 'test_field')
615+
mock_redis_client.hdel.assert_called_once_with("TEST_TABLE|test_key", "test_field")
616+
617+
# Test timezone-aware datetime handling (covers line 109)
618+
db = mock.MagicMock()
619+
db.statedb = mock.MagicMock()
620+
tz_time = (datetime.now(timezone.utc) - TRANSITION_TIMEOUT - timedelta(seconds=1)).isoformat()
621+
db.statedb.get_entry.return_value = {"transition_start_time": tz_time}
622+
assert is_transition_timed_out(db, "DPU0") is True
623+
604624
@classmethod
605625
def teardown_class(cls):
606626
print("TEARDOWN")

0 commit comments

Comments
 (0)