Commit fb3a130

phlogistonjohn authored and mergify[bot] committed
sambacc: avoid logging an error if cluster is being torn down
Saw this in a ceph teuthology run:

```
2024-08-20 20:39:57,289: DEBUG: Creating RADOS connection
2024-08-20 20:39:57,333: INFO: cluster meta content changed
2024-08-20 20:39:57,333: DEBUG: cluster meta: previous={'nodes': [{'pnn': 0, 'identity': 'smb.adctdb1.0.0.ceph0.kdlxgn', 'node': '192.168.76.200', 'state': 'ready'}, {'pnn': 1, 'identity': 'smb.adctdb1.1.0.ceph1.ngbqkk', 'node': '192.168.76.201', 'state': 'ready'}, {'pnn': 2, 'identity': 'smb.adctdb1.2.0.ceph2.rhmqnu', 'node': '192.168.76.202', 'state': 'ready'}], '_source': 'cephadm'} current={}
2024-08-20 20:39:57,333: ERROR: error during ctdb_monitor_nodes: max() arg is an empty sequence, count=0
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sambacc/commands/ctdb.py", line 479, in catch
    yield
  File "/usr/lib/python3.9/site-packages/sambacc/commands/ctdb.py", line 360, in ctdb_monitor_nodes
    ctdb.monitor_cluster_meta_changes(
  File "/usr/lib/python3.9/site-packages/sambacc/ctdb.py", line 561, in monitor_cluster_meta_changes
    expected_nodes = _cluster_meta_to_ctdb_nodes(
  File "/usr/lib/python3.9/site-packages/sambacc/ctdb.py", line 506, in _cluster_meta_to_ctdb_nodes
    pnn_max = max(n["pnn"] for n in nodes) + 1  # pnn is zero indexed
ValueError: max() arg is an empty sequence
```

I could see from the ceph logs the smb cluster was being removed right around this time. If we had nodes and they suddenly vanish, we're likely in the process of getting removed, and we raced a tad with cephadm removing services while the smb mgr module was removing the contents of the .smb pool.

Signed-off-by: John Mulligan <[email protected]>
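The failure in the traceback can be reproduced in isolation. This is a minimal sketch, not sambacc code: `pnn_count` is a hypothetical stand-in that copies only the `max(...) + 1` expression from the traceback, and the node dicts are trimmed to the `pnn` key.

```python
def pnn_count(nodes):
    """Hypothetical stand-in for the expression shown in the traceback."""
    # pnn is zero indexed, so the count is the highest pnn plus one
    return max(n["pnn"] for n in nodes) + 1


# With nodes present (as in `previous=` above) the expression works.
prev_nodes = [{"pnn": 0}, {"pnn": 1}, {"pnn": 2}]
count_with_nodes = pnn_count(prev_nodes)

# With an empty node list (as when `current={}`) max() raises ValueError.
try:
    pnn_count([])
    error_message = None
except ValueError as err:
    error_message = str(err)
```

Python's built-in `max()` raises `ValueError` on an empty iterable unless a `default=` argument is supplied, which is why emptied cluster meta surfaces as the logged error.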
1 parent 2cdbd79 commit fb3a130

1 file changed: +7 -0 lines changed

sambacc/ctdb.py

Lines changed: 7 additions & 0 deletions
@@ -551,6 +551,13 @@ def monitor_cluster_meta_changes(
         if curr_meta == prev_meta:
             _logger.debug("cluster meta content unchanged: %r", curr_meta)
             continue
+        if len(prev_meta) > 0 and len(curr_meta) == 0:
+            # cluster is possibly (probably?) being destroyed.
+            # Return from this loop and let the command-level loop decide if
+            # this function needs to be restarted or not. There's a chance this
+            # process will be terminated very soon anyway.
+            _logger.warning("no current nodes available")
+            return
         _logger.info("cluster meta content changed")
         _logger.debug(
             "cluster meta: previous=%r current=%r", prev_meta, curr_meta
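The effect of the added guard can be sketched outside of sambacc. The following is a simplified illustration under stated assumptions: `monitor_changes` and its `snapshots` argument are hypothetical, and the real `monitor_cluster_meta_changes` reads cluster meta from storage rather than from a list. Only the comparison logic and the warning message mirror the diff.

```python
import logging

_logger = logging.getLogger("guard_sketch")


def monitor_changes(snapshots):
    """Hypothetical, simplified monitoring loop.

    Returns the list of meta snapshots that were treated as real changes.
    """
    handled = []
    prev_meta = {}
    for curr_meta in snapshots:
        if curr_meta == prev_meta:
            # nothing changed since the last pass
            continue
        if len(prev_meta) > 0 and len(curr_meta) == 0:
            # Nodes existed and then vanished: assume teardown and stop
            # instead of computing node state from an empty list.
            _logger.warning("no current nodes available")
            return handled
        handled.append(curr_meta)
        prev_meta = curr_meta
    return handled


meta = {"nodes": [{"pnn": 0}], "_source": "cephadm"}
later = {"nodes": [{"pnn": 5}], "_source": "cephadm"}
# unchanged meta is skipped; the empty snapshot ends the loop early
result = monitor_changes([meta, meta, {}, later])
```

In this sketch the loop stops at the empty snapshot, so `later` is never processed; deciding whether to restart monitoring is left to the caller, as the comment in the diff describes.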
