Closed
Hello,
When using ClusterShell with MilkCheck, I get the following error:
ClusterShell.Propagation.RouteResolvingError: No route available to pm4-nod01
The error occurs as soon as a topology file is present.
MilkCheck's debug mode shows:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/MilkCheck/UI/Cli.py", line 538, in execute
self.manager.call_services(services, action, conf=self._conf)
File "/usr/lib/python3.6/site-packages/MilkCheck/ServiceManager.py", line 173, in call_services
self.run(action)
File "/usr/lib/python3.6/site-packages/MilkCheck/Engine/Service.py", line 236, in run
action_manager_self().run()
File "/usr/lib/python3.6/site-packages/MilkCheck/Engine/Action.py", line 182, in run
self._master_task.run()
File "/usr/lib/python3.6/site-packages/ClusterShell/Task.py", line 877, in run
self.resume(timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Task.py", line 831, in resume
self._resume()
File "/usr/lib/python3.6/site-packages/ClusterShell/Task.py", line 794, in _resume
self._run(self.timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Task.py", line 404, in _run
self._engine.run(timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Engine/Engine.py", line 723, in run
self.runloop(timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Engine/EPoll.py", line 170, in runloop
self.remove_stream(client, stream)
File "/usr/lib/python3.6/site-packages/ClusterShell/Engine/Engine.py", line 520, in remove_stream
self.remove(client)
File "/usr/lib/python3.6/site-packages/ClusterShell/Engine/Engine.py", line 495, in remove
self._remove(client, abort, did_timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Engine/Engine.py", line 483, in _remove
client._close(abort=abort, timeout=did_timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Worker/Exec.py", line 142, in _close
self.worker._check_fini()
File "/usr/lib/python3.6/site-packages/ClusterShell/Worker/Exec.py", line 384, in _check_fini
self._has_timeout)
File "/usr/lib/python3.6/site-packages/ClusterShell/Worker/Worker.py", line 55, in _eh_sigspec_invoke_compat
return method(*args)
File "/usr/lib/python3.6/site-packages/ClusterShell/Propagation.py", line 417, in ev_close
mw._relaunch(gateway)
File "/usr/lib/python3.6/site-packages/ClusterShell/Worker/Tree.py", line 404, in _relaunch
self._launch(targets)
File "/usr/lib/python3.6/site-packages/ClusterShell/Worker/Tree.py", line 265, in _launch
next_hops = self._distribute(self.task.info("fanout"), nodes.copy())
File "/usr/lib/python3.6/site-packages/ClusterShell/Worker/Tree.py", line 342, in _distribute
for gw, dstset in self.router.dispatch(dst_nodeset):
File "/usr/lib/python3.6/site-packages/ClusterShell/Propagation.py", line 106, in dispatch
yield self.next_hop(host), host
File "/usr/lib/python3.6/site-packages/ClusterShell/Propagation.py", line 141, in next_hop
str(dst))
ClusterShell.Propagation.RouteResolvingError: No route available to pm4-nod01
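Read from the bottom up, the traceback suggests this sequence: a gateway channel closes, `ev_close` relaunches the work, and the router no longer finds a next hop for the target. The following is a toy model of that sequence only, not ClusterShell's actual code. `ToyRouter` and its methods are hypothetical names chosen for illustration.

```python
class RouteResolvingError(Exception):
    """Raised when no usable gateway remains for a destination (toy version)."""


class ToyRouter:
    """Hypothetical, simplified stand-in for a tree-mode router."""

    def __init__(self, routes):
        # routes maps each gateway to the set of nodes it can reach
        self.routes = {gw: set(dsts) for gw, dsts in routes.items()}
        self.unreachable = set()

    def mark_unreachable(self, gateway):
        """Exclude a gateway from all further routing decisions."""
        self.unreachable.add(gateway)

    def next_hop(self, dst):
        """Return a reachable gateway for dst, or raise if none is left."""
        for gw, dsts in self.routes.items():
            if gw not in self.unreachable and dst in dsts:
                return gw
        raise RouteResolvingError("No route available to %s" % dst)


router = ToyRouter({"mngt0-2": {"pm4-nod01"}})
first_hop = router.next_hop("pm4-nod01")  # the only gateway is usable at first

router.mark_unreachable("mngt0-2")  # the channel close was treated as an error
try:
    router.next_hop("pm4-nod01")
    error_message = None
except RouteResolvingError as exc:
    error_message = str(exc)

print(first_hop)
print(error_message)
```

With a single gateway in front of the target, one spurious "unreachable" mark is enough to leave the relaunch with no route at all.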
I cannot reproduce the error with clush alone:
$ clush --remote=no -u2 -bw pm4-nod01 hostname
---------------
pm4-nod01
---------------
mngt0-2
$ clush -u2 -bw pm4-nod01 hostname
---------------
pm4-nod01
---------------
pm4-nod01
$ cat /etc/clustershell/topology.conf
[routes]
mngt0-1: mngt0-2
mngt0-2: @compute
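For reference, the routes in this file can be listed with a short standard-library sketch. It only reads the INI-style structure. It does not resolve the `@compute` group and is not how ClusterShell itself parses the file. The `TOPOLOGY` string is a copy of the file shown above.

```python
import configparser

# Copy of /etc/clustershell/topology.conf as shown in this report
TOPOLOGY = """\
[routes]
mngt0-1: mngt0-2
mngt0-2: @compute
"""

parser = configparser.ConfigParser()
parser.read_string(TOPOLOGY)

# Each entry maps a source node set to the node set it can forward to
routes = dict(parser["routes"])
for source, destinations in routes.items():
    print("%s -> %s" % (source, destinations))
```

Each entry gives one hop: mngt0-1 forwards to mngt0-2, and mngt0-2 forwards to the nodes in the `@compute` group.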
Python version 3.6.8
As a temporary workaround, I changed this:
--- /usr/lib/python3.6/site-packages/ClusterShell/Propagation.py.orig 2023-06-27 15:00:39.099237135 +0200
+++ /usr/lib/python3.6/site-packages/ClusterShell/Propagation.py 2023-06-27 15:00:47.504344461 +0200
@@ -405,7 +405,7 @@ class PropagationChannel(Channel):
self.logger.debug("ev_close rc=%s", self._rc) # may be None
# NOTE: self._rc may be None if the communication channel has aborted
- if self._rc != 0:
+ if self._rc is not None and self._rc != 0:
self.logger.debug("error on gateway %s (setup=%s)", gateway,
self.setup)
self.task.router.mark_unreachable(gateway)
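The effect of that change can be checked in isolation. This is a minimal sketch with hypothetical helper names, not ClusterShell code. It compares the original and patched conditions for a zero return code, a non-zero return code, and `None`, which the source comment says may occur when the communication channel has aborted.

```python
def marks_unreachable_original(rc):
    """Original check: anything other than 0 counts as a gateway error."""
    return rc != 0


def marks_unreachable_patched(rc):
    """Patched check: a missing return code (None) is no longer an error."""
    return rc is not None and rc != 0


# Map each return code to (original result, patched result)
results = {
    rc: (marks_unreachable_original(rc), marks_unreachable_patched(rc))
    for rc in (0, 1, None)
}
for rc, (original, patched) in results.items():
    print("rc=%r original=%s patched=%s" % (rc, original, patched))
```

Only the `None` case differs: the original condition marks the gateway unreachable, while the patched one does not.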
And this:
--- /bin/milkcheck.orig 2024-09-04 09:19:15.826180684 +0200
+++ /bin/milkcheck 2024-09-04 09:19:22.076099490 +0200
@@ -1,4 +1,4 @@
-#!/usr/libexec/platform-python
+#!/usr/bin/python3
#
# Copyright CEA (2011)
# Contributor: Jeremie TATIBOUET