Commit 5b01295

intelmqctl: On stop wait longer for bots to be stopped (#2598)
Fixes #2595. Retry multiple times on `intelmqctl stop` to check whether the bots have really stopped, since they might take longer to stop. Using retries, in contrast to increasing the sleep_time, keeps the delay short if the bots have already stopped.
1 parent c35a284 commit 5b01295

3 files changed: +45 -6 lines changed


CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -14,11 +14,13 @@ Please refer to the [NEWS](NEWS.md) for a list of changes which have an affect o
 --------------------------------
 
 ### Configuration
+- New parameter `stop_retry_limit` (PR#2598 by Lukas Heindl).
 
 ### Core
 - Drop support for Python 3.8 (fixes #2616, PR#2617 by Sebastian Wagner).
 - `intelmq.lib.splitreports`: Handle bot parameter `chunk_size` values empty string, due to missing parameter typing checks (PR#2604 by Sebastian Wagner).
 - `intelmq.lib.mixins.sql` Add Support for MySQL (PR#2625 by Karl-Johan Karlsson).
+- New parameter `stop_retry_limit` to gracefully handle stopping bots which take longer to shutdown (PR#2598 by Lukas Heindl, fixes #2595).
 
 ### Development
 

docs/admin/configuration/intelmq.md

Lines changed: 7 additions & 0 deletions
@@ -237,6 +237,13 @@ configured to do so.
 
 (optional, boolean) Verify the TLS certificate of the server. Defaults to true.
 
+**`stop_retry_limit`**
+
+(optional, integer) amount of retries when checking the status of a botnet after issuing `intelmqctl stop`. Each retry
+another *0.1s* longer is waited until a maximum of *5s* to sleep in each iteration is reached. Only applies when
+stopping a bot*net* (not individual bots).
+Defaults to 5.
+
 #### Individual Bot Configuration
 
 !!! info
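
To get a feel for what a given `stop_retry_limit` means in wall-clock terms, the schedule described in the documentation change can be tabulated. This is a minimal sketch, assuming the initial delay of 0.75 s that `intelmqctl.py` uses in this commit. The function name `worst_case_wait` and its keyword arguments are hypothetical and not part of IntelMQ.

```python
def worst_case_wait(stop_retry_limit=5, first_sleep=0.75, step=0.1, cap=5.0):
    """Total seconds slept if no bot ever stops (hypothetical helper)."""
    total = 0.0
    sleep_time = first_sleep
    for _ in range(stop_retry_limit):
        total += sleep_time
        # each retry waits 0.1 s longer, but never more than the cap
        sleep_time = min(cap, sleep_time + step)
    return total


if __name__ == "__main__":
    for limit in (1, 5, 10, 20):
        print(f"stop_retry_limit={limit:>2}: up to {worst_case_wait(limit):.2f} s")
```

With the default of 5 retries this adds up to roughly 4.75 s, which agrees with the comment in the code change. The figure covers sleeping only; the time spent querying each bot's status comes on top.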

intelmq/bin/intelmqctl.py

Lines changed: 36 additions & 6 deletions
@@ -563,12 +563,42 @@ def botnet_stop(self, group=None):
         for bot_id in bots:
             self.bot_stop(bot_id, getstatus=False)
 
-        retval = 0
-        time.sleep(0.75)
-        for bot_id in bots:
-            botnet_status[bot_id] = self.bot_status(bot_id)[1]
-            if botnet_status[bot_id] not in ['stopped', 'disabled']:
-                retval = 1
+        # shallow copy of the list suffices
+        # only aliasing the list to ease reading the following
+        stopped_but_still_running_bots = bots
+
+        retries = getattr(self._parameters, 'stop_retry_limit', 5)
+
+        # parameters (default):
+        # - sleep 0.75 s with an increment of 0.1
+        # - at most 5 tries
+        # => sleep-ing at most 4.75 seconds
+        sleep_time = 0.75  # in seconds
+        for _ in range(retries):
+            # give the bots some time to terminate
+            time.sleep(sleep_time)
+            # update the botnet_status
+            for bot_id in stopped_but_still_running_bots:
+                botnet_status[bot_id] = self.bot_status(bot_id)[1]
+            # only keep bots in the list which are not stopped already
+            stopped_but_still_running_bots = [
+                bot_id
+                for bot_id in stopped_but_still_running_bots
+                if botnet_status[bot_id] not in ['stopped', 'disabled']
+            ]
+
+            # check if all bots are stopped -> no need to wait further
+            if not stopped_but_still_running_bots:
+                break
+            # the longer the bots need to terminate the longer we wait to check
+            # again to avoid long-term load on the system
+            # but stop at 5 seconds to avoid waiting too long until rechecking
+            # the status
+            sleep_time = min(5, sleep_time + 0.1)
+
+        retval = 1
+        if len(stopped_but_still_running_bots) == 0:
+            retval = 0
 
         self.log_botnet_message('stopped', group)
         return retval, botnet_status
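
The retry loop in this hunk can be tried outside of IntelMQ. The following is an illustrative sketch rather than the project's code: `wait_until_stopped`, the `get_status` callable, and the injectable `sleep` argument are assumptions made so the loop can be exercised without real bots. In `intelmqctl` the status comes from `self.bot_status(bot_id)[1]`.

```python
import time


def wait_until_stopped(bot_ids, get_status, retries=5, sleep=time.sleep):
    """Poll bot statuses with a growing delay (illustrative sketch only).

    get_status is a hypothetical callable mapping a bot id to a status string.
    """
    statuses = {}
    still_running = list(bot_ids)  # real copy, so the caller's list is untouched
    sleep_time = 0.75  # seconds, as in the commit
    for _ in range(retries):
        sleep(sleep_time)  # give the bots some time to terminate
        for bot_id in still_running:
            statuses[bot_id] = get_status(bot_id)
        # keep only the bots that have not stopped yet
        still_running = [
            bot_id for bot_id in still_running
            if statuses[bot_id] not in ('stopped', 'disabled')
        ]
        if not still_running:
            break  # everything stopped, no need to wait further
        sleep_time = min(5, sleep_time + 0.1)
    retval = 1 if still_running else 0
    return retval, statuses


if __name__ == "__main__":
    # Fake bot that reports 'running' twice before it reports 'stopped'.
    answers = iter(['running', 'running', 'stopped'])
    slept = []  # records the requested delays instead of actually sleeping
    result = wait_until_stopped(['slow-bot'], lambda _: next(answers), sleep=slept.append)
    print(result, slept)
```

Passing `sleep` as an argument keeps a test run instantaneous while still recording the delays that would have been used. In this sketch a `retries` value of 0 skips the loop entirely, so a non-empty bot list is reported as failed without any status being queried; the loop in the diff appears to behave the same way when `stop_retry_limit` is 0.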
