Description
It seems that running systemd units can directly tell the framework what sort of timeouts to expect at run-time. Possibly other service management frameworks have similar features.
- See EXTEND_TIMEOUT_USEC and WATCHDOG_USEC in https://www.freedesktop.org/software/systemd/man/sd_notify.html
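For illustration, here is a minimal sketch of how a daemon could send such a message without linking against libsystemd, by writing a datagram to the socket named in the NOTIFY_SOCKET environment variable (the mechanism documented on the sd_notify page). The function names notify_state() and format_extend_timeout() are hypothetical helpers made up for this example and are not part of NUT or systemd.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: send one state string (e.g. "WATCHDOG=1") to the
 * service manager over $NOTIFY_SOCKET.
 * Returns 1 if sent, 0 if no notify socket is configured, -1 on error. */
static int notify_state(const char *state)
{
    const char *path = getenv("NOTIFY_SOCKET");
    struct sockaddr_un addr;
    socklen_t addrlen;
    size_t pathlen;
    ssize_t sent;
    int fd;

    if (path == NULL || path[0] == '\0')
        return 0; /* not started by a notify-aware framework */

    pathlen = strlen(path);
    if ((path[0] != '/' && path[0] != '@') || pathlen >= sizeof(addr.sun_path))
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path, path, pathlen);
    if (path[0] == '@')
        addr.sun_path[0] = '\0'; /* abstract namespace socket (Linux) */
    addrlen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + pathlen);

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    sent = sendto(fd, state, strlen(state), 0,
                  (const struct sockaddr *)&addr, addrlen);
    close(fd);
    return (sent == (ssize_t)strlen(state)) ? 1 : -1;
}

/* Hypothetical helper: build "EXTEND_TIMEOUT_USEC=<usec>" for an operation
 * expected to need `seconds` more. Returns 0 on success, -1 if buf is too small. */
static int format_extend_timeout(char *buf, size_t buflen, unsigned long long seconds)
{
    int n = snprintf(buf, buflen, "EXTEND_TIMEOUT_USEC=%llu",
                     seconds * 1000000ULL);
    return (n > 0 && (size_t)n < buflen) ? 0 : -1;
}
```

A driver could call format_extend_timeout() with its estimate of the remaining startup time and pass the result to notify_state(); when NOTIFY_SOCKET is unset the call is a harmless no-op, so the same code path would work outside systemd.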
This RFE is about using that facility to tune the timeouts to current circumstances. For startup, for example: if ups.conf says we are going to initial-walk a ton of slow SNMP UPSes for 5 minutes, nobody should have to fix the unit definitions manually or semi-automatically (NDE) to reflect that. For watchdogs, this could tie into some factor (2x, 3x or 5x) of the expected upsdrv_update period, covering the partial/full data walk time and possible sleeps in between; possibly even with different timeouts for frequent partial vs. rare full updates. With such a setup, if the data walk blocks, e.g. because the device does not respond quickly enough or due to coding errors in NUT, the main loop would not iterate in time to ping the watchdog, and the driver daemon would get killed and restarted by the framework. The timeout should be generous enough to allow for lags, but not so generous that the UPS monitoring ability is lost indefinitely (e.g. when a device got disconnected and a restart as root is needed to reconnect).
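The watchdog arithmetic described above could look roughly like the following sketch. It assumes the driver knows an estimated walk time and inter-update sleep in seconds; the helper names and the parameters walk_sec, sleep_sec and factor are hypothetical and chosen only for this example. The half-interval ping cadence follows the recommendation in the systemd sd_watchdog_enabled documentation.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: watchdog timeout as `factor` times the expected
 * update cycle (data walk time plus sleep between walks), in microseconds. */
static unsigned long long watchdog_timeout_usec(unsigned int walk_sec,
                                                unsigned int sleep_sec,
                                                unsigned int factor)
{
    unsigned long long cycle_sec =
        (unsigned long long)walk_sec + (unsigned long long)sleep_sec;
    return cycle_sec * factor * 1000000ULL;
}

/* Ping at half the watchdog timeout, so that one late main-loop
 * iteration does not immediately get the daemon killed. */
static unsigned long long watchdog_ping_interval_usec(unsigned long long timeout_usec)
{
    return timeout_usec / 2;
}

/* Build the "WATCHDOG_USEC=<usec>" message that resets the watchdog
 * timeout at run-time. Returns 0 on success, -1 if buf is too small. */
static int format_watchdog_usec(char *buf, size_t buflen,
                                unsigned long long timeout_usec)
{
    int n = snprintf(buf, buflen, "WATCHDOG_USEC=%llu", timeout_usec);
    return (n > 0 && (size_t)n < buflen) ? 0 : -1;
}
```

For example, a 10-second walk with a 2-second sleep and a factor of 3 yields a 36-second watchdog timeout and an 18-second ping interval. A driver with distinct partial and full walks could compute the value separately for each and re-send WATCHDOG_USEC before starting the slower one.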
With #1777 the foundations for this tech were laid in the codebase; however, WatchdogSec (for systemd units) was not pre-defined and thus is not enabled by default, leaving it as an end-user configuration (via drop-in tweaks). It was tested locally and found to work, though.