NUT services vs. watchdogs and other timeouts: self-configure where possible #1791

@jimklimov

Description

It seems that running systemd units can directly tell the framework what sort of timeouts to expect at run-time. Possibly other service management frameworks have similar features.
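For systemd, the facility in question is the `sd_notify(3)` datagram protocol: a service writes newline-separated `KEY=value` pairs to the socket named in `$NOTIFY_SOCKET`. The documented keys include `EXTEND_TIMEOUT_USEC=`, `WATCHDOG_USEC=` and `WATCHDOG=1`. Below is a minimal sketch of that protocol, written in Python for brevity. It is not NUT code (NUT is C), the helper names `build_state` and `notify` are made up for illustration, and the numeric values are placeholders.

```python
import os
import socket


def build_state(**fields: object) -> str:
    """Join KEY=value pairs into a newline-separated sd_notify payload."""
    return "\n".join(f"{key}={value}" for key, value in fields.items())


def notify(state: str) -> bool:
    """Send `state` to the datagram socket named by $NOTIFY_SOCKET.

    Returns False without sending anything when the variable is unset,
    i.e. when not started by a service manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket (NUL-prefixed).
    address = b"\0" + path[1:].encode() if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), address)
    return True


# Placeholder values, not taken from NUT:
# ask for 5 more minutes before the start timeout hits ...
STARTUP_EXTENSION = build_state(EXTEND_TIMEOUT_USEC=300 * 1_000_000)
# ... later retune the runtime watchdog to 90 seconds ...
WATCHDOG_RETUNE = build_state(WATCHDOG_USEC=90 * 1_000_000)
# ... and ping it from the main loop.
WATCHDOG_PING = build_state(WATCHDOG=1)
```

A driver would call something like `notify(STARTUP_EXTENSION)` repeatedly during a long initial walk and `notify(WATCHDOG_PING)` once per main-loop iteration. Outside of systemd the calls are no-ops.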

This RFE is about using that facility to tune the timeouts to current circumstances. For startup, for example: if ups.conf says we are going to initial-walk a ton of slow SNMP UPSes for 5 minutes, nobody should have to fix unit definitions manually or semi-automatically (NDE) to reflect that. For watchdogs, the timeout could be derived as some 2x, 3x or 5x multiple of the expected upsdrv_update interval (covering partial/full data walk time and possible sleeps in between; possibly even with different timeouts for frequent partial vs. rare full updates). With that in place, if the data walk blocks, e.g. because the device does not respond quickly enough or due to coding errors in NUT, the main loop would not iterate in time to ping the watchdog, and the driver daemon would get killed and restarted by the framework. The timeout should be generous enough to allow for lags, but not so generous that UPS monitoring is lost indefinitely (e.g. when a device got disconnected and a restart as root is needed to reconnect).
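The arithmetic could look roughly like the sketch below (Python for brevity, not NUT code). All function and parameter names are hypothetical, and the assumptions are a sequential initial walk and a main loop of "one data walk plus one sleep". The half-interval ping follows the recommendation in `sd_watchdog_enabled(3)`.

```python
def watchdog_timeout_s(poll_interval_s: float, walk_time_s: float,
                       factor: float = 3.0) -> float:
    """Watchdog limit as `factor` times one main-loop iteration.

    One iteration is assumed to be a data walk followed by a sleep of
    `poll_interval_s`; `factor` is the 2x-5x safety multiple.
    """
    if poll_interval_s < 0 or walk_time_s < 0 or factor < 1:
        raise ValueError("durations must be >= 0 and factor >= 1")
    return factor * (walk_time_s + poll_interval_s)


def startup_timeout_s(device_count: int, initial_walk_s: float,
                      margin_s: float = 30.0) -> float:
    """Start timeout for walking `device_count` devices one after another."""
    if device_count < 0 or initial_walk_s < 0 or margin_s < 0:
        raise ValueError("inputs must be >= 0")
    return device_count * initial_walk_s + margin_s


def ping_interval_s(timeout_s: float) -> float:
    """Ping at half the watchdog limit, as sd_watchdog_enabled(3) suggests."""
    return timeout_s / 2
```

With an 8 s walk and a 2 s sleep, `watchdog_timeout_s(2, 8)` gives 30 s, so the loop has three full iterations of slack before the framework steps in. Rare full walks could use a larger `walk_time_s` than frequent partial ones.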

With #1777 the foundations for this were laid in the codebase; however, WatchdogSec (for systemd units) was not pre-defined and is thus not enabled by default, which leaves it to end-user configuration (via drop-in tweaks). It was tested locally and did work, though.
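For reference, such a drop-in tweak could look like the fragment below. The unit name, file path and the 90-second value are assumptions for illustration, not defaults shipped by NUT.

```ini
# Hypothetical path: /etc/systemd/system/nut-driver@.service.d/watchdog.conf
[Service]
# Placeholder value; should exceed the longest expected main-loop iteration.
WatchdogSec=90
```

After `systemctl daemon-reload` and a restart of the unit, systemd exports `WATCHDOG_USEC` to the service. Whether a watchdog failure leads to a restart depends on the unit's `Restart=` setting.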

Metadata

Assignees

No one assigned

    Labels

    enhancement
    service/daemon start/stop: General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug
    systemd
