Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
a30e440
Add functionality to notify systemd on olad startup and config reload
shenghaoyang Jul 12, 2018
a5e57fe
Add systemd notification support to olad v2
shenghaoyang Jul 14, 2018
183ac8d
Fix license and doxygen warnings.
shenghaoyang Jul 14, 2018
693fe33
Merge branch 'master' into add_notify
shenghaoyang Jul 15, 2018
0257949
Merge branch 'master' into add_notify
shenghaoyang Jul 16, 2018
50c9bda
Style fixups for commits to add systemd notification functionality
shenghaoyang Jul 16, 2018
29556d0
Add additional checks and error messages when configuring for systemd
shenghaoyang Jul 16, 2018
b7afd38
Style changes for function names and log messages
shenghaoyang Jul 25, 2018
83807d4
Enable systemd support by default on supported build systems
shenghaoyang Jul 25, 2018
469361d
Add comment clarifying usage of strerror_r
shenghaoyang Jul 25, 2018
f15b385
Add a declaration for XSI-compliant strerror_r
shenghaoyang Jul 25, 2018
d674930
Fix formatting issues in log messages
shenghaoyang Jul 28, 2018
bf8deff
Move strerror_r wrapper into utility library
shenghaoyang Jul 28, 2018
175b393
Add new wrapper function for strerror_r and update naming
shenghaoyang Aug 9, 2018
17d917c
Merge branch 'master' into add_notify
shenghaoyang Aug 9, 2018
1cd1202
Refactor StrError_R() and fix doxygen references.
shenghaoyang Aug 10, 2018
64400b8
Merge branch 'master' into add_notify
shenghaoyang Aug 11, 2018
d62a8f6
Fix formatting, style, and spelling, refactor StrError_R
shenghaoyang Aug 12, 2018
ff34724
Add config error on being unable to explicitly enable systemd support
shenghaoyang Aug 12, 2018
f6f5813
Switch to internal ola library function for int to string conversion
shenghaoyang Aug 12, 2018
15efd6f
Merge branch 'master' into add_notify
shenghaoyang Aug 19, 2018
18735c4
Merge branch 'master' into add_notify
shenghaoyang Aug 25, 2018
a21f0a4
Add systemd unit file for running olad with systemd in user mode
shenghaoyang Aug 25, 2018
c714ee8
Add more documentation URLs to the unit file for the user instance.
shenghaoyang Sep 10, 2018
527b115
Merge branch 'master' into add_notify
shenghaoyang Sep 11, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions configure.ac
Original file line number Diff line number Diff line change
Expand Up @@ -387,6 +387,13 @@ esac
have_libftd2xx="no"
AM_CONDITIONAL([HAVE_LIBFTD2XX], [test "x$have_libftd2xx" = xyes])

# libsystemd
AC_SEARCH_LIBS([sd_notify], [systemd], [have_libsystemd="yes"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you see have_libftd2xx above, you should really initialise it to no before you use it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. I'll follow the examples from the above location.

What do you think about making linking against libsystemd a command line option to the build system? I'm thinking build systems that have libsystemd may silently cause built binaries to link against libsystemd, and this could lead to a bit of a dependency problem, unless they're aware of this new change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it may make sense, see e.g. https://github.com/OpenLightingProject/ola/blob/master/configure.ac#L494-L518 .

I was wondering, but at least it's only libsystemd, it doesn't require systemd itself.

AC_CHECK_HEADER([systemd/sd-daemon.h], [], [have_libsystemd="no"])

AS_IF([test "x$have_libsystemd" = xyes],
[AC_DEFINE([HAVE_LIBSYSTEMD], [1], [define if libsystemd is installed])])

# ncurses
AC_CHECK_LIB([ncurses], [initscr], [have_ncurses="yes"])
AM_CONDITIONAL([HAVE_NCURSES], [test "x$have_ncurses" = xyes])
Expand Down
12 changes: 12 additions & 0 deletions olad/OlaServer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,10 @@
#include <config.h>
#endif // HAVE_CONFIG_H

#ifdef HAVE_LIBSYSTEMD
#include <systemd/sd-daemon.h>
#endif // HAVE_LIBSYSTEMD

#include <errno.h>
#include <signal.h>
#include <stdio.h>
Expand Down Expand Up @@ -472,9 +476,17 @@ bool OlaServer::InternalNewConnection(
}

void OlaServer::ReloadPluginsInternal() {
#ifdef HAVE_LIBSYSTEMD
// Return value is intentionally not checked for both calls.
// See return value section under sd_notify(3).
sd_notify(0, "RELOADING=1\nSTATUS=Reloading plugins\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having read the man page, I understand why it's not checked, but for user sanity, we should add some logging. Probably both to signify sd_notify has been run/the return code and particularly a warning if it failed.

I'd probably suggest OLA_WARN if we get <0 return code, and then maybe OLA_INFO for 0 and OLA_DEBUG for >0, but that may be a bit excessive. Likewise others may argue those log levels should all be reduced by one...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the system is too old to support RELOADING? I assume they just get discarded as they're plain text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the system is too old to support RELOADING? I assume they just get discarded
as they're plain text?

sd_notify(3) mentions this: The string may contain any kind of variable assignments, but the following shall be considered well-known.

I think that does indeed imply that if the service manager doesn't recognize the variable being assigned to, it'll either log that somewhere or ignore that assignment, so we shouldn't have a problem with old systems.

I understand why it's not checked, but for user sanity, we should add some logging. Probably
both to signify sd_notify has been run/the return code and particularly a warning if it failed.

If that's the case, would a message at startup do? libsystemd provides capabilities for determining whether a binary has been setup under the control of systemd, and that could be used to indicate that notification messages are being used. As for the failures, though, we could log those as well, but I think conditionally - only when being run under systemd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that, there's no library function to determine whether the service has been run under systemd, but the call shouldn't return a negative value if the socket was not passed, anyway. I'll log failures and log from main to indicate whether notifications are being passed.

#endif // HAVE_LIBSYSTEMD
OLA_INFO << "Reloading plugins";
StopPlugins();
m_plugin_manager->LoadAll();
#ifdef HAVE_LIBSYSTEMD
sd_notify(0, "READY=1\nSTATUS=Plugin reload complete\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, some logging to help the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

#endif // HAVE_LIBSYSTEMD
}

void OlaServer::UpdatePidStore(const RootPidStore *pid_store) {
Expand Down
9 changes: 9 additions & 0 deletions olad/Olad.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,10 @@
#include <config.h>
#endif // HAVE_CONFIG_H

#if HAVE_LIBSYSTEMD
#include <systemd/sd-daemon.h>
#endif // HAVE_LIBSYSTEMD

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
Expand Down Expand Up @@ -171,6 +175,11 @@ int main(int argc, char *argv[]) {
ola::NewCallback(olad->GetOlaServer(), &ola::OlaServer::ReloadPlugins));
#endif // _WIN32

#if HAVE_LIBSYSTEMD
// Return value is intentionally not checked. See return value section
// under sd_notify(3).
sd_notify(0, "READY=1\nSTATUS=Startup complete\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, a bit of logging please.

Copy link
Contributor Author

@shenghaoyang shenghaoyang Jul 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I don't think we need it for this one, though. It could be useful to tell the user that we're not running under systemd, or that we have failed to access the notification socket due to a configuration error.

However, if olad was launched as a systemd service of type notify (which is the intended use case): then systemd will report that the start of the service failed, anyway, which should alert the user to the error. There also shouldn't be an error accesing the notification socket, because systemd implicitly allows services with type=notify to access the notification socket.

A failure looks like this:

● olad.service - ola daemon
   Loaded: loaded (/home/shenghao/.config/systemd/user/olad.service; static; vendor preset: enabled)
   Active: failed (Result: timeout) since Sat 2018-07-14 11:43:42 +08; 9s ago
  Process: 25379 ExecStart=/home/shenghao/Projects/ola/olad/olad (code=exited, status=0/SUCCESS)
 Main PID: 25379 (code=exited, status=0/SUCCESS)

Jul 14 11:42:13 shenghao-crbook15 olad[25379]: *** WARNING *** Please fix your application to use the native API of Avahi!
Jul 14 11:42:13 shenghao-crbook15 olad[25379]: *** WARNING *** For more information see <http://0pointer.de/blog/projects/avahi-compat.html>
Jul 14 11:42:13 shenghao-crbook15 lt-olad[25379]: *** WARNING *** For more information see <http://0pointer.de/blog/projects/avahi-compat.html>
Jul 14 11:42:13 shenghao-crbook15 olad[25379]: common/io/IOUtils.cpp:39: open(/dev/dmx0): No such file or directory
Jul 14 11:42:13 shenghao-crbook15 olad[25379]: plugins/opendmx/OpenDmxPlugin.cpp:81: Could not open /dev/dmx0 No such file or directory
Jul 14 11:42:13 shenghao-crbook15 olad[25379]: common/io/IOUtils.cpp:39: open(/dev/kldmx0): No such file or directory
Jul 14 11:42:13 shenghao-crbook15 olad[25379]: plugins/karate/KaratePlugin.cpp:80: Could not open /dev/kldmx0 No such file or directory
Jul 14 11:43:42 shenghao-crbook15 systemd[3896]: olad.service: Start operation timed out. Terminating.
Jul 14 11:43:42 shenghao-crbook15 systemd[3896]: olad.service: Failed with result 'timeout'.
Jul 14 11:43:42 shenghao-crbook15 systemd[3896]: Failed to start ola daemon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's those edge cases that can be frustrating to troubleshoot though. If I forget to put type=notify, I want olad to tell me the socket isn't available, so I know to tweak my systemd config. Likewise in that timeout case, I'd like to know olad did/didn't send the message so I can troubleshoot ifdef type stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've added this in commit a5e57fe. A success/fail message everytime may clog up the logs if we want to use notify() a little more, though....

#endif // HAVE_LIBSYSTEMD
olad->Run();
return ola::EXIT_OK;
}