Skip to content

Add support for graceful termination#20

Open
spectrumjade wants to merge 2 commits intoslackhq:masterfrom
spectrumjade:graceful_termination
Open

Add support for graceful termination#20
spectrumjade wants to merge 2 commits intoslackhq:masterfrom
spectrumjade:graceful_termination

Conversation

@spectrumjade
Copy link
Copy Markdown
Contributor

  • I've read and understood the Contributing guidelines and have done my best effort to follow them.
  • I've read and agree to the Code of Conduct.
  • I've been mindful about doing atomic commits, adding documentation to my changes, not refactoring too much.
  • I've a descriptive title and added any useful information for the reviewer. Where appropriate, I've attached a screenshot and/or screencast (gif preferrably).
  • I've written tests to cover the new code and functionality included in this PR.
  • I've read, agree to, and signed the Contributor License Agreement (CLA).

PR Summary

This change adds a signal handler that responds to SIGINT, SIGTERM, and SIGQUIT. It will unregister itself with kaudit and close the netlink socket (similar to what vanilla auditd would do). It will also cause the main event loop to end and gracefully exit with a 0 status code.

I've noticed that without this, the system console would get some error messages for the first audit log generated after go-audit exited. Setting the audit PID to 0 seems to be the way to effectively unregister the userspace process.

@codecov-io
Copy link
Copy Markdown

codecov-io commented Jan 30, 2017

Codecov Report

Merging #20 into master will increase coverage by -1.12%.

@@            Coverage Diff             @@
##           master      #20      +/-   ##
==========================================
- Coverage   67.04%   65.93%   -1.12%     
==========================================
  Files           5        5              
  Lines         437      455      +18     
==========================================
+ Hits          293      300       +7     
- Misses        131      141      +10     
- Partials       13       14       +1
Impacted Files Coverage Δ
audit.go 54.24% <ø> (-2.41%)
client.go 90.41% <83.33%> (-1.78%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 752b335...039e07b. Read the comment docs.

@nbrownus
Copy link
Copy Markdown
Collaborator

I like the idea of this but I think we can do it without the loop condition. NetlinkClient.keepConnection could be stop taking the socket and we call NetlinkClient.Close when the signal comes. The signal goroutine could then sleep for a brief moment and then exit the process.

@spectrumjade
Copy link
Copy Markdown
Contributor Author

Hey @nbrownus - I'll prepare a change to remove the loop from the signal handler goroutine. What I found is that, when the socket is closed (or perhaps it's when unsetting the PID), the netlink client receives a confirmation packet, which causes the main loop to iterate once and then bail because process_events is now false. To me, that seems cleaner than having the goroutine perform the exit.

Incidentally, I'm curious as to why there is a loop around KeepConnection(). Shouldn't it be sufficient to set the audit PID once, upon startup?

@nbrownus
Copy link
Copy Markdown
Collaborator

The reason for the loop is that until recently there could only be one process listening for audit events. We had some trouble with auditd coming back to life and snatching the socket from us, in some of our kernels there was no indication that we lost the socket. That loop was our solution to keeping the socket. Pair that with the lost message notifications and you can figure out that there is contention on the socket and investigate.

I am hesitant about the loop condition because on a busy system it will probably become noticeable. It certainly is more readable the way you have it now though.

@spectrumjade
Copy link
Copy Markdown
Contributor Author

Ah, I see. I think I misunderstood you originally.

The reason I didn't do it that way originally is because the result would be an infinite loop once the socket is closed (which would print the Error during message receive error message for each iteration) until the goroutine finished waiting and triggered the exit.

Perhaps a better approach would be handling a socket close error in the main loop differently than other errors and exit the loop immediately, but this could mask an error case where the socket was closed unexpectedly (although I'm not sure how often this happens in reality). The current approach with a boolean conditional sidesteps that issue.

@spectrumjade
Copy link
Copy Markdown
Contributor Author

@nbrownus Any thoughts?

@spectrumjade
Copy link
Copy Markdown
Contributor Author

bump

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Jul 30, 2019

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants