[8.19] (backport #9612) Add /readiness and /liveness when enrolling with the container #9648
base: 8.19
* Enable health checking pre-enroll in container start-up path. Add readiness endpoint.
* Fix func signature.
* Fix tests.
* Add changelog entry.
* Fix imports.
* Apply suggestion from @ycombinator

Co-authored-by: Shaunak Kashyap <[email protected]>

---------

Co-authored-by: Shaunak Kashyap <[email protected]>

(cherry picked from commit c028f68)

# Conflicts:
#	internal/pkg/agent/application/monitoring/server_test.go
#	internal/pkg/agent/cmd/container.go
Cherry-pick of c028f68 has failed:
To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
This pull request has not been merged yet. Could you please review and merge it @blakerouse? 🙏
What does this PR do?
Starts the monitoring endpoint when Elastic Agent is running as a container and has enrollment enabled. This allows the healthchecks in Kubernetes to succeed while the agent is still enrolling into Fleet. If enrollment into Fleet takes longer than the k8s healthchecks allow, the pod may not be reported as healthy, which can cause it to be killed before it ever enrolls.
This also adds a simple `/readiness` endpoint, so that healthchecks can distinguish between the agent being ready and being alive.
Why is it important?
This ensures that healthchecks do not prevent enrollment from actually working. If enrollment itself fails, the container still restarts, so that behavior is unchanged.
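As a rough illustration of the ready/alive distinction described above, here is a minimal Go sketch of two HTTP handlers. It is not the code from this PR. The `health` type, `newHealthMux`, and `probe` are hypothetical names, and the choice to have `/readiness` return 503 until enrollment completes is an assumption that the description does not confirm.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// health is a hypothetical state holder; it is not a type from elastic-agent.
type health struct {
	enrolled atomic.Bool // assumed to be set once enrollment has finished
}

// liveness answers 200 as long as the process can serve HTTP at all.
func (h *health) liveness(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readiness answers 503 until enrollment has completed (an assumption of
// this sketch, not behavior confirmed by the PR description).
func (h *health) readiness(w http.ResponseWriter, _ *http.Request) {
	if !h.enrolled.Load() {
		http.Error(w, "enrolling", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// newHealthMux registers both endpoints on a fresh mux.
func newHealthMux(h *health) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/liveness", h.liveness)
	mux.HandleFunc("/readiness", h.readiness)
	return mux
}

// probe issues an in-memory GET and returns the status code, standing in
// for a Kubernetes HTTP probe.
func probe(handler http.Handler, path string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	handler.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := &health{}
	mux := newHealthMux(h)

	// While "enrolling": alive but not ready.
	fmt.Println(probe(mux, "/liveness"), probe(mux, "/readiness")) // prints "200 503"

	// After enrollment finishes: alive and ready.
	h.enrolled.Store(true)
	fmt.Println(probe(mux, "/liveness"), probe(mux, "/readiness")) // prints "200 200"
}
```

With endpoints shaped like this, a liveness probe would keep passing during a slow enrollment, so the pod is not killed, while a readiness probe could still report that the agent is not yet doing useful work.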
Checklist
[ ] I have made corresponding changes to the documentation
[ ] I have made corresponding change to the default configuration files
./changelog/fragments using the changelog tool
[ ] I have added an integration test or an E2E test
Disruptive User Impact
None
How to test this PR locally
Related issues
This is an automatic backport of pull request #9612 done by [Mergify](https://mergify.com).