@mergify mergify bot commented Aug 29, 2025

What does this PR do?

Starts the monitoring endpoint when Elastic Agent is running as a container and has enrollment enabled. This allows the Kubernetes healthchecks to succeed while the agent is still enrolling into Fleet. Without it, if enrollment into Fleet takes longer than the k8s healthchecks allow, the pod is not reported as healthy and can be killed before it ever enrolls.

This also adds a simple /readiness endpoint, so that "ready" and "alive" can be checked separately.
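
As a rough illustration of the idea (not the code in this PR), the sketch below starts an HTTP server that answers /liveness and /readiness before enrollment has finished. The `health` type, its methods, and the `probe` helper are hypothetical names. The policy that /readiness returns 503 until enrollment completes is an assumption made for the example; the PR description does not state what /readiness returns.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// health is a hypothetical holder for enrollment state; it is not a type
// from the Elastic Agent codebase.
type health struct {
	enrolled atomic.Bool
}

// livenessStatus reports that the process is up, regardless of enrollment.
func (h *health) livenessStatus() int {
	return http.StatusOK
}

// readinessStatus applies an ASSUMED policy: not ready until enrollment completes.
func (h *health) readinessStatus() int {
	if h.enrolled.Load() {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// mux wires both checks onto the paths used in the PR's test instructions.
func (h *health) mux() *http.ServeMux {
	m := http.NewServeMux()
	m.HandleFunc("/liveness", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(h.livenessStatus())
	})
	m.HandleFunc("/readiness", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(h.readinessStatus())
	})
	return m
}

// probe returns the HTTP status code for a GET on url.
func probe(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	h := &health{}
	// The server is listening before enrollment starts, so probes get answers.
	srv := httptest.NewServer(h.mux())
	defer srv.Close()

	for _, phase := range []string{"enrolling", "enrolled"} {
		if phase == "enrolled" {
			h.enrolled.Store(true) // stands in for a successful enrollment
		}
		for _, path := range []string{"/liveness", "/readiness"} {
			code, err := probe(srv.URL + path)
			if err != nil {
				fmt.Println("probe failed:", err)
				return
			}
			fmt.Println(phase, path, code)
		}
	}
}
```

The point of the sketch is only the ordering: the health server is reachable while enrollment is still in progress, so a liveness probe has something to hit during a slow enrollment.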

Why is it important?

This ensures that healthchecks do not prevent enrollment from actually working. If enrollment itself fails, the container still restarts, so that behavior is unchanged.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • [ ] I have added an integration test or an E2E test

Disruptive User Impact

None

How to test this PR locally

$ docker run -it --rm -p 5066:5066 -e FLEET_ENROLL=1 -e FLEET_URL=https://invalid-url:443 -e FLEET_ENROLLMENT_TOKEN=invalid-token ${build_image}
$ curl -v http://localhost:5066/readiness
$ curl -v http://localhost:5066/liveness

Related issues


This is an automatic backport of pull request #9612 done by [Mergify](https://mergify.com).

* Enable health checking pre-enroll in container start-up path. Add readiness endpoint.

* Fix func signature.

* Fix tests.

* Add changelog entry.

* Fix imports.

* Apply suggestion from @ycombinator

Co-authored-by: Shaunak Kashyap <[email protected]>

---------

Co-authored-by: Shaunak Kashyap <[email protected]>
(cherry picked from commit c028f68)

# Conflicts:
#	internal/pkg/agent/application/monitoring/server_test.go
#	internal/pkg/agent/cmd/container.go
@mergify mergify bot added the backport and conflicts labels Aug 29, 2025
@mergify mergify bot requested a review from a team as a code owner August 29, 2025 17:29
@mergify mergify bot requested review from ycombinator and removed request for a team August 29, 2025 17:29
@mergify mergify bot added the conflicts label Aug 29, 2025
@mergify mergify bot requested a review from pchila August 29, 2025 17:29
@mergify mergify bot added the backport label Aug 29, 2025

mergify bot commented Aug 29, 2025

Cherry-pick of c028f68 has failed:

On branch mergify/bp/8.19/pr-9612
Your branch is up to date with 'origin/8.19'.

You are currently cherry-picking commit c028f68fa.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   changelog/fragments/1756325090-Fix-missing-liveness-healthcheck-during-container-enrollment.yaml
	modified:   internal/pkg/agent/application/monitoring/liveness.go
	modified:   internal/pkg/agent/application/monitoring/liveness_test.go
	modified:   internal/pkg/agent/application/monitoring/process.go
	new file:   internal/pkg/agent/application/monitoring/readiness.go
	new file:   internal/pkg/agent/application/monitoring/readiness_test.go
	modified:   internal/pkg/agent/application/monitoring/server.go
	modified:   internal/pkg/agent/application/monitoring/v1_monitor.go
	modified:   internal/pkg/agent/cmd/run.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   internal/pkg/agent/application/monitoring/server_test.go
	both modified:   internal/pkg/agent/cmd/container.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@github-actions github-actions bot added the Team:Elastic-Agent-Control-Plane label Aug 29, 2025
@elasticmachine

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

mergify bot commented Sep 1, 2025

This pull request has not been merged yet. Could you please review and merge it @blakerouse? 🙏
