Liveness agent state #9673
…e and remove unused vars
…r (already handled)
This pull request does not have a backport label. Could you fix it @nkvoll? 🙏
@@ -76,12 +77,10 @@ func livenessHandler(coord CoordinatorState) func(http.ResponseWriter, *http.Req
return fmt.Errorf("error handling form values: %w", err)
}

// if user has requested `coordinator` mode, just revert to that, skip everything else
if !failConfig.Degraded && !failConfig.Failed && failConfig.Heartbeat {
I removed this part as it's already covered by lines 70-73, without the encapsulating if-statement.
From my testing, if this is the startup state of the agent, it doesn't seem to start any components, but if the configuration is edited while the agent is running, it keeps all existing components as-is. This makes me wonder whether what currently happens in the liveness endpoint should be happening in the readiness endpoint instead. Worth discussing? /cc @cmacknz @blakerouse
💛 Build succeeded, but was flaky
Failed CI Steps
cc @nkvoll
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
What does this PR do?
This PR adds the aggregated status of the agent node to the liveness health check.
As a bonus, it also adds status code assertions to the tests, which were missing before (all liveness/readiness tests were passing without any assertions).
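To illustrate the idea, here is a minimal sketch of a liveness handler that takes an aggregated state into account, plus a helper that checks the resulting status code. It is not the code from this PR: `AgentState`, the `state` callback, and `probeStatus` are hypothetical names, and the rule that `Degraded` also fails on a failed state is an assumption. Only the `failConfig.Degraded` and `failConfig.Failed` field names come from the diff hunk in the review.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// AgentState is a hypothetical stand-in for the agent's aggregated state.
type AgentState int

const (
	Healthy AgentState = iota
	Degraded
	Failed
)

// failConfig reuses the field names visible in the diff; the semantics here are assumed.
type failConfig struct {
	Degraded bool // fail the probe on a degraded or failed state (assumption)
	Failed   bool // fail the probe on a failed state only (assumption)
}

// livenessHandler answers 500 when the aggregated state matches a condition
// the caller asked to fail on, and 200 otherwise.
func livenessHandler(state func() AgentState, cfg failConfig) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		s := state()
		unhealthy := (s == Failed && (cfg.Failed || cfg.Degraded)) ||
			(s == Degraded && cfg.Degraded)
		if unhealthy {
			w.WriteHeader(http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// probeStatus sends one request to the handler and returns the HTTP status code,
// which is the value a test should assert on.
func probeStatus(s AgentState, cfg failConfig) int {
	h := livenessHandler(func() AgentState { return s }, cfg)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/liveness", nil))
	return rec.Code
}

func main() {
	fmt.Println(probeStatus(Healthy, failConfig{Failed: true}))
	fmt.Println(probeStatus(Failed, failConfig{Failed: true}))
}
```

Asserting on the recorded status code, rather than only checking that the handler returned, is what keeps such a test from passing vacuously.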
Why is it important?
Checklist
./changelog/fragments using the changelog tool
Disruptive User Impact
Liveness probes will now fail if the configuration is invalid, likely causing the container to be restarted (see https://kubernetes.io/docs/concepts/configuration/liveness-readiness-startup-probes/#liveness-probe).
How to test this PR locally
elastic-agent.yml file with an invalid output, i.e. set use_output: nonexistent
elastic-agent status
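Besides `elastic-agent status`, the probe result can be checked over HTTP the way an orchestrator would. The sketch below is only illustrative: `checkLiveness` and `fakeAgentStatus` are hypothetical helpers, the `/liveness` path is an assumption, and the stand-in server replaces a real agent so the example runs on its own. Point `checkLiveness` at whatever address your agent's monitoring endpoint actually listens on.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// checkLiveness GETs the given URL and returns the HTTP status code.
// A liveness probe is generally treated as passing on a 2xx response.
func checkLiveness(url string) (int, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// fakeAgentStatus starts a stand-in server that always answers with the given
// code and returns what checkLiveness observes (-1 on a transport error).
func fakeAgentStatus(code int) int {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(code)
	}))
	defer srv.Close()

	got, err := checkLiveness(srv.URL + "/liveness") // path is an assumption
	if err != nil {
		return -1
	}
	return got
}

func main() {
	// Simulates an agent whose liveness check fails, e.g. due to invalid configuration.
	fmt.Println(fakeAgentStatus(http.StatusInternalServerError))
}
```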
Related issues