Improve AutoOpsAgentPolicy Status Reporting #9095
Open
moukoublen wants to merge 1 commit into elastic:main from
Improve AutoOpsAgentPolicy Status Reporting
Summary
`kubectl get autoopsagentpolicies` changed to add more information.
New Status Fields
Skipped (int)
The number of Elasticsearch resources that are skipped from monitoring due to RBAC permission issues. When the operator is configured with `--enforce-rbac-on-refs` and the specified `serviceAccountName` lacks permission to access an Elasticsearch resource in a different namespace, that resource is counted as skipped rather than errored.
Example: `skipped: 2` indicates 2 Elasticsearch clusters couldn't be monitored due to insufficient RBAC permissions.
ReadyCount (string)
A human-readable string showing the ratio of ready monitored resources to total monitored resources in the format `Ready/Resources`. This provides an at-a-glance view of the policy's health without needing to compare separate numeric fields.
Example: `readyCount: "3/5"` indicates 3 out of 5 matched Elasticsearch clusters have healthy AutoOps agents deployed.
Message (string)
A human-readable summary of the current status, combining information about ready resources, errors, and skipped resources into a single descriptive string. The message is dynamically generated based on non-zero counts.
Examples:
- `"3 resource ready"` - all resources healthy
- `"2 resource ready, 1 error"` - partial success with errors
- `"1 resource ready, 1 error, 2 skipped due to RBAC"` - mixed status with RBAC issues
Details (map[string]ResourceStatus)
A map providing per-resource status information, keyed by resource identifier in the format `namespace/name`. Only resources with non-ready states (errors or skipped) are included in this map to keep the status lightweight. Each entry contains:
- `Phase` (ResourcePhase): Either `"Error"` or `"Skipped"`
- `Message` (string): Human-readable explanation, set for skipped resources (e.g., `"RBAC access denied for service account my-sa"`)
- `Error` (string): Detailed error information, set only for error states (e.g., `"Failed to create AutoOps ES CA secret: secret not found"`)
New Types
ResourceStatus (struct)
A lightweight struct for per-resource status information:
- `Phase`: The resource phase (`Error` or `Skipped`)
- `Message`: Human-readable explanation (only for non-ready states)
- `Error`: Error details (only for error states)
ResourcePhase (string)
An enumeration for resource-level phases:
- `ErrorResourcePhase` (`"Error"`): Resource reconciliation failed
- `SkippedResourcePhase` (`"Skipped"`): Resource skipped due to RBAC
Other Changes
Renamed Method
- `CalculateFinalPhase()` → `Finalize()`: Now also generates the human-readable `Message` and `ReadyCount` fields at the end of reconciliation
Enhanced Error Tracking
Replaced generic `MarkResourceError()` calls with specific error methods that capture the failing operation:
- `ResourceRBACError(es)`: RBAC access denied
- `ResourceError(es, message, err)`: Captures specific failures for CA secret, API key, config map, and deployment operations
`kubectl get autoopsagentpolicies` Examples: