Skip to content

Conversation

@aarnq
Copy link
Contributor

@aarnq aarnq commented Jul 10, 2024

Warning

This is a public repository, ensure not to disclose:

  • personal data beyond what is necessary for interacting with this pull request, nor
  • business confidential information, such as customer names.

What kind of PR is this?

Required: Mark one of the following that is applicable:

  • kind/feature
  • kind/improvement
  • kind/deprecation
  • kind/documentation
  • kind/clean-up
  • kind/bug
  • kind/other

Optional: Mark one or more of the following that are applicable:

Important

Breaking changes should be marked kind/admin-change or kind/dev-change depending on type
Critical security fixes should be marked with kind/security

  • kind/admin-change
  • kind/dev-change
  • kind/security
  • kind/adr

What does this PR do / why do we need this PR?

This changes the index over size limit alert to only consider indices with a positive document rate over the last hour, meaning it will eventually stop alerting for indices that are inactive such as after a successful rollover.

Information to reviewers

I haven't had the time to test it yet.

Checklist

  • Proper commit message prefix on all commits
  • Change checks:
    • The change is transparent
    • The change is disruptive
    • The change requires no migration steps
    • The change requires migration steps
    • The change upgrades CRDs
    • The change updates the config and the schema
  • Metrics checks:
    • The metrics are still exposed and present in Grafana after the change
    • The metrics names didn't change (Grafana dashboards and Prometheus alerts are not affected)
    • The metrics names did change (Grafana dashboards and Prometheus alerts were fixed)
  • Logs checks:
    • The logs do not show any errors after the change
  • Pod Security Policy checks:
    • Any changed pod is covered by Pod Security Admission
    • Any changed pod is covered by Gatekeeper Pod Security Policies
    • The change does not cause any pods to be blocked by Pod Security Admission or Policies
  • Network Policy checks:
    • Any changed pod is covered by Network Policies
    • The change does not cause any dropped packets in the NetworkPolicy Dashboard
  • Audit checks:
    • The change does not cause any unnecessary Kubernetes audit events
    • The change requires changes to Kubernetes audit policy
  • Falco checks:
    • The change does not cause any alerts to be generated by Falco
  • Bug checks:
    • The bug fix is covered by regression tests

@aarnq aarnq added the kind/improvement Improvement of existing features, e.g. code cleanup or optimizations. label Jul 10, 2024
@aarnq aarnq self-assigned this Jul 10, 2024
@aarnq
Copy link
Contributor Author

aarnq commented Dec 13, 2024

@elastisys/goto-logging-stack Would anyone of you like to take over this idea to make the alerts only report indices that are actively written to?
I have not the time to properly test it.

@lunkan93
Copy link
Contributor

Do you have time for this @salehsedghpour ? I might have time next week, but otherwise I wont have time until January, if not I can take it anyway.

@salehsedghpour
Copy link

Yeah sure, I'll test it, I might have time for it next week.

@salehsedghpour salehsedghpour self-assigned this Dec 13, 2024
@salehsedghpour
Copy link

I've tested it and it works perfectly. I believe the use of elasticsearch_indices_docs_primary instead of elasticsearch_indices_docs_total would be more precise, as the primary shard should get the and validate the values and then forward to other replicas. What do you think @lunkan93?

@lunkan93
Copy link
Contributor

I've tested it and it works perfectly. I believe the use of elasticsearch_indices_docs_primary instead of elasticsearch_indices_docs_total would be more precise, as the primary shard should get the and validate the values and then forward to other replicas. What do you think @lunkan93?

Nice! 👍

I agree that it makes more sense to use elasticsearch_indices_docs_primary 😄

@salehsedghpour salehsedghpour marked this pull request as ready for review December 17, 2024 08:51
@salehsedghpour salehsedghpour requested a review from a team as a code owner December 17, 2024 08:51
@salehsedghpour salehsedghpour assigned aarnq and unassigned aarnq Dec 17, 2024
@lunkan93 lunkan93 force-pushed the aarnq/opensearch-active-index-size-alert branch from 337deb3 to ec90572 Compare January 7, 2025 13:58
Copy link
Contributor Author

@aarnq aarnq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, cannot approve though as I opened it 😅

@lunkan93 lunkan93 force-pushed the aarnq/opensearch-active-index-size-alert branch from ec90572 to c151444 Compare January 8, 2025 08:12
@lunkan93 lunkan93 merged commit 8c3fc63 into main Jan 8, 2025
12 checks passed
@lunkan93 lunkan93 deleted the aarnq/opensearch-active-index-size-alert branch January 8, 2025 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/improvement Improvement of existing features, e.g. code cleanup or optimizations.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants