Enforcing max size for random samples #136134
Pull Request Overview
This PR implements maximum size enforcement for document sampling in Elasticsearch. It adds a new rejection counter for documents that exceed the configured size limit and prevents samples from growing beyond the specified maximum size.
- Added size-based filtering logic to prevent documents from being added when they would exceed the configured maximum size
- Introduced a new counter, samplesRejectedForSize, to track documents rejected due to size constraints
- Implemented efficient size calculation for sample collections with caching to optimize performance
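The behavior described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in SamplingService.java: the class name `BoundedSample`, its methods, and the use of raw `byte[]` length as the document size are all assumptions made for the example. Only the counter name samplesRejectedForSize comes from the PR. The "cached" size is modeled here as a running total that is updated on every accepted document, which is one possible way to avoid re-summing the collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical size-bounded sample buffer; names and structure are illustrative only. */
class BoundedSample {
    private final long maxSizeInBytes;
    private final List<byte[]> samples = new ArrayList<>();
    // Running total of accepted bytes, so the size is never recomputed from scratch.
    private long cachedSizeInBytes = 0;
    // Counts documents turned away because they would push the sample over the limit.
    private final LongAdder samplesRejectedForSize = new LongAdder();

    BoundedSample(long maxSizeInBytes) {
        this.maxSizeInBytes = maxSizeInBytes;
    }

    /** Adds the document if it fits; otherwise records a size rejection. */
    synchronized boolean offer(byte[] source) {
        if (cachedSizeInBytes + source.length > maxSizeInBytes) {
            samplesRejectedForSize.increment();
            return false;
        }
        samples.add(source);
        cachedSizeInBytes += source.length;
        return true;
    }

    synchronized long sizeInBytes() {
        return cachedSizeInBytes;
    }

    synchronized int count() {
        return samples.size();
    }

    long getSamplesRejectedForSize() {
        return samplesRejectedForSize.sum();
    }
}
```

In this sketch a rejected document leaves the sample untouched, so a later, smaller document can still be accepted if it fits in the remaining space. Whether the real implementation behaves that way is not stated in this conversation.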
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
SamplingService.java | Core implementation of size enforcement logic and new rejection counter |
SamplingServiceTests.java | Added test coverage for maximum size enforcement behavior |
SamplingServiceSampleStatsTests.java | Updated test infrastructure to handle new rejection counter |
GetSampleStatsActionResponseTests.java | Updated test data generation for new stats field |
GetSampleStatsActionNodeResponseTests.java | Updated test data generation for new stats field |
…java Co-authored-by: Copilot <[email protected]>
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Pinging @elastic/es-data-management (Team:Data Management)
Looks good to me overall, just left a few questions I'd like to discuss
This enforces the max size on the sampling config, and includes a new counter for documents rejected due to having exceeded the max size.
Note: I'm changing the serialization of SampleStats without including a new TransportVersion. This is safe because this object is not reachable in any released version, and is unavailable on serverless.
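For context on why a wire-format change normally comes with a version bump, here is a generic sketch using only `java.io`. It does not use Elasticsearch's actual serialization classes, and `StatsWire`, `Stats`, and `VERSION_WITH_SIZE_REJECTIONS` are hypothetical names invented for the example. When peers on an older format might exist, the writer and reader gate the new field on the peer's version. The note above argues that this gate can be skipped here because no released version can send or receive the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical version-gated wire format; not Elasticsearch's serialization API. */
final class StatsWire {
    // Assumed version number at which the new counter was added to the wire format.
    static final int VERSION_WITH_SIZE_REJECTIONS = 2;

    record Stats(long samples, long samplesRejectedForSize) {}

    /** Writes the new field only when the peer is new enough to expect it. */
    static byte[] write(Stats stats, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(stats.samples());
            if (peerVersion >= VERSION_WITH_SIZE_REJECTIONS) {
                out.writeLong(stats.samplesRejectedForSize());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the new field only when the peer wrote it; older peers default to zero. */
    static Stats read(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long samples = in.readLong();
            long rejected = peerVersion >= VERSION_WITH_SIZE_REJECTIONS ? in.readLong() : 0L;
            return new Stats(samples, rejected);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If every node that can ever exchange the object already uses the new layout, both branches collapse to the unconditional write and read, which is the situation the note describes.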