@masseyke
Member

Some parts of random sampling broke when the target was a data stream rather than an ordinary index. This PR fixes those cases and adds tests.

@masseyke masseyke requested a review from seanzatzdev October 28, 2025 15:17
@masseyke masseyke added the >non-issue, :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP), and v9.3.0 labels Oct 28, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management (Meta label for data/management team) label Oct 28, 2025
@masseyke masseyke requested a review from Copilot October 28, 2025 20:08
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request adds support for data streams in the sampling service. The changes enable sampling configurations to be applied to data streams in addition to regular indices, and ensure that sampling configurations are properly cleaned up when data streams are deleted.

Key changes:

  • New utility method to validate that a request targets either a data stream or a single existing index
  • Automatic cleanup of sampling configurations when data streams are deleted
  • Comprehensive test coverage for data stream operations including creation, deletion, and sampling
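
The first two key changes can be illustrated with a rough, self-contained sketch. This is not the code from the PR: `SamplingTargetSketch`, `resolveSingleTarget`, and `pruneConfigs` are hypothetical names, plain sets and maps stand in for cluster metadata, and rejecting wildcards and comma-separated lists is an assumption about what "a single existing index" means.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the PR itself.
class SamplingTargetSketch {

    // Accepts a name only if it is exactly one known data stream or one known index.
    // Wildcards and comma-separated lists are rejected (an assumption of this sketch).
    static String resolveSingleTarget(Set<String> dataStreams, Set<String> indices, String name) {
        if (name == null || name.isEmpty() || name.contains("*") || name.contains(",")) {
            throw new IllegalArgumentException("expected exactly one data stream or index, got [" + name + "]");
        }
        if (dataStreams.contains(name) || indices.contains(name)) {
            return name;
        }
        throw new IllegalArgumentException("no such data stream or index [" + name + "]");
    }

    // Returns a copy of the sampling configs (target name -> sampling rate) without
    // the entries whose target is neither a current data stream nor a current index.
    static Map<String, Double> pruneConfigs(Map<String, Double> configs, Set<String> dataStreams, Set<String> indices) {
        Map<String, Double> kept = new HashMap<>();
        for (Map.Entry<String, Double> entry : configs.entrySet()) {
            String target = entry.getKey();
            if (dataStreams.contains(target) || indices.contains(target)) {
                kept.put(target, entry.getValue());
            }
        }
        return kept;
    }
}
```

In this sketch, calling `pruneConfigs` after a data stream has been removed from `dataStreams` drops that stream's configuration, which mirrors the cleanup behavior described above.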

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| SamplingService.java | Added validation method for data streams/indices and cleanup logic for deleted data streams |
| TransportPutSampleConfigurationAction.java | Updated to use new validation method supporting data streams |
| TransportGetSampleStatsAction.java | Updated to use new validation method supporting data streams |
| TransportGetSampleAction.java | Updated to use new validation method supporting data streams |
| TransportDeleteSampleConfigurationAction.java | Updated to use new validation method supporting data streams |
| GetSampleStatsAction.java | Added includeDataStreams() override to enable data stream support |
| SamplingServiceIT.java | Added integration test for data stream deletion cleanup |
| 30_with_data_streams.yml (get_sample_stats) | Added comprehensive YAML test suite for sample stats with data streams |
| 30_with_data_streams.yml (get_sample) | Added comprehensive YAML test suite for getting samples from data streams |
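
The `includeDataStreams()` override listed for GetSampleStatsAction.java follows an opt-in pattern that can be sketched in isolation. Only the method name comes from the summary above; the `TargetRequest` interface, the two request classes, and `canResolve` are hypothetical stand-ins rather than Elasticsearch classes, and the default of `false` is an assumption of this sketch.

```java
import java.util.Set;

// Hypothetical stand-ins that illustrate the opt-in pattern only.
class DataStreamOptInSketch {

    interface TargetRequest {
        // Assumed default: a request does not resolve data streams unless it opts in.
        default boolean includeDataStreams() {
            return false;
        }
    }

    // A request type that keeps the default and therefore only sees indices.
    static final class IndexOnlyRequest implements TargetRequest {}

    // A request type that opts in, as the stats request is described as doing.
    static final class StatsRequest implements TargetRequest {
        @Override
        public boolean includeDataStreams() {
            return true;
        }
    }

    // An index name always resolves; a data stream name resolves only for opted-in requests.
    static boolean canResolve(TargetRequest request, String name, Set<String> dataStreams, Set<String> indices) {
        if (indices.contains(name)) {
            return true;
        }
        return request.includeDataStreams() && dataStreams.contains(name);
    }
}
```

Under these assumptions, a request that forgets the override fails to resolve a data stream name even though the stream exists, which is the kind of breakage the PR description mentions.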

Contributor

@seanzatzdev seanzatzdev left a comment

lgtm thanks

@masseyke masseyke merged commit caa029f into elastic:main Oct 28, 2025
34 checks passed
@masseyke masseyke deleted the random-sampling-support-data-streams branch October 28, 2025 22:33
elasticsearchmachine pushed a commit that referenced this pull request Oct 29, 2025
I failed to merge in main before merging #137271 so I didn't pick up the
changes in #137290. This accounts for them.
chrisparrinello pushed a commit to chrisparrinello/elasticsearch that referenced this pull request Nov 3, 2025
chrisparrinello pushed a commit to chrisparrinello/elasticsearch that referenced this pull request Nov 3, 2025
…7301)

I failed to merge in main before merging elastic#137271 so I didn't pick up the
changes in elastic#137290. This accounts for them.