Contributor

@rubvs rubvs commented Mar 26, 2025

Closes https://github.com/elastic/apm-managed-service/issues/1541

Enable the failure store for APM data streams

Member

@jbaiera jbaiera left a comment

LGTM, though this will likely need to wait until after the feature is out from behind the feature flag to be merged.

Member

@jbaiera jbaiera left a comment

This won't handle APM data streams that already exist when it is applied. Is that something that needs to be considered as well for this rollout?

Contributor Author

rubvs commented Apr 2, 2025

This won't handle APM data streams that already exist when it is applied. Is that something that needs to be considered as well for this rollout?

@1pkg and @simitt can you please give your input on this?

Contributor

simitt commented Apr 15, 2025

@jbaiera is there a different way to also apply it to existing data streams?

Member

jbaiera commented Apr 16, 2025

is there a different way to also apply it to existing data streams?

@simitt This is what the data_streams.failure_store.enabled cluster setting is for. It takes a list of index wildcard patterns as a value, and any data streams that match the patterns will have failure store enabled, as long as those data streams do not have failure store explicitly enabled/disabled. It's effectively selecting the data stream names that will have the feature enabled by default if not currently specified.
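To make the precedence described above concrete, here is a minimal Python sketch: an explicit per-data-stream value wins, and otherwise the failure store is on only if the name matches one of the configured patterns. The function names, the example patterns, and the data stream names are hypothetical, and the matching is a simplified model (only `*` is treated as a wildcard), not Elasticsearch's actual implementation.

```python
import re
from typing import Optional, Sequence


def matches_pattern(name: str, pattern: str) -> bool:
    """Simplified wildcard match: '*' matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def failure_store_effective(
    name: str,
    explicit: Optional[bool],
    patterns: Sequence[str],
) -> bool:
    """Model of the rule in the comment above (an assumption, not ES code).

    explicit: the value set directly on the data stream, or None if unset.
    patterns: the wildcard patterns from the cluster setting.
    """
    if explicit is not None:
        # An explicit enable/disable on the data stream always wins.
        return explicit
    # Otherwise the cluster setting supplies the default by name.
    return any(matches_pattern(name, p) for p in patterns)


# Hypothetical patterns for illustration only.
APM_PATTERNS = ["traces-apm*", "metrics-apm*", "logs-apm*"]

if __name__ == "__main__":
    for stream, explicit in [
        ("traces-apm-default", None),
        ("traces-apm-default", False),
        ("logs-nginx-default", None),
    ]:
        print(stream, explicit, failure_store_effective(stream, explicit, APM_PATTERNS))
```

The point of the sketch is that the cluster setting only fills in a default; it never overrides a data stream that has already been configured one way or the other.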

We'll need to use both the template and the cluster setting as part of the rollout: the template enables the failure store on all data streams created going forward, and the cluster setting picks up any existing data streams from before the template update.
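As a rough illustration of the two pieces, the Python sketch below builds the request bodies one might send: a template fragment for new data streams and a cluster settings update for existing ones. The `data_stream_options.failure_store.enabled` template field, the comma-separated value format, the use of `persistent`, and the example patterns are assumptions on my part and are not taken from this PR; check them against the Elasticsearch docs for the target version before relying on them.

```python
import json
from typing import Dict, Sequence

# Hypothetical patterns for illustration only.
APM_PATTERNS = ["traces-apm*", "metrics-apm*", "logs-apm*"]


def template_fragment() -> Dict:
    """Fragment assumed to enable the failure store in an index template.

    Only affects data streams created after the template is updated.
    """
    return {
        "template": {
            "data_stream_options": {"failure_store": {"enabled": True}}
        }
    }


def cluster_settings_body(patterns: Sequence[str]) -> Dict:
    """Body for a cluster settings update (e.g. PUT _cluster/settings).

    Assumes the setting accepts a comma-separated list of wildcard patterns.
    """
    return {
        "persistent": {
            "data_streams.failure_store.enabled": ",".join(patterns)
        }
    }


if __name__ == "__main__":
    print(json.dumps(template_fragment(), indent=2))
    print(json.dumps(cluster_settings_body(APM_PATTERNS), indent=2))
```

Keeping the two bodies separate mirrors the rollout described above: the template change ships with the plugin, while the cluster setting is applied per cluster.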

@rubvs rubvs marked this pull request as ready for review June 12, 2025 16:17
@rubvs rubvs requested a review from a team as a code owner June 12, 2025 16:17
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jun 12, 2025
Member

@carsonip carsonip left a comment

does changing a setting require a plugin version bump in

?

@rubvs rubvs added the v8.19.0 label Jun 12, 2025
@rubvs rubvs assigned rubvs and unassigned rubvs Jun 12, 2025
@rubvs rubvs added >enhancement Team:obs-ds-intake-services Meta label for Observability Intake Services team labels Jun 12, 2025
@elasticsearchmachine elasticsearchmachine removed the Team:obs-ds-intake-services Meta label for Observability Intake Services team label Jun 12, 2025
@rubvs rubvs added Team:obs-ds-intake-services Meta label for Observability Intake Services team and removed needs:triage Requires assignment of a team area label labels Jun 12, 2025
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:obs-ds-intake-services Meta label for Observability Intake Services team labels Jun 12, 2025
@rubvs rubvs added :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team labels Jun 12, 2025
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Jun 12, 2025
Collaborator

Hi @rubvs, I've created a changelog YAML for you.

@rubvs rubvs added the auto-backport Automatically create backport pull requests when merged label Jun 12, 2025
Member

@dakrone dakrone left a comment

LGTM

Contributor

simitt commented Jun 18, 2025

is there a different way to also apply it to existing data streams?

@simitt This is what the data_streams.failure_store.enabled cluster setting is for. It takes a list of index wildcard patterns as a value, and any data streams that match the patterns will have failure store enabled, as long as those data streams do not have failure store explicitly enabled/disabled. It's effectively selecting the data stream names that will have the feature enabled by default if not currently specified.

@jbaiera should the data_streams.failure_store.enabled cluster setting be changed as part of this PR, then, or is that done elsewhere?

@rubvs @1pkg please clarify with PMs (@akhileshpok @mlunadia @LucaWintergerst) that enabling the failure store for all APM data streams by default is OK. In the past we have focused on MIS/serverless, but as far as I can see this PR would also enable it for APM Server, so we need to coordinate.

Contributor Author

rubvs commented Jul 2, 2025

@rubvs rubvs removed the v8.19.0 label Jul 4, 2025
@rubvs rubvs closed this by deleting the head repository Jul 11, 2025
Contributor Author

rubvs commented Jul 15, 2025

Superseded by #131296

Labels

auto-backport (Automatically create backport pull requests when merged)
:Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
>non-issue
Team:Data Management (Meta label for data/management team)
v9.2.0

8 participants