Avoid changing tsid creation strategy for an index #135514
This prevents the tsid and routing from being calculated differently on the same index. Doing so would have the following consequences:
* De-duplication would not work. For example, when a client re-transmits a batch, the id (which is based on the tsid) will be different.
* When replaying a translog operation, the re-computed tsid and id will differ from the id stored in the translog. This would lead to a failure of the index operation.
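To make the de-duplication consequence concrete, here is a minimal sketch. It is not the Elasticsearch implementation: the class `TsidIdSketch`, both tsid strategies, and the id derivation are hypothetical stand-ins, chosen only to show that an id derived from the tsid changes as soon as the tsid strategy changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in Elasticsearch.
final class TsidIdSketch {

    static String sha256(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed strategy A: hash the sorted "name=value" pairs of all dimensions.
    static String tsidFromSortedPairs(Map<String, String> dimensions) {
        StringBuilder sb = new StringBuilder();
        new TreeMap<String, String>(dimensions)
            .forEach((k, v) -> sb.append(k).append('=').append(v).append(';'));
        return sha256(sb.toString());
    }

    // Assumed strategy B: hash only the dimension values (sorted by name).
    static String tsidFromValuesOnly(Map<String, String> dimensions) {
        StringBuilder sb = new StringBuilder();
        new TreeMap<String, String>(dimensions)
            .values()
            .forEach(v -> sb.append(v).append('|'));
        return sha256(sb.toString());
    }

    // Assumed id derivation: the id depends on the tsid and the timestamp.
    static String docId(String tsid, long timestampMillis) {
        return sha256(tsid + "@" + timestampMillis);
    }

    public static void main(String[] args) {
        Map<String, String> dims = Map.of("host", "web-1", "pod", "p-7");
        long ts = 1_700_000_000_000L;

        // Set.add returns false when the id is already present (de-duplicated).
        Set<String> indexedIds = new HashSet<>();
        boolean first = indexedIds.add(docId(tsidFromSortedPairs(dims), ts));
        boolean retrySameStrategy = indexedIds.add(docId(tsidFromSortedPairs(dims), ts));
        boolean retryOtherStrategy = indexedIds.add(docId(tsidFromValuesOnly(dims), ts));

        System.out.println("first accepted: " + first);
        System.out.println("retry, same strategy accepted: " + retrySameStrategy);
        System.out.println("retry, changed strategy accepted: " + retryOtherStrategy);
    }
}
```

With the same strategy, the retried document maps to the same id and is recognized as a duplicate. After a strategy change, the same document gets a new id and would be indexed a second time. The translog replay case fails for the same underlying reason: the re-computed id no longer matches the stored one.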
...ata-streams/src/main/java/org/elasticsearch/datastreams/DataStreamIndexSettingsProvider.java
I kinda like this. Let's try to be as restrictive as possible.
I like it better, too, but didn't make the change earlier due to the potential impact on backwards compatibility. Are you worried about that at all?
Not really. If anything, I'd think it's a best practice to avoid such changes (label => dimension) outside rollovers. What's more, we should only have dimensions and metrics at this point, no labels. Maybe worth documenting better.
Today, it's already prohibited to change a non-dimension field to a dimension field. However, there are two problematic scenarios where replaying the translog or re-submitting a batch can lead to duplicates or rejections:
(See #135402 (comment) for more details about these scenarios) It seems that the only way I'd feel comfortable with adding the While we may still need to discuss backwards compatibility implications before removing the feature flag for the new
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
I agree, let's proceed with this for |
…e' into index-dimensions-no-downgrade
That would be possible already by explicitly setting the |
Yeah let's add a separate boolean index setting; |
That's the reason I was thinking about making this a cluster-level setting rather than per data stream. |
We mostly use index settings for this purpose, to have fine-grained control. It's not expected to be used, and it's not that hard to update all TSDS in a cluster if need be. |
I'm good either way but wanted to note one thing
This would change it for all existing data streams. But new data streams would then still use the |
This could be seen as a breaking change, as operations on backing indices that were allowed before are no longer allowed after this change. However, this only applies to data streams, as index.dimensions will never be set on plain time_series indices. For data streams, you first change the template, then optionally the backing indices, and roll over if changing the backing indices doesn't work.