-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Support different downsampling methods through ILM #136951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support different downsampling methods through ILM #136951
Conversation
🔍 Preview links for changed docs |
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
|
Hi @gmarouli, I've created a changelog YAML for you. |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
|
Pinging @elastic/es-data-management (Team:Data Management) |
nielsbauman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for working on this, Mary! Looking good to me, I just have a few pretty minor comments.
Who would have thought that a new functionality would require fewer changes in ILM than in DLM 😄
| `sampling_method` {applies_to}`stack: ga 9.3` | ||
| : (Optional, string) The sampling method that will be used to sample metrics; there are two methods available `aggregate` and | ||
| the `last_value`. Defaults to `aggregate`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we link to a page that explains the difference between the two methods? Or do we not have such a page yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not have a page about this yet. But this is a very good point and I will keep a note about that.
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/DownsampleAction.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/DownsampleAction.java
Outdated
Show resolved
Hide resolved
| return switch (between(0, 2)) { | ||
| case 0 -> null; | ||
| case 1 -> DownsampleConfig.SamplingMethod.AGGREGATE; | ||
| case 2 -> DownsampleConfig.SamplingMethod.LAST_VALUE; | ||
| default -> throw new IllegalArgumentException("Unknown randomisation branch"); | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small optional nit:
| return switch (between(0, 2)) { | |
| case 0 -> null; | |
| case 1 -> DownsampleConfig.SamplingMethod.AGGREGATE; | |
| case 2 -> DownsampleConfig.SamplingMethod.LAST_VALUE; | |
| default -> throw new IllegalArgumentException("Unknown randomisation branch"); | |
| }; | |
| return randomBoolean() ? null : randomFrom(DownsampleConfig.SamplingMethod.values()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used to have it like this but it favours null, I will change it though to something fair that still picks up the values dynamically. Nice that you challenged it.
I am considering something along the lines of:
static DownsampleConfig.SamplingMethod randomSamplingMethod() {
if (between(0, DownsampleConfig.SamplingMethod.values().length) == 0) {
return null;
} else {
return randomFrom(DownsampleConfig.SamplingMethod.values());
}
}
...plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/TimeseriesLifecycleTypeTests.java
Outdated
Show resolved
Hide resolved
...nsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/ILMDownsampleIT.java
Outdated
Show resolved
Hide resolved
...nsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/ILMDownsampleIT.java
Outdated
Show resolved
Hide resolved
Hahah, actually, if we did the same in DLM just adding it to the each round, it required almost nothing. But the users would need to pay the price. So, I still think DLM wins, it's just unfortunate that we did not see this when we were first coming up with the configuration. |
| assertEquals(policy, settings.get(LifecycleSettings.LIFECYCLE_NAME_SETTING.getKey())); | ||
| }, 120, TimeUnit.SECONDS); | ||
|
|
||
| // update the policy to now contain the downsample action in cold, whilst not existing in warm anymore (this will have our already | ||
| // downsampled index attempt to go through the downsample action again when in cold) | ||
|
|
||
| // Sometimes the sampling method might be different; this step should not fail considering that the index is already downsampled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this comment, but I think it might be hard to understand without the context of this PR.
Also, wouldn't it be valuable to have a similar test that explicitly changes the sampling method between the two phases? Now we sometimes implicitly do that through the default value (which is what your comment hints at), but I feel like that's an important edge case that is worth explicitly verifying. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are right, initially, I thought the sampling method should not matter, but as you say it's good to check that the test case captures this. I changed it to always change the sampling method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think we should also update this comment as Sometimes is no longer correct, right?
...qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/actions/ILMDownsampleIT.java
Outdated
Show resolved
Hide resolved
nielsbauman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks a lot for the work and the iterations :)
|
Thanks for the review @nielsbauman . Very helpful! |
Following #136813, we expose to ILM the new sampling method config in the downsampling API. This will allow users to configure the sampling method in their downsample action of their ILM policies. For example: