-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[Downsampling++] Add time series telemetry in xpack usage #134214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Downsampling++] Add time series telemetry in xpack usage #134214
Conversation
|
Hi @gmarouli, I've created a changelog YAML for you. |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
| /* | ||
| * We now add a number of simulated data streams to the cluster state. We mix different combinations of: | ||
| * - time series and standard data streams & backing indices | ||
| * - lifecycle with or without downsampling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * - lifecycle with or without downsampling | |
| * - DLM with or without downsampling |
| var downsamplingConfiguredBy = randomFrom(DownsampledBy.values()); | ||
| boolean isDownsampled = downsamplingConfiguredBy != DownsampledBy.NONE && isTimeSeriesDataStream; | ||
| // An index/data stream can have both ILM & DLM configured; by default, ILM "wins" | ||
| boolean hasLifecycle = usually() || (isDownsampled && downsamplingConfiguredBy == DownsampledBy.DLM); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: usually is !rarely so it's almost always.. Maybe use randomDouble() < 0.8 or so, to make it more concrete?
| tsIndexCount, | ||
| ilmStats.getDownsamplingStats(), | ||
| ilmStats.getIlmPolicyStats(), | ||
| dlmStats.getDownsamplingStats(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why dlmStats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean why I picked this variable name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this call is for ilmStats, surprised to see dlmStats for this arg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. There are two different constructors one that accepts stats for both ILM and DLM and one that only accepts DLM for the serverless use case, this was syntactic sugar to make more explicit that ILM stats would be null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a comment for clarity
| builder.field("phases_in_use", phasesUsedInDownsampling); | ||
| builder.endObject(); | ||
| } | ||
| builder.startObject("dlm"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also check for null dlmDownsamplingStats here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't hurt, I will add it.
| } | ||
| } | ||
|
|
||
| public record DownsamplingFeatureStats(long dataStreamsCount, long indexCount, long minRounds, double averageRounds, long maxRounds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you planning to add stats on dataset size and reduction, later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stats on dataset size and reduction require a lot more infrastructure that we do not have currently. If we do move ahead with these plans then yes, I think we should try to expose them here too.
kkrik-es
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and clean. Consider asking Martijn to take a look too.
martijnvg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
In this PR we introduce telemetry for time series trying to answer the following questions:
Fixes: #133953