Measure search load per index #122262
The code has been refactored based on Armin's suggestion to register a
The benchmark after running for a while is the following,
I'm not sure about all this history appearing. To work on this issue, I did the following
<method v="2" />
</configuration>
</component>
</component>
I tried to revert this, but the newline persists there.
It's IntelliJ's fault. Try removing the newline in vim/nano (with IntelliJ closed) or another editor, then commit the change and it should work 😄
 * @param tookInNanos the number of nanoseconds the query execution took
 */
default void onFailedQueryPhase(SearchContext searchContext) {}
default void onFailedQueryPhase(SearchContext searchContext, long tookInNanos) {}
We also need to track the execution time when a phase fails.
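A minimal sketch of why the failure callback needs the elapsed time: the caller measures around the phase and reports the duration on both the success and the failure path. The names `QueryListener`, `TotalTimeListener`, and `TimedPhase`, and the use of a plain index name instead of a `SearchContext`, are assumptions for illustration, not the Elasticsearch types.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified listener; the real interface takes a SearchContext.
interface QueryListener {
    default void onQueryPhase(String index, long tookInNanos) {}
    default void onFailedQueryPhase(String index, long tookInNanos) {}
}

// Accumulates execution time regardless of whether the phase succeeded.
final class TotalTimeListener implements QueryListener {
    final LongAdder totalNanos = new LongAdder();
    final LongAdder failures = new LongAdder();

    @Override
    public void onQueryPhase(String index, long tookInNanos) {
        totalNanos.add(tookInNanos);
    }

    @Override
    public void onFailedQueryPhase(String index, long tookInNanos) {
        failures.increment();
        totalNanos.add(tookInNanos); // failed work still consumed thread time
    }
}

final class TimedPhase {
    // Runs the phase and reports the elapsed time on success and on failure.
    static void run(String index, Runnable phase, QueryListener listener) {
        long start = System.nanoTime();
        try {
            phase.run();
        } catch (RuntimeException e) {
            listener.onFailedQueryPhase(index, System.nanoTime() - start);
            throw e;
        }
        listener.onQueryPhase(index, System.nanoTime() - start);
    }
}
```

Without the `tookInNanos` argument on the failure path, any time spent before the exception would be missing from the per-index total.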
 * @param indexName the name of the index
 * @return the EWMA of the execution time for the index
 */
public double getLoadEMWAPerIndex(String indexName) {
Whether we need to calculate and export the EWMA at all is still under consideration.
Maybe some more context here:
We are calculating load on a per-index basis; the loads are then collected and summed up (TODO) on the master node. With this information we will need to determine which of the indices were under the most load and act on that. The idea is to normalize the "global" index load and act on the normalized values.
That said, in our opinion the per-node EWMA is not really suitable for summing across nodes.
Hence this comment.
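The collect, sum, and normalize idea could look roughly like the sketch below. It assumes each node reports a map from index name to total execution time in nanoseconds, and it normalizes by dividing by the cluster-wide total; both the `IndexLoadAggregator` class and that choice of normalization are assumptions, since the thread does not specify them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the PR.
final class IndexLoadAggregator {
    // Sums the per-index totals reported by each node into one cluster-wide map.
    static Map<String, Long> sumAcrossNodes(List<Map<String, Long>> perNodeLoads) {
        Map<String, Long> global = new HashMap<>();
        for (Map<String, Long> nodeLoad : perNodeLoads) {
            nodeLoad.forEach((index, nanos) -> global.merge(index, nanos, Long::sum));
        }
        return global;
    }

    // Normalizes each index's load to its share of the cluster-wide total.
    static Map<String, Double> normalize(Map<String, Long> globalLoad) {
        long total = globalLoad.values().stream().mapToLong(Long::longValue).sum();
        Map<String, Double> shares = new HashMap<>();
        for (Map.Entry<String, Long> e : globalLoad.entrySet()) {
            shares.put(e.getKey(), total == 0 ? 0.0 : (double) e.getValue() / total);
        }
        return shares;
    }
}
```

Plain totals add up cleanly across nodes, which is the property a per-node moving average lacks.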
private static final Logger logger = LogManager.getLogger(ShardSearchPerIndexTimeTrackingMetrics.class);

private final ConcurrentHashMap<String, Tuple<LongAdder, ExponentiallyWeightedMovingAverage>> indexExecutionTime;
I'm sorry for saying it in such a straightforward manner, but do we really want to add more logic based on this class?
Especially on a per-shard basis, using the math in ExponentiallyWeightedMovingAverage seems questionable.
We calculate newValue = alpha * lastValue + (1 - alpha) * currentValue. In a large number of use cases you may see the fetch and query times be an order of magnitude apart. So, assuming the query always matches, won't we essentially flap between two values constantly for a shard?
Why not just use the existing metrics we have in org.elasticsearch.index.search.stats.ShardSearchStats? EWMA makes no sense here. If anything, isn't total query time and its derivative what we care about?
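To make the flapping concern concrete, here is a small sketch that applies the update rule exactly as written in the comment to alternating query and fetch durations. The class `EwmaFlapDemo`, the sample durations, and the alpha value are assumptions; the real ExponentiallyWeightedMovingAverage class may weight the terms differently.

```java
// Hypothetical demo class; not part of Elasticsearch.
final class EwmaFlapDemo {
    // Update rule as written in the comment above:
    // newValue = alpha * lastValue + (1 - alpha) * currentValue
    static double update(double alpha, double lastValue, double currentValue) {
        return alpha * lastValue + (1 - alpha) * currentValue;
    }

    // Feeds alternating query/fetch durations and returns the EWMA after each sample.
    static double[] alternate(double alpha, double queryMillis, double fetchMillis, int rounds) {
        double[] out = new double[rounds * 2];
        double ewma = queryMillis; // seed with the first sample
        for (int i = 0; i < rounds; i++) {
            ewma = update(alpha, ewma, queryMillis);
            out[2 * i] = ewma;
            ewma = update(alpha, ewma, fetchMillis);
            out[2 * i + 1] = ewma;
        }
        return out;
    }

    // The alternative raised in the comment: a rate derived from two readings
    // of a monotonically growing total.
    static double ratePerSecond(long totalNanosBefore, long totalNanosAfter, double intervalSeconds) {
        return (totalNanosAfter - totalNanosBefore) / intervalSeconds;
    }
}
```

With `alternate(0.3, 100.0, 10.0, 50)` the average settles into swinging between roughly 79 and 31 rather than converging on one value, whereas the difference of two total-time readings over an interval gives a single, stable rate.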
Good point. We left it in because we were not sure about it, and I asked Dimi to leave a comment on the PR about this.
All right, let's check how it can be adapted into the ShardSearchStats
I don't think we should do this given that we have ShardSearchStats already. If any metric is missing, we should add it there, shouldn't we?


With this PR we introduce a way to track the EWMA and the total time spent executing tasks for each index in the search thread pool.
We extended TaskExecutionTimeTrackingEsThreadPoolExecutor, which already has logic to track the EWMA and the time spent executing tasks in the search thread pool globally (not per index), so that it tracks on a per-index basis in TaskExecutionTimeTrackingPerIndexEsThreadPoolExecutor. We decided to extend this class in order not to duplicate the time-tracking logic.
We take care of new indices being tracked with the computeIfAbsent call inside the trackExecutionTime method, and take care of removing deleted indices with the cluster state listener SearchIndexTimeTrackingCleanupService.
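The register-on-first-use and cleanup pattern described here can be sketched as follows. `PerIndexTimeTracker` and its methods are hypothetical stand-ins: the sketch tracks only a total per index (no EWMA), and `retainOnly` takes the set of live index names directly instead of reacting to cluster state updates.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the per-index tracking described in the PR.
final class PerIndexTimeTracker {
    private final ConcurrentHashMap<String, LongAdder> totalNanosByIndex = new ConcurrentHashMap<>();

    // Lazily registers the index on first use, then accumulates its execution time.
    void trackExecutionTime(String indexName, long tookInNanos) {
        totalNanosByIndex.computeIfAbsent(indexName, k -> new LongAdder()).add(tookInNanos);
    }

    // Returns the accumulated time for an index, or 0 if it is not tracked.
    long totalNanos(String indexName) {
        LongAdder adder = totalNanosByIndex.get(indexName);
        return adder == null ? 0L : adder.sum();
    }

    // Stand-in for the cluster state listener: drops entries for indices that no longer exist.
    void retainOnly(Set<String> liveIndices) {
        totalNanosByIndex.keySet().retainAll(liveIndices);
    }

    int trackedIndexCount() {
        return totalNanosByIndex.size();
    }
}
```

Without the cleanup step the map would grow with every index ever searched on the node, which is why deletion has to be handled somewhere.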