@rockdaboot rockdaboot commented Jun 29, 2025

Background

To display statistically sound data, Universal Profiling queries need to fetch a large number of documents (aka profiling events). This can be more than 100k documents, even if only 20k are needed.

For this reason, the number of documents allowed in a single response is set to 150k. This in turn requires raising the cluster-wide setting search.max_buckets to 150k. The alternative, paginating the response, was not an option because it increased the query latency unacceptably.

The new (and still experimental) random_sampler aggregation cannot be used because every profiling event document has a weight (aka count).

Problem

On serverless, the cluster-wide setting search.max_buckets can no longer be changed; it defaults to 64k. Fetching the data in a paginated way is too slow (up to 15 sequential requests).

Solution

The profiling events were recently switched to nanosecond-precision timestamps.
This makes it very unlikely for an event to have a count value != 1.
So with ES 9.2, the count value is either dropped or always set to 1 (see also open-telemetry/opentelemetry-collector-contrib#40947).

With all profiling event documents having the same weight, we can leverage the random_sampler aggregation to reduce the number of documents to be fetched to a maximum of 20k. This allows using a paginated response (1 additional request/response round trip) without massively increasing the total query latency.
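As a rough illustration of the sampling idea (not the code of this or the follow-up PR), the sketch below derives a sampling probability from a 20k target and wraps a sub-aggregation in a random_sampler aggregation. The field name `Stacktrace.id`, the bucket name `by_stacktrace`, the use of a terms aggregation, and both helper functions are assumptions made for the example. The only API fact it relies on is that random_sampler takes a `probability` that must be below 0.5 or exactly 1.

```python
# Sketch only: the helper names, the field name and the terms aggregation
# are assumptions for illustration, not taken from the PR.

TARGET_EVENTS = 20_000  # maximum number of documents we want to fetch


def sampling_probability(total_events: int, target_events: int = TARGET_EVENTS) -> float:
    """Return a probability that samples roughly `target_events` documents."""
    if total_events <= target_events:
        return 1.0  # nothing to reduce, keep every document
    p = target_events / total_events
    # random_sampler accepts probabilities in (0, 0.5) or exactly 1,
    # so fall back to "no sampling" when the ratio is 0.5 or higher.
    return p if p < 0.5 else 1.0


def build_sampled_request(total_events: int, field: str = "Stacktrace.id") -> dict:
    """Build a search request body that aggregates over a random sample."""
    return {
        "size": 0,  # only aggregation results are needed
        "aggs": {
            "sample": {
                "random_sampler": {"probability": sampling_probability(total_events)},
                "aggs": {
                    # hypothetical sub-aggregation: one bucket per stack trace
                    "by_stacktrace": {"terms": {"field": field, "size": TARGET_EVENTS}}
                },
            }
        },
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_sampled_request(100_000), indent=2))
```

With 100k matching events the sketch asks for a probability of 0.2. With 30k events the ratio would be about 0.67, which the aggregation does not accept, so the sketch disables sampling instead.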

This PR is the first step: it uses the aggregated doc_count value instead of the aggregated count values. In a second PR, we'll switch to the random_sampler aggregation in combination with pagination.
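Why the two values are interchangeable can be shown with a small, self-contained sketch. It assumes a simplified event shape with hypothetical keys `stacktrace_id` and `count`. When every event has a count of 1, or no count at all, summing the counts per stack trace gives the same result as counting the documents.

```python
from collections import Counter

# Hypothetical, simplified profiling events; the key names are assumptions.
events = [
    {"stacktrace_id": "A", "count": 1},
    {"stacktrace_id": "B", "count": 1},
    {"stacktrace_id": "A", "count": 1},
    {"stacktrace_id": "C"},  # count dropped, treated as 1
    {"stacktrace_id": "A", "count": 1},
]


def weights_from_count(docs):
    """Old approach: sum the per-document count values per stack trace."""
    totals = Counter()
    for doc in docs:
        totals[doc["stacktrace_id"]] += doc.get("count", 1)
    return dict(totals)


def weights_from_doc_count(docs):
    """New approach: count the documents per stack trace (doc_count)."""
    return dict(Counter(doc["stacktrace_id"] for doc in docs))


if __name__ == "__main__":
    print(weights_from_count(events))
    print(weights_from_doc_count(events))
```

Both functions return the same mapping for this input. They would diverge only if some event carried a count other than 1.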

@rockdaboot rockdaboot self-assigned this Jun 29, 2025
@rockdaboot rockdaboot added >enhancement :UniversalProfiling/Application Elastic Universal Profiling REST APIs and infrastructure v9.2.0 labels Jun 29, 2025
@elasticsearchmachine elasticsearchmachine added Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jun 29, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

@rockdaboot rockdaboot merged commit c370c05 into elastic:main Jun 30, 2025
32 checks passed
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 3, 2025
* [Profiling] Ignore events count value

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <[email protected]>