@alex-spies
Contributor

To understand query performance, we often peruse the output of _query requests run with "profile": true.

This is difficult when the query runs in a large cluster with many nodes and shards, or in the case of CCQ.

This adds an option to visualize a query profile using Chromium's/Chrome's built-in about:tracing or, for even better visuals and for querying the different drivers via SQL, Perfetto (cf. https://ui.perfetto.dev/).

To use it, save the JSON output of a query run with "profile": true to a file such as output.json, then invoke the following Gradle task:

./gradlew x-pack:plugin:esql:tools:parseProfile --args='~/output.json ~/parsed_profile.json' 

Either open about:tracing in Chromium/Chrome
image
Or head over to https://ui.perfetto.dev (build it locally if the profile might contain sensitive data):
image

Every slice is a driver; its color indicates the ratio of CPU time to total time.

  • In Perfetto, essentials like duration, CPU duration, timestamp and a few others can be queried via SQL. This allows, for example, querying for all drivers that spent more than 50% of their time waiting, and other fun things.
    image

  • Details about a driver, esp. which operators it ran, are available when clicking the driver's slice.
    image
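
For readers unfamiliar with the trace format, here is a minimal sketch of how one driver could be mapped to a slice as described above. It assumes the Trace Event Format's "complete" event (ph "X", with ts, dur and tdur in microseconds) and the trace viewer's reserved color names good, bad and terrible. The class name, method names, parameters and color thresholds are hypothetical and are not taken from this PR's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: maps one driver's timings to a Trace Event Format "complete" event. */
final class DriverSliceSketch {

    /** Fraction of the driver's total time that was spent on CPU, clamped to [0, 1]. */
    static double cpuRatio(long cpuNanos, long tookNanos) {
        if (tookNanos <= 0) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (double) cpuNanos / tookNanos));
    }

    /** Hypothetical thresholds; the returned strings are reserved trace viewer color names. */
    static String colorFor(double ratio) {
        if (ratio >= 0.75) {
            return "good";
        }
        if (ratio >= 0.25) {
            return "bad";
        }
        return "terrible";
    }

    /** Builds the event as an ordered map; pid and tid are assumed to be assigned by the caller. */
    static Map<String, Object> completeEvent(String name, int pid, int tid, long startMillis, long tookNanos, long cpuNanos) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("name", name);
        event.put("ph", "X");                  // "X" marks a complete event (a slice with a duration)
        event.put("ts", startMillis * 1000L);  // start timestamp in microseconds
        event.put("dur", tookNanos / 1000L);   // total duration in microseconds
        event.put("tdur", cpuNanos / 1000L);   // CPU (thread clock) duration in microseconds
        event.put("pid", pid);
        event.put("tid", tid);
        event.put("cname", colorFor(cpuRatio(cpuNanos, tookNanos)));
        return event;
    }

    public static void main(String[] args) {
        // A driver that took 4 ms in total, 1 ms of which was CPU time.
        System.out.println(completeEvent("data", 1, 7, 1_741_000_000_000L, 4_000_000L, 1_000_000L));
    }
}
```

Running main prints a single event map; a real converter would serialize a list of such events as JSON for the viewer to load.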

Invoke it like this:
./gradlew x-pack:plugin:esql:qa:testFixtures:parseProfile --args='~/elasticsearch/profile.json ~/elasticsearch/output.json'

The output can then be imported into, e.g., Perfetto or Chrome's trace viewer (about:tracing).
@alex-spies alex-spies added >non-issue auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.19.0 v9.1.0 labels Mar 7, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 7, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Path outputFileName = Path.of(args[1].replaceFirst("^~", System.getProperty("user.home"))).toAbsolutePath();

Map<String, Object> map;
try (InputStream input = Files.newInputStream(inputFileName)) {
Contributor

I believe you originally started with jq. What was the deciding reason for replacing it with Java?

Contributor Author

  1. It was too messy to properly emit metadata events for nodes, assign the correct tids and pids, etc.
  2. If we want to evolve either the visualization/profile parsing or the profile itself, it's much easier to do so in Java.
  3. I can nicely test this.
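
To illustrate the first point, here is a rough sketch of the bookkeeping involved: each node needs a stable pid plus a one-time metadata event that names it. It assumes the Trace Event Format's metadata events (ph "M", name "process_name", with the label under args.name). The class and its methods are hypothetical and only show the idea, not the code in this PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: assigns one pid per node and records a process_name metadata event for it. */
final class NodeMetadataSketch {
    private final Map<String, Integer> pidByNode = new LinkedHashMap<>();
    private final List<Map<String, Object>> metadataEvents = new ArrayList<>();

    /** Returns the node's pid, emitting its metadata event the first time the node is seen. */
    int pidFor(String nodeName) {
        Integer existing = pidByNode.get(nodeName);
        if (existing != null) {
            return existing;
        }
        int pid = pidByNode.size() + 1;
        pidByNode.put(nodeName, pid);

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("name", "process_name");           // tells the viewer how to label this pid
        event.put("ph", "M");                        // "M" marks a metadata event
        event.put("pid", pid);
        event.put("args", Map.of("name", nodeName));
        metadataEvents.add(event);
        return pid;
    }

    List<Map<String, Object>> metadataEvents() {
        return metadataEvents;
    }

    public static void main(String[] args) {
        NodeMetadataSketch sketch = new NodeMetadataSketch();
        sketch.pidFor("node-a");
        sketch.pidFor("node-b");
        sketch.pidFor("node-a"); // already known, so no second metadata event
        System.out.println(sketch.metadataEvents());
    }
}
```

Keeping this state in an object makes it easy to unit test, which is much harder to do with an equivalent jq pipeline.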

Map<String, Object> map;
try (InputStream input = Files.newInputStream(inputFileName)) {
logger.info("Starting to parse {}", inputFileName);
map = XContentHelper.convertToMap(JsonXContent.jsonXContent, input, true);
Contributor

I wonder if we should consider a model rather than a map?
Possibly something like:

    public record Response(Profile profile) {}
    public record Profile(List<Driver> drivers) {}
    public record Driver(
            @JsonProperty("task_description") String taskDescription,
            @JsonProperty("cluster_name") String cluster,
            @JsonProperty("node_name") String node,
            @JsonProperty("start_millis") long startMillis,
            @JsonProperty("stop_millis") long stopMillis,
            List<Operator> operators
    ) {}
    public record Operator(String operator) {}

Contributor Author

Thank you, with Jackson it's indeed much clearer. I now use this approach and have updated the tests accordingly.
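
As a rough illustration of what a typed model buys over raw map access, the sketch below converts an already parsed map into records by hand. It deliberately avoids Jackson so that it stays dependency-free; the class name, the fromMap helper, the wallMillis method and the sample values are hypothetical, and only the JSON field names are taken from the suggestion above.

```java
import java.util.List;
import java.util.Map;

/** Sketch only: a hand-written mapping from a parsed profile map to typed records. */
final class TypedProfileSketch {
    record Operator(String operator) {}

    record Driver(String taskDescription, String node, long startMillis, long stopMillis, List<Operator> operators) {
        /** Wall-clock duration derived from the typed fields; no casts needed at the call site. */
        long wallMillis() {
            return stopMillis - startMillis;
        }
    }

    /** All casts live in one place instead of being scattered through the conversion logic. */
    @SuppressWarnings("unchecked")
    static Driver fromMap(Map<String, Object> raw) {
        List<Map<String, Object>> ops = (List<Map<String, Object>>) raw.getOrDefault("operators", List.of());
        return new Driver(
            (String) raw.get("task_description"),
            (String) raw.get("node_name"),
            ((Number) raw.get("start_millis")).longValue(),
            ((Number) raw.get("stop_millis")).longValue(),
            ops.stream().map(o -> new Operator((String) o.get("operator"))).toList()
        );
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, Object> raw = Map.of(
            "task_description", "data",
            "node_name", "node-a",
            "start_millis", 1000,
            "stop_millis", 1250,
            "operators", List.of(Map.of("operator", "ExampleOperator"))
        );
        Driver driver = fromMap(raw);
        System.out.println(driver.node() + " ran " + driver.operators().size() + " operator(s) in " + driver.wallMillis() + " ms");
    }
}
```

With Jackson, the fromMap helper would presumably be replaced by annotated records like the ones suggested in the review, which removes the manual casts entirely.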

@alex-spies alex-spies added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Mar 12, 2025
@elasticsearchmachine elasticsearchmachine merged commit fc4d8d6 into elastic:main Mar 13, 2025
17 checks passed
@alex-spies alex-spies deleted the profiling_parser branch March 13, 2025 14:10
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 124361

@alex-spies
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

alex-spies added a commit to alex-spies/elasticsearch that referenced this pull request Mar 13, 2025
(cherry picked from commit fc4d8d6)

# Conflicts:
#	x-pack/plugin/esql/qa/server/single-node/src/javaRestTest/java/org/elasticsearch/xpack/esql/qa/single_node/RestEsqlIT.java
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 13, 2025
elasticsearchmachine pushed a commit that referenced this pull request Mar 13, 2025
* ESQL: Enable visualizing a query profile (#124361)

(cherry picked from commit fc4d8d6)

# Conflicts:
#	x-pack/plugin/esql/qa/server/single-node/src/javaRestTest/java/org/elasticsearch/xpack/esql/qa/single_node/RestEsqlIT.java

* Account for missing driver descr., node, cluster
Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport pending >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.0 v9.1.0

3 participants