Commit 422ad0d

docs: add explanation about using explain to assess performance issues
Signed-off-by: Miguel Molina <[email protected]>
1 parent 9e415d1 commit 422ad0d

1 file changed

+52
-0
lines changed

docs/using-gitbase/optimize-queries.md

Lines changed: 52 additions & 0 deletions
@@ -6,6 +6,58 @@ There are two ways to optimize a gitbase query:
- Create an index for some parts.
- Making sure the joined tables are squashed.

## Assessing performance bottlenecks

To find out whether a query has a performance bottleneck, inspect its execution tree. The execution tree is also very helpful when reporting performance issues in gitbase.

You can get it with an `EXPLAIN` query, whose output is represented as a tree and shows how the query is actually evaluated:

```sql
EXPLAIN FORMAT=TREE <SQL QUERY TO EXPLAIN>
```

For example, the following query:

```sql
EXPLAIN FORMAT=TREE
SELECT * FROM refs
NATURAL JOIN ref_commits
WHERE ref_commits.history_index = 0
```

will output something like this:

```
+-----------------------------------------------------------------------------------------+
| plan                                                                                    |
+-----------------------------------------------------------------------------------------+
| Project(refs.repository_id, refs.ref_name, refs.commit_hash, ref_commits.history_index) |
| └─ SquashedTable(refs, ref_commits)                                                     |
|     ├─ Columns                                                                          |
|     │   ├─ Column(repository_id, TEXT, nullable=false)                                  |
|     │   ├─ Column(ref_name, TEXT, nullable=false)                                       |
|     │   ├─ Column(commit_hash, TEXT, nullable=false)                                    |
|     │   ├─ Column(repository_id, TEXT, nullable=false)                                  |
|     │   ├─ Column(commit_hash, TEXT, nullable=false)                                    |
|     │   ├─ Column(ref_name, TEXT, nullable=false)                                       |
|     │   └─ Column(history_index, INT64, nullable=false)                                 |
|     └─ Filters                                                                          |
|         ├─ refs.repository_id = ref_commits.repository_id                               |
|         ├─ refs.ref_name = ref_commits.ref_name                                         |
|         ├─ refs.commit_hash = ref_commits.commit_hash                                   |
|         └─ ref_commits.history_index = 0                                                |
+-----------------------------------------------------------------------------------------+
15 rows in set (0.00 sec)
```

#### Detecting performance issues in the query tree

Some performance issues might not be obvious, but a few stand out just by looking at the query tree.

- Joins not squashed. If you joined tables and see `Join` and `Table` nodes instead of a `SquashedTable` node, the joins were not successfully squashed. This is explained in more detail in the following sections of this document.
- Indexes not used. If you can't see the indexes in your table nodes, those indexes are not being used by the table. This is explained in more detail in the following sections of this document.

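The first check in the list above can be automated when you collect many plans. The following is a minimal sketch in Python, not part of gitbase: it scans the text of an `EXPLAIN FORMAT=TREE` result and reports rows whose node name is `Table` or ends in `Join`. The function name `unsquashed_nodes`, the `InnerJoin` label, and both sample plans are assumptions made for illustration; the exact node labels in a real plan may differ, so adjust the matching rule to what your gitbase version prints.

```python
import re

# First identifier on a plan row, e.g. "SquashedTable" in "SquashedTable(refs, ...)".
FIRST_IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def unsquashed_nodes(plan_text):
    """Return the names of plan rows that look like non-squashed joins.

    Assumption: a non-squashed join shows up as a node named exactly
    `Table` or a node whose name ends in `Join`, as described in the text.
    """
    found = []
    for line in plan_text.splitlines():
        line = line.strip()
        # Only rows of the form "| ... |" carry plan nodes; skip borders and footers.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the cell borders, then the tree-drawing glyphs and indentation.
        cell = line[1:-1].strip().lstrip("│├└─ ")
        match = FIRST_IDENTIFIER.match(cell)
        if match is None:
            continue
        name = match.group(0)
        if name == "Table" or name.endswith("Join"):
            found.append(name)
    return found


# Shortened version of the squashed plan shown earlier.
squashed_plan = """
| Project(refs.repository_id, refs.ref_name) |
| └─ SquashedTable(refs, ref_commits)        |
"""

# Hypothetical plan for a join that was not squashed (labels are assumed).
unsquashed_plan = """
| Project(refs.ref_name)                                      |
| └─ InnerJoin(refs.commit_hash = ref_commits.commit_hash)    |
|     ├─ Table(refs)                                          |
|     └─ Table(ref_commits)                                   |
"""

print(unsquashed_nodes(squashed_plan))    # []
print(unsquashed_nodes(unsquashed_plan))  # ['InnerJoin', 'Table', 'Table']
```

An empty result only means that no `Join` or `Table` rows were found. It says nothing about index usage, which still has to be checked by reading the table nodes.
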
## Indexes

The most obvious way to improve the performance of a query is to create an index for it. Since you can index multiple columns or a single arbitrary expression, this may be useful for some kinds of queries. For example, if you're querying by language, you may want to index that so there is no need to compute the language each time.
