Commit 11be6d5

make updates to existing language
1 parent f373522 commit 11be6d5

5 files changed, 19 additions and 7 deletions

docs/best-practices/_snippets/_avoid_optimize_final.md

Lines changed: 7 additions & 1 deletion
@@ -16,7 +16,13 @@ While it's tempting to manually trigger this merge using:
 OPTIMIZE TABLE <table> FINAL;
 ```
 
-**you should avoid this operation in most cases** as it initiates resource intensive operations which may impact cluster performance.
+**you should avoid the `OPTIMIZE FINAL` operation in most cases** as it initiates
+resource intensive operations which may impact cluster performance.
+
+:::note OPTIMIZE FINAL vs FINAL
+`OPTIMIZE FINAL` is not the same as `FINAL`, which is sometimes necessary to use
+for results without duplicates, such as with the `ReplacingMergeTree`.
+:::
 
 ## Why avoid? {#why-avoid}
 
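
For illustration, the distinction drawn in the note added to this file might look like the following in practice. This is a minimal sketch and not part of the commit: the table `user_events` and its columns are hypothetical.

```sql
-- Hypothetical table, used only to illustrate the note in this hunk.
CREATE TABLE user_events
(
    user_id    UInt64,
    status     String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

-- Two versions of the same key, inserted separately so they land in different parts.
INSERT INTO user_events VALUES (1, 'pending', '2024-01-01 00:00:00');
INSERT INTO user_events VALUES (1, 'active',  '2024-01-02 00:00:00');

-- FINAL as a query modifier: rows are deduplicated at read time for this query only.
SELECT * FROM user_events FINAL;

-- OPTIMIZE ... FINAL as a statement: forces a merge of the table's parts on disk.
-- This is the operation the documentation advises avoiding in most cases.
OPTIMIZE TABLE user_events FINAL;
```

The `SELECT ... FINAL` form affects only the result of that query, whereas `OPTIMIZE TABLE ... FINAL` rewrites data on disk, which is where the resource cost described in the changed paragraph comes from.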

docs/cloud/bestpractices/index.md

Lines changed: 1 addition & 1 deletion
@@ -27,5 +27,5 @@ These are in addition to the standard best practices which apply to all deployme
 | [Selecting an Insert Strategy](/best-practices/selecting-an-insert-strategy) | Strategies for efficient data insertion in ClickHouse. |
 | [Data Skipping Indices](/best-practices/use-data-skipping-indices-where-appropriate) | When to apply data skipping indices for performance gains. |
 | [Avoid Mutations](/best-practices/avoid-mutations) | Reasons to avoid mutations and how to design without them. |
-| [Avoid OPTIMIZE FINAL](/best-practices/avoid-optimize-final) | Why `OPTIMIZE FINAL` can be costly and how to work around it. |
+| [Avoid `OPTIMIZE FINAL`](/best-practices/avoid-optimize-final) | Why `OPTIMIZE FINAL` can be costly and how to work around it. |
 | [Use JSON where appropriate](/best-practices/use-json-where-appropriate) | Considerations for using JSON columns in ClickHouse. |

docs/guides/best-practices/index.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -11,8 +11,8 @@ This section contains tips and best practices for improving performance with Cli
1111
We recommend users read [Core Concepts](/parts) as a precursor to this section,
1212
which covers the main concepts required to improve performance.
1313

14-
| Topic | Description |
15-
|---------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
14+
| Topic | Description |
15+
|---------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1616
| [Query Optimization Guide](/optimize/query-optimization) | A good place to start for query optimization, this simple guide describes common scenarios of how to use different performance and optimization techniques to improve query performance. |
1717
| [Primary Indexes Advanced Guide](/guides/best-practices/sparse-primary-indexes) | A deep dive into ClickHouse indexing including how it differs from other DB systems, how ClickHouse builds and uses a table's spare primary index and what some of the best practices are for indexing in ClickHouse. |
1818
| [Query Parallelism](/optimize/query-parallelism) | Explains how ClickHouse parallelizes query execution using processing lanes and the max_threads setting. Covers how data is distributed across lanes, how max_threads is applied, when it isn't fully used, and how to inspect execution with tools like EXPLAIN and trace logs. |
@@ -23,7 +23,7 @@ which covers the main concepts required to improve performance.
2323
| [Asynchronous Inserts](/optimize/asynchronous-inserts) | Focuses on ClickHouse's asynchronous inserts feature. It likely explains how asynchronous inserts work (batching data on the server for efficient insertion) and their benefits (improved performance by offloading insert processing). It might also cover enabling asynchronous inserts and considerations for using them effectively in your ClickHouse environment. |
2424
| [Avoid Mutations](/optimize/avoid-mutations) | Discusses the importance of avoiding mutations (updates and deletes) in ClickHouse. It recommends using append-only inserts for optimal performance and suggests alternative approaches for handling data changes. |
2525
| [Avoid nullable columns](/optimize/avoid-nullable-columns) | Discusses why you may want to avoid nullable columns to save space and increase performance. Demonstrates how to set a default value for a column. |
26-
| [Avoid Optimize Final](/optimize/avoidoptimizefinal) | Explains how the `OPTIMIZE TABLE ... FINAL` query is resource-intensive and suggests alternative approaches to optimize ClickHouse performance. |
26+
| [Avoid `OPTIMIZE FINAL`](/optimize/avoidoptimizefinal) | Explains how the `OPTIMIZE TABLE ... FINAL` query is resource-intensive and suggests alternative approaches to optimize ClickHouse performance. |
2727
| [Analyzer](/operations/analyzer) | Looks at the ClickHouse Analyzer, a tool for analyzing and optimizing queries. Discusses how the Analyzer works, its benefits (e.g., identifying performance bottlenecks), and how to use it to improve your ClickHouse queries' efficiency. |
2828
| [Query Profiling](/operations/optimizing-performance/sampling-query-profiler) | Explains ClickHouse's Sampling Query Profiler, a tool that helps analyze query execution. |
2929
| [Query Cache](/operations/query-cache) | Details ClickHouse's Query Cache, a feature that aims to improve performance by caching the results of frequently executed `SELECT` queries. |

docs/guides/developer/deduplication.md

Lines changed: 3 additions & 1 deletion
@@ -105,7 +105,9 @@ FINAL
 The result only has 2 rows, and the last row inserted is the row that gets returned.
 
 :::note
-Using `FINAL` works OK if you have a small amount of data. If you are dealing with a large amount of data, using `FINAL` is probably not the best option. Let's discuss a better option for finding the latest value of a column...
+Using `FINAL` works okay if you have a small amount of data. If you are dealing with a large amount of data,
+using `FINAL` is probably not the best option. Let's discuss a better option for
+finding the latest value of a column.
 :::
 
 ### Avoiding FINAL {#avoiding-final}

docs/guides/developer/replacing-merge-tree.md

Lines changed: 5 additions & 1 deletion
@@ -220,7 +220,11 @@ Peak memory usage: 8.14 MiB.
 
 ## FINAL performance {#final-performance}
 
-The `FINAL` operator will have a performance overhead on queries despite ongoing improvements. This will be most appreciable when queries are not filtering on primary key columns, causing more data to be read and increasing the deduplication overhead. If users filter on key columns using a `WHERE` condition, the data loaded and passed for deduplication will be reduced.
+The `FINAL` operator does have a small performance overhead on queries.
+This will be most noticeable when queries are not filtering on primary key columns,
+causing more data to be read and increasing the deduplication overhead. If users
+filter on key columns using a `WHERE` condition, the data loaded and passed for
+deduplication will be reduced.
 
 If the `WHERE` condition does not use a key column, ClickHouse does not currently utilize the `PREWHERE` optimization when using `FINAL`. This optimization aims to reduce the rows read for non-filtered columns. Examples of emulating this `PREWHERE` and thus potentially improving performance can be found [here](https://clickhouse.com/blog/clickhouse-postgresql-change-data-capture-cdc-part-1#final-performance).
 
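
As a rough sketch of the behavior described in the changed paragraph, the two queries below contrast a `FINAL` query that filters on a key column with one that does not. The table `orders_latest` and its columns are hypothetical and do not appear in the commit or the guide. The comments restate the documentation's claim and are not measured results.

```sql
-- Hypothetical table; `order_id` is the sorting key and therefore the deduplication key.
CREATE TABLE orders_latest
(
    order_id   UInt64,
    state      String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY order_id;

-- Filter on a key column: per the documentation, less data is loaded
-- and passed to deduplication.
SELECT *
FROM orders_latest FINAL
WHERE order_id = 42;

-- Filter on a non-key column: more data is read before deduplication,
-- so the FINAL overhead should be more noticeable here.
SELECT *
FROM orders_latest FINAL
WHERE state = 'shipped';
```

Comparing the rows-read figures that the client reports for each query on a populated table is one way to check how much the key filter helps for a given dataset.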
