
Commit 1b1ab57

add KB article explaining difference between optimize final and final
1 parent 4a78d64 commit 1b1ab57

1 file changed: 37 additions & 340 deletions
@@ -1,357 +1,54 @@
---
-date: 2023-10-25
-title: About Quotas and Query complexity
-tags: ['Managing Cloud']
-keywords: ['Quotas', 'Query Complexity']
-description: 'Quotas and Query Complexity are powerful ways to limit and restrict what users can do in ClickHouse. This KB article shows examples on how to apply these two different approaches.'
+date: 2025-07-20
+title: What is the difference between OPTIMIZE FINAL and FINAL?
+tags: ['Core Data Concepts']
+keywords: ['OPTIMIZE FINAL', 'FINAL']
+description: 'Discusses the differences between OPTIMIZE FINAL and FINAL, and when to use and avoid them.'
---

{frontMatter.description}
{/* truncate */}

-## About Quotas and Query complexity {#about-quotas-and-query-complexity}
+# What is the difference between `OPTIMIZE FINAL` and `FINAL`?

-[Quotas](https://clickhouse.com/docs/operations/quotas) and [query complexity](https://clickhouse.com/docs/operations/settings/query-complexity) are powerful ways to limit and restrict what users can do in ClickHouse.
+`OPTIMIZE FINAL` is a command that physically and permanently reorganizes data on
+disk. It merges the data parts of a `MergeTree`-family table and, for engines that
+deduplicate on merge (such as `ReplacingMergeTree`), removes duplicate rows from
+storage in the process.
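
For illustration, a minimal sketch of the command (the table name `events` is hypothetical):

```sql
-- Rewrites the table on disk by merging all of its data parts
-- (down to one part per partition). A one-off, potentially expensive operation.
OPTIMIZE TABLE events FINAL;

-- Optionally remove fully identical rows while merging.
OPTIMIZE TABLE events FINAL DEDUPLICATE;
```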

-Quotas apply restrictions within the context of a time interval, while query complexity applies regardless of time intervals.
+`FINAL` is a **query-time** modifier that returns deduplicated results without
+changing the data stored on disk. It applies the table engine's merge logic at
+read time, so its effect is temporary and limited to the current query's result.
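
For example, a minimal sketch of the modifier, assuming `events` is a table whose engine supports `FINAL` (for example `ReplacingMergeTree`):

```sql
-- Merge logic is applied while reading, so the result is deduplicated,
-- but nothing on disk changes.
SELECT *
FROM events FINAL;
```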

-This KB article shows examples of how to apply these two different approaches.
+Users are often advised to avoid `OPTIMIZE FINAL` because of its significant
+performance overhead; however, the two should not be confused. It is often necessary
+to use `FINAL` to get results without duplicates, especially with table engines
+such as `ReplacingMergeTree`, whose tables may still contain duplicate rows that
+have not yet been replaced by the eventual background merge process.
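
A small, self-contained sketch of that situation (the table name and values are hypothetical):

```sql
CREATE TABLE kv
(
    key UInt32,
    value String
)
ENGINE = ReplacingMergeTree
ORDER BY key;

-- Two separate inserts create two parts containing the same key.
INSERT INTO kv VALUES (1, 'first');
INSERT INTO kv VALUES (1, 'second');

-- May return both rows until a background merge has replaced the older one.
SELECT * FROM kv;

-- Applies the replacing logic at read time and returns a single row for key 1.
SELECT * FROM kv FINAL;
```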

-## The sample data {#the-sample-data}
+The table below summarises the key differences:

-We refer to this simple sample table for the purpose of these examples:
+| Aspect             | `OPTIMIZE FINAL`                     | `FINAL`                                             |
+|--------------------|--------------------------------------|-----------------------------------------------------|
+| Type               | Command                              | Query modifier                                      |
+| Effect             | Permanent storage optimization       | Temporary query-time deduplication                  |
+| Performance impact | High cost once, then faster queries  | Lower individual cost, but repeated for each query  |
+| Data modification  | Yes - physically changes storage     | No - read-only operation                            |
+| Use case           | Periodic maintenance/optimization    | Real-time deduplicated queries                      |

-```sql
-clickhouse-cloud :) CREATE TABLE default.test_table (name String, age UInt8) ENGINE=MergeTree ORDER BY tuple();
+## When to use each {#when-to-use-each}

--- CREATE TABLE default.test_table
--- (
---   `name` String,
---   `age` UInt8
--- )
--- ENGINE = MergeTree
--- ORDER BY tuple()
+Use `OPTIMIZE FINAL` when:

--- Query id: 4fd405db-a96e-4004-b1f6-e7f87def05d7
+- You want to permanently improve query performance
+- You can afford the one-time optimization cost
+- You're doing periodic table maintenance (see the sketch after this list)
+- You want to physically clean up duplicate data
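
For the periodic-maintenance case above, a sketch assuming a hypothetical table `events` partitioned by `toYYYYMM(date)`; restricting the command to a single partition limits how much data is rewritten:

```sql
-- Merge only the parts belonging to the July 2025 partition,
-- rather than rewriting the whole table.
OPTIMIZE TABLE events PARTITION 202507 FINAL;
```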

--- Ok.
+Use `FINAL` when:

--- 0 rows in set. Elapsed: 0.313 sec.

-clickhouse-cloud :) INSERT INTO default.test_table SELECT * FROM generateRandom('name String, age UInt8',1,1) LIMIT 100;

--- INSERT INTO default.test_table SELECT *
--- FROM generateRandom('name String, age UInt8', 1, 1)
--- LIMIT 100

--- Query id: 6eccfdc6-d98c-4377-ae25-f18deec6c807

--- Ok.

--- 0 rows in set. Elapsed: 0.055 sec.

-clickhouse-cloud :) SELECT * FROM default.test_table LIMIT 5

--- SELECT *
--- FROM default.test_table
--- LIMIT 5

--- Query id: 9fa58419-fb57-4260-886a-ccb836449f58

--- ┌─name─┬─age─┐
--- │      │ 200 │
--- │ 4    │  72 │
--- │ +    │ 127 │
--- │      │ 144 │
--- │ ]    │  60 │
--- └──────┴─────┘

--- 5 rows in set. Elapsed: 0.003 sec.
-```

-## Using Quotas {#using-quotas}

-In this example we create a role to which we'll apply a quota that allows only 10 result rows to be retrieved in each 10-second interval:

-```sql
-# AS the privileged user

-# create a user
-clickhouse-cloud :) CREATE USER user_with_quota IDENTIFIED WITH sha256_password BY 'Dr6P1S8SGaQ@u!BUAnv';

--- CREATE USER user_with_quota IDENTIFIED WITH sha256_hash BY '2444E98ADA7433FC12F55C467D3564BF87F47B1A996E70D77496A2F1E42BAD73' SALT '129F92F8AB4AB6E56A01AA826D10D1239F14148606E197EB19D7612F8AF8BC52'

--- Query id: 542a4013-e34c-4776-b374-962fcfd2575a

--- Ok.

--- 0 rows in set. Elapsed: 0.097 sec.

-# create a role to which quotas will be applied
-clickhouse-cloud :) CREATE ROLE role_with_quota

--- CREATE ROLE role_with_quota

--- Query id: 133a843b-8619-4642-84d9-9c232539b6a0

--- Ok.

--- 0 rows in set. Elapsed: 0.096 sec.

--- grant select privileges
-clickhouse-cloud :) GRANT SELECT ON default.* TO role_with_quota;

--- GRANT SELECT ON default.* TO role_with_quota

--- Query id: 1b0e295e-597d-477f-8847-13411157fd1c

--- Ok.

--- 0 rows in set. Elapsed: 0.100 sec.

--- grant role to the user
-clickhouse-cloud :) GRANT role_with_quota TO user_with_quota

--- GRANT role_with_quota TO user_with_quota

--- Query id: 0e19ff50-8990-4c17-8f91-5c8ce4142bdd

--- Ok.

--- 0 rows in set. Elapsed: 0.099 sec.

--- create a quota that allows max 10 result rows in each 10-second interval and apply that to the role
-clickhouse-cloud :) CREATE QUOTA quota_max_10_result_rows_per_10_seconds FOR INTERVAL 10 second MAX result_rows = 10 TO role_with_quota

--- CREATE QUOTA quota_max_10_result_rows_per_10_seconds FOR INTERVAL 10 second MAX result_rows = 10 TO role_with_quota

--- 0 rows in set. Elapsed: 23.427 sec.

--- Query id: fe4d2038-2d35-415d-89ec-9eaaa2533fcd
-```

-Now login as the user `user_with_quota`:

-```sql
--- login as the user where quota is applied through the role
-clickhouse-cloud :) SELECT user()

--- SELECT user()

--- Query id: 56ebd28d-0d36-4caf-9cef-c3e51d9f0b9d

--- ┌─currentUser()───┐
--- │ user_with_quota │
--- └─────────────────┘

--- 1 row in set. Elapsed: 0.002 sec.

--- list grants
-clickhouse-cloud :) SHOW GRANTS

--- SHOW GRANTS

--- Query id: cc78bada-28f4-4862-9fdf-7e68aae6fd80

--- ┌─GRANTS───────────────────────────────────┐
--- │ GRANT role_with_quota TO user_with_quota │
--- └──────────────────────────────────────────┘

--- 1 row in set. Elapsed: 0.001 sec.

--- check the time
-clickhouse-cloud :) select now()

--- SELECT now()

--- Query id: bbbd54a8-6c2f-4d3b-982a-03d7bd143aa9

--- ┌───────────────now()─┐
--- │ 2023-10-25 14:37:38 │
--- └─────────────────────┘

--- 1 row in set. Elapsed: 0.001 sec.

--- query ten rows
-clickhouse-cloud :) SELECT * FROM test_table LIMIT 10

--- SELECT *
--- FROM test_table
--- LIMIT 10

--- Query id: 20f1c02f-c938-4d06-851d-824c82693eb9

--- ┌─name─┬─age─┐
--- │      │ 200 │
--- │ 4    │  72 │
--- │ +    │ 127 │
--- │      │ 144 │
--- │ ]    │  60 │
--- │      │ 137 │
--- │      │ 176 │
--- │      │ 147 │
--- │      │ 107 │
--- │ Q    │ 128 │
--- └──────┴─────┘

--- 10 rows in set. Elapsed: 0.002 sec.

--- attempt to get another row within the 10-second interval since the last query
-clickhouse-cloud :) SELECT * FROM test_table LIMIT 1

--- SELECT *
--- FROM test_table
--- LIMIT 1

--- Query id: 48ae46ef-7b33-4765-affa-e47e889f48e5

--- 0 rows in set. Elapsed: 0.094 sec.

--- Received exception from server (version 23.8.1):
--- Code: 201. DB::Exception: Received from dxqjx1s5lt.eu-west-1.aws.clickhouse.cloud:9440. DB::Exception: Quota for user `user_with_quota` for 10s has been exceeded: result_rows = 11/10.
--- Interval will end at 2023-10-25 14:37:50. Name of quota template: `quota_max_10_result_rows_per_10_seconds`. (QUOTA_EXCEEDED)

--- check the time
-clickhouse-cloud :) select now()

--- SELECT now()

--- Query id: 87f190f6-3f75-4fe6-bf9c-c80ed88e179f

--- ┌───────────────now()─┐
--- │ 2023-10-25 14:37:45 │
--- └─────────────────────┘

--- 1 row in set. Elapsed: 0.001 sec.
-```

-Note that the user will need to wait another 5 seconds before they get a new 10-row result set "allowance".

-## Using Query Complexity {#using-query-complexity}

-In this example we create a role to which we'll apply a Query Complexity `SETTING` that allows only 1 row to be returned for each query.

-```sql
--- AS the privileged user
--- create a user
-clickhouse-cloud :) CREATE USER user_with_query_complexity IDENTIFIED WITH sha256_password BY 'Dr6P1S8SGaQ@u!BUAnv';

--- CREATE USER user_with_query_complexity IDENTIFIED WITH sha256_hash BY '99AB4976077304554286C43AA47C3BEDA5758EF56282C2FC90C0787DC6FE72BC' SALT '5A50D2B9B1DF7E8A1AA9A2CC00BCF802B7F605281A09E18E237447509B5C7A7C'

--- Query id: 91856182-f2bb-40cc-8902-2786beeeb93d

--- Ok.

--- 0 rows in set. Elapsed: 0.104 sec.

--- create a role with query complexity SETTINGS that allows only one row in the result set
-clickhouse-cloud :) CREATE ROLE role_with_query_complexity SETTINGS max_result_rows=1;

--- CREATE ROLE role_with_query_complexity SETTINGS max_result_rows = 1

--- Query id: ec3d89fe-cab8-4cc3-9180-da5c93519643

--- Ok.

--- 0 rows in set. Elapsed: 0.097 sec.

--- grant select privileges
-clickhouse-cloud :) GRANT SELECT ON default.* TO role_with_query_complexity;

--- GRANT SELECT ON default.* TO role_with_query_complexity

--- Query id: 230774ad-8073-4e2e-9530-3e90bce41cb1

--- Ok.

--- 0 rows in set. Elapsed: 0.097 sec.

--- grant role to the user
-clickhouse-cloud :) GRANT role_with_query_complexity TO user_with_query_complexity

--- GRANT role_with_query_complexity TO user_with_query_complexity

--- Query id: f28c7c7b-61f7-48a8-a281-1f3784764b47

--- Ok.

--- 0 rows in set. Elapsed: 0.096 sec.
-```

-Now login as the user `user_with_query_complexity`:

-```sql
--- login as the user where query complexity is applied through the role
-clickhouse-cloud :) SELECT user();

--- SELECT user()

--- Query id: 196c91fc-abff-464d-acce-6af961c233a3

--- ┌─currentUser()──────────────┐
--- │ user_with_query_complexity │
--- └────────────────────────────┘

--- 1 row in set. Elapsed: 0.001 sec.

--- list grants
-clickhouse-cloud :) SHOW GRANTS

--- SHOW GRANTS

--- Query id: 87657b99-c3d9-4ffd-90e8-488f04f7f93b

--- ┌─GRANTS─────────────────────────────────────────────────────────┐
--- │ GRANT role_with_query_complexity TO user_with_query_complexity │
--- └────────────────────────────────────────────────────────────────┘

--- 1 row in set. Elapsed: 0.001 sec.

--- attempt to query with 1 row in the result set
-clickhouse-cloud :) SELECT * FROM default.test_table LIMIT 1;

--- SELECT *
--- FROM default.test_table
--- LIMIT 1

--- Query id: 7266891b-8611-4342-81b0-fe04766e62fa

--- ┌─name─┬─age─┐
--- │      │ 200 │
--- └──────┴─────┘

--- 1 row in set. Elapsed: 0.002 sec.

--- attempt to query with more than 1 row in the result set
-clickhouse-cloud :) SELECT * FROM default.test_table LIMIT 2;

--- SELECT *
--- FROM default.test_table
--- LIMIT 2

--- Query id: ec8ecff3-f731-45bd-bb27-894ba358c7c8

--- 0 rows in set. Elapsed: 0.091 sec.

--- Received exception from server (version 23.8.1):
--- Code: 396. DB::Exception: Received from dxqjx1s5lt.eu-west-1.aws.clickhouse.cloud:9440.
--- DB::Exception: Limit for result exceeded, max rows: 1.00, current rows: 2.00. (TOO_MANY_ROWS_OR_BYTES)
-```

-Whenever you attempt to get more than 1 row in the result set, the query complexity constraint kicks in.
+- You need deduplicated results immediately
+- You can't wait for or don't want permanent optimization
+- You only occasionally need deduplicated data
+- You're working with frequently changing data

+Both are valuable tools, but they serve different purposes in ClickHouse's deduplication strategy.
