docs/en/guides/developer/lightweight-update.md
+2 -6 (2 additions & 6 deletions)
@@ -5,12 +5,8 @@ title: Lightweight Update
5
5
keywords: [lightweight update]
6
6
---
7
7
8
-
import CloudAvailableBadge from '@theme/badges/CloudAvailableBadge';
9
-
10
8
## Lightweight Update
11
9
12
-
<CloudAvailableBadge/>
13
-
14
10
When lightweight updates are enabled, updated rows are marked as updated immediately and subsequent `SELECT` queries will automatically return with the changed values. When lightweight updates are not enabled, you may have to wait for your mutations to be applied via a background process to see the changed values.
15
11
16
12
Lightweight updates can be enabled for `MergeTree`-family tables by enabling the query-level setting `apply_mutations_on_fly`.
@@ -23,7 +19,7 @@ SET apply_mutations_on_fly = 1;
23
19
24
20
Let's create a table and run some mutations:
25
21
```sql
26
-
CREATE TABLE test_on_fly_mutations (id UInt64, v String)
22
+
CREATE TABLE test_on_fly_mutations (id UInt64, v String)
27
23
ENGINE = MergeTree ORDER BY id;
28
24
29
25
-- Disable background materialization of mutations to showcase
@@ -93,4 +89,4 @@ These behaviours are controlled by the following settings:
93
89
- `mutations_execute_nondeterministic_on_initiator` - if true, non-deterministic functions are executed on the initiator replica and are replaced as literals in `UPDATE` and `DELETE` queries. Default value: `false`.
94
90
- `mutations_execute_subqueries_on_initiator` - if true, scalar subqueries are executed on the initiator replica and are replaced as literals in `UPDATE` and `DELETE` queries. Default value: `false`.
95
91
- `mutations_max_literal_size_to_replace` - The maximum size of serialized literals in bytes to replace in `UPDATE` and `DELETE` queries. Default value: `16384` (16 KiB).
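As an illustration only (not part of this change), the settings listed above could be combined with the `test_on_fly_mutations` table from the earlier example roughly as follows. The specific `UPDATE` expressions and row ids are assumptions chosen for the sketch, and exact behaviour may vary by ClickHouse version.

```sql
-- Sketch under stated assumptions: the UPDATE expressions and ids are illustrative.
SET apply_mutations_on_fly = 1;

-- Evaluate non-deterministic functions once on the initiator replica
-- and substitute the result as a literal in the mutation.
SET mutations_execute_nondeterministic_on_initiator = 1;

ALTER TABLE test_on_fly_mutations
    UPDATE v = toString(now()) WHERE id = 1;

-- Evaluate scalar subqueries on the initiator replica as well.
SET mutations_execute_subqueries_on_initiator = 1;

ALTER TABLE test_on_fly_mutations
    UPDATE v = (SELECT toString(count()) FROM test_on_fly_mutations) WHERE id = 2;
```

Per the settings list above, a serialized literal is only substituted if it fits within `mutations_max_literal_size_to_replace`, so subqueries returning large results may not qualify.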
docs/en/managing-data/updating-data/overview.md
+10 -14 (10 additions & 14 deletions)
@@ -5,11 +5,9 @@ description: How to update data in ClickHouse
5
5
keywords: [update, updating data]
6
6
---
7
7
8
-
import CloudAvailableBadge from '@theme/badges/CloudAvailableBadge';
9
-
10
8
## Differences between updating data in ClickHouse and OLTP databases
11
9
12
-
When it comes to handling updates, ClickHouse and OLTP databases diverge significantly due to their underlying design philosophies and target use cases. For example, PostgreSQL, a row-oriented, ACID-compliant relational database, supports robust and transactional update and delete operations, ensuring data consistency and integrity through mechanisms like Multi-Version Concurrency Control (MVCC). This allows for safe and reliable modifications even in high-concurrency environments.
10
+
When it comes to handling updates, ClickHouse and OLTP databases diverge significantly due to their underlying design philosophies and target use cases. For example, PostgreSQL, a row-oriented, ACID-compliant relational database, supports robust and transactional update and delete operations, ensuring data consistency and integrity through mechanisms like Multi-Version Concurrency Control (MVCC). This allows for safe and reliable modifications even in high-concurrency environments.
13
11
14
12
Conversely, ClickHouse is a column-oriented database optimized for read-heavy analytics and high-throughput, append-only operations. While it does natively support in-place updates and deletes, they must be used carefully to avoid high I/O. Alternatively, tables can be restructured to convert deletes and updates into append operations that are processed asynchronously and/or at read time, reflecting the focus on high-throughput data ingestion and efficient query performance over real-time data manipulation.
15
13
@@ -24,15 +22,15 @@ In summary, update operations should be issued carefully, and the mutations queu
|[Update mutation](/en/sql-reference/statements/alter/update)|`ALTER TABLE [table] UPDATE`| Use when data must be updated to disk immediately (e.g. for compliance). Negatively affects `SELECT` performance. |
27
-
|[Lightweight update](/en/guides/developer/lightweight-update)(ClickHouse Cloud)|`ALTER TABLE [table] UPDATE`| Enable using `SET apply_mutations_on_fly = 1;`. Use when updating small amounts of data. Rows are immediately returned with updated data in all subsequent `SELECT` queries but are initially only internally marked as updated on disk. |
25
+
|[Lightweight update](/en/guides/developer/lightweight-update)|`ALTER TABLE [table] UPDATE`| Enable using `SET apply_mutations_on_fly = 1;`. Use when updating small amounts of data. Rows are immediately returned with updated data in all subsequent `SELECT` queries but are initially only internally marked as updated on disk. |
28
26
|[ReplacingMergeTree](/en/engines/table-engines/mergetree-family/replacingmergetree)|`ENGINE = ReplacingMergeTree`| Use when updating large amounts of data. This table engine is optimized for data deduplication on merges. |
29
27
|[CollapsingMergeTree](/en/engines/table-engines/mergetree-family/collapsingmergetree)|`ENGINE = CollapsingMergeTree(Sign)`| Use when updating individual rows frequently, or for scenarios where you need to maintain the latest state of objects that change over time. For example, tracking user activity or article stats. |
30
28
31
29
Here is a summary of the different ways to update data in ClickHouse:
32
30
33
31
## Update Mutations
34
32
35
-
Update mutations can be issued through an `ALTER TABLE … UPDATE` command, e.g.
33
+
Update mutations can be issued through an `ALTER TABLE … UPDATE` command, e.g.
36
34
37
35
```sql
38
36
ALTER TABLE posts_temp
@@ -44,9 +42,7 @@ Read more about [update mutations](/en/sql-reference/statements/alter/update).
44
42
45
43
## Lightweight Updates
46
44
47
-
<CloudAvailableBadge />
48
-
49
-
Lightweight updates provide a mechanism to update rows such that they are updated immediately, and subsequent `SELECT` queries will automatically return with the changed values (this incurs an overhead and will slow queries). This effectively addresses the atomicity limitation of normal mutations. We show an example below:
45
+
Lightweight updates provide a mechanism to update rows such that they are updated immediately, and subsequent `SELECT` queries will automatically return with the changed values (this incurs an overhead and will slow queries). This effectively addresses the atomicity limitation of normal mutations. We show an example below:
50
46
51
47
```sql
52
48
SET apply_mutations_on_fly = 1;
@@ -62,7 +58,7 @@ WHERE Id = 404346
62
58
1 row in set. Elapsed: 0.115 sec. Processed 59.55 million rows, 238.25 MB (517.83 million rows/s., 2.07 GB/s.)
63
59
Peak memory usage: 113.65 MiB.
64
60
65
-
-- increment count
61
+
-- increment count
66
62
ALTER TABLE posts
67
63
(UPDATE ViewCount = ViewCount + 1 WHERE Id = 404346)
68
64
@@ -84,11 +80,11 @@ Read more about [lightweight updates](/en/guides/developer/lightweight-update).
84
80
## Collapsing Merge Tree
85
81
86
82
Stemming from the idea that updates are expensive but inserts can be leveraged to perform updates,
87
-
the [`CollapsingMergeTree`](/en/engines/table-engines/mergetree-family/collapsingmergetree) table engine
83
+
the [`CollapsingMergeTree`](/en/engines/table-engines/mergetree-family/collapsingmergetree) table engine
88
84
can be used together with a `sign` column as a way to tell ClickHouse to update a specific row by collapsing (deleting)
89
-
a pair of rows with sign `1` and `-1`.
90
-
If `-1` is inserted for the `sign` column, the whole row will be deleted.
91
-
If `1` is inserted for the `sign` column, ClickHouse will keep the row.
85
+
a pair of rows with sign `1` and `-1`.
86
+
If `-1` is inserted for the `sign` column, the whole row will be deleted.
87
+
If `1` is inserted for the `sign` column, ClickHouse will keep the row.
92
88
Rows to update are identified based on the sorting key specified in the `ORDER BY` clause when creating the table.
93
89
94
90
```sql
@@ -120,7 +116,7 @@ HAVING sum(Sign) > 0
120
116
```
121
117
122
118
:::note
123
-
The approach above for updating requires users to maintain state client side.
119
+
The approach above for updating requires users to maintain state client side.
124
120
While this is most efficient from ClickHouse's perspective, it can be complex to work with at scale.