
Commit d04cfc5

Merge branch 'ClickHouse:master' into buzzhouse-update
2 parents: 7944fcb + 416b72c

152 files changed: +5927 additions, −4283 deletions


README.md
Lines changed: 0 additions & 1 deletion

```diff
@@ -51,7 +51,6 @@ Upcoming meetups
 * [Los Gatos Meetup](https://www.meetup.com/clickhouse-silicon-valley-meetup-group/events/306445660) - March 12, 2025
 * [San Francisco Meetup](https://www.meetup.com/clickhouse-silicon-valley-meetup-group/events/306046697) - March 19, 2025
 * [Delhi Meetup](https://www.meetup.com/clickhouse-delhi-user-group/events/306253492/) - March 22, 2025
-* [Zurich Meetup](https://www.meetup.com/clickhouse-switzerland-meetup-group/events/306435122/) - March 24, 2025
 * [Budapest Meetup](https://www.meetup.com/clickhouse-hungary-user-group/events/306435234/) - March 25, 2025
 * [Boston Meetup](https://www.meetup.com/clickhouse-boston-user-group/events/305882607) - March 25, 2025
 * [Sao Paulo Meetup](https://www.meetup.com/clickhouse-brasil-user-group/events/306385974/) - March 25, 2025
```

base/base/defines.h
Lines changed: 6 additions & 0 deletions

```diff
@@ -28,6 +28,12 @@
 #define NO_INLINE __attribute__((__noinline__))
 #define MAY_ALIAS __attribute__((__may_alias__))
 
+#if defined(__x86_64__) || defined(__aarch64__)
+#    define PRESERVE_MOST __attribute__((preserve_most))
+#else
+#    define PRESERVE_MOST
+#endif
+
 #if !defined(__x86_64__) && !defined(__aarch64__) && !defined(__PPC__) && !defined(__s390x__) && !(defined(__loongarch64)) && !(defined(__riscv) && (__riscv_xlen == 64))
 #    error "The only supported platforms are x86_64 and AArch64, PowerPC (work in progress), s390x (work in progress), loongarch64 (experimental) and RISC-V 64 (experimental)"
 #endif
```

ci/defs/job_configs.py
Lines changed: 1 addition & 0 deletions

```diff
@@ -476,6 +476,7 @@ class JobConfigs:
         digest_config=Job.CacheDigestConfig(
             include_paths=[
                 "./tests/queries/0_stateless/",
+                "./tests/ci/stress.py",
                 "./tests/clickhouse-test",
                 "./tests/config",
                 "./tests/*.txt",
```
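The point of this one-line change is cache invalidation: the job's digest is computed over the files matched by `include_paths`, so once `./tests/ci/stress.py` is included, editing it changes the digest and the cached job result is no longer reused. A toy sketch of the idea (not praktika's actual digest code):

```python
import hashlib

def cache_digest(files: dict[str, bytes]) -> str:
    # Hash (path, content) pairs in a stable order; any content change
    # under the included paths yields a different digest.
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(files[path])
    return h.hexdigest()

# Once stress.py is among the digest inputs, editing it invalidates the cache:
assert cache_digest({"tests/ci/stress.py": b"v1"}) != cache_digest({"tests/ci/stress.py": b"v2"})
```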

ci/praktika/json.html
Lines changed: 5 additions & 1 deletion

```diff
@@ -808,7 +808,11 @@
 
         const infoCell = document.createElement('td');
         infoCell.colSpan = row.children.length; // Make it span the full width
-        infoCell.innerHTML = infoText.replace(/\n/g, '<br>'); // Preserve line breaks
+
+        infoCell.innerHTML = '<pre></pre>';
+        // Append content securely, without interpreting HTML tags.
+        infoCell.firstChild.textContent = infoText;
+
         infoCell.classList.add('info-text');
 
         infoRow.appendChild(infoCell);
```
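The patch swaps `innerHTML` interpolation for `textContent`, which never hands user-controlled text to the HTML parser; the `<pre>` wrapper preserves line breaks without the old `<br>` substitution. The effect is the same as HTML-escaping before insertion; a rough Python illustration of why the raw interpolation was unsafe:

```python
from html import escape

info_text = 'tests failed\n<img src=x onerror="alert(1)">'

# innerHTML-style interpolation would hand the <img ...> tag to the HTML
# parser, running the onerror handler. textContent (or escaping, shown
# here as a stand-in) keeps it as inert literal text.
safe = escape(info_text)
assert "<img" not in safe and "&lt;img" in safe
```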
New file (changelog for v24.12.6.70-stable)
Lines changed: 49 additions & 0 deletions

```md
---
sidebar_position: 1
sidebar_label: 2025
---

# 2025 Changelog

### ClickHouse release v24.12.6.70-stable (834cccbc6e8) FIXME as compared to v24.12.5.81-stable (abbc2854715)

#### Performance Improvement
* Backported in [#76694](https://github.com/ClickHouse/ClickHouse/issues/76694): Fixed unnecessary contention in `parallel_hash` when `max_rows_in_join = max_bytes_in_join = 0`. [#75155](https://github.com/ClickHouse/ClickHouse/pull/75155) ([Nikita Taranov](https://github.com/nickitat)).

#### Improvement
* Backported in [#76428](https://github.com/ClickHouse/ClickHouse/issues/76428): Address some `clickhouse-disks` usability issues reported by users. Closes [#67136](https://github.com/ClickHouse/ClickHouse/issues/67136). [#73616](https://github.com/ClickHouse/ClickHouse/pull/73616) ([Daniil Ivanik](https://github.com/divanik)).
* Backported in [#76862](https://github.com/ClickHouse/ClickHouse/issues/76862): Add the ability to reload `max_remote_read_network_bandwidth_for_server` and `max_remote_write_network_bandwidth_for_server` on the fly, without restarting the server. [#74206](https://github.com/ClickHouse/ClickHouse/pull/74206) ([Kai Zhu](https://github.com/nauu)).
* Backported in [#76451](https://github.com/ClickHouse/ClickHouse/issues/76451): Allow parsing endpoints like `localhost:1234/handle` in the `postgresql` or `mysql` table functions. This fixes a regression introduced in https://github.com/ClickHouse/ClickHouse/pull/52503. [#75944](https://github.com/ClickHouse/ClickHouse/pull/75944) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Backported in [#76652](https://github.com/ClickHouse/ClickHouse/issues/76652): Added a server setting `throw_on_unknown_workload` that chooses the behavior for queries whose `workload` setting is set to an unknown value: either allow unlimited access (the default) or throw a `RESOURCE_ACCESS_DENIED` error. It is useful to force all queries to use workload scheduling. [#75999](https://github.com/ClickHouse/ClickHouse/pull/75999) ([Sergei Trifonov](https://github.com/serxa)).
* Backported in [#76671](https://github.com/ClickHouse/ClickHouse/issues/76671): Use the correct fallback when a multipart copy to S3 fails with Access Denied during a backup. Multipart copy can produce an Access Denied error when a backup is performed between buckets that have different credentials. [#76515](https://github.com/ClickHouse/ClickHouse/pull/76515) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#77200](https://github.com/ClickHouse/ClickHouse/issues/77200): Previously, a Replicated database might print credentials specified in a query to the logs. This behavior is fixed. This closes: [#77123](https://github.com/ClickHouse/ClickHouse/issues/77123). [#77133](https://github.com/ClickHouse/ClickHouse/pull/77133) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).

#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#74520](https://github.com/ClickHouse/ClickHouse/issues/74520): Skip `metadata_version.txt` while restoring parts from a backup. [#73768](https://github.com/ClickHouse/ClickHouse/pull/73768) ([Vitaly Baranov](https://github.com/vitlibar)).
* Backported in [#76527](https://github.com/ClickHouse/ClickHouse/issues/76527): Respect `materialized_views_ignore_errors` when a materialized view writes to a URL engine and there is a connectivity issue. [#75679](https://github.com/ClickHouse/ClickHouse/pull/75679) ([Christoph Wurm](https://github.com/cwurm)).
* Backported in [#76538](https://github.com/ClickHouse/ClickHouse/issues/76538): Fix `Block structure mismatch in QueryPipeline stream` error for some queries with `UNION ALL`. [#75715](https://github.com/ClickHouse/ClickHouse/pull/75715) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#76149](https://github.com/ClickHouse/ClickHouse/issues/76149): Propagate format settings to JSON-as-string formatting in the Native format. [#75832](https://github.com/ClickHouse/ClickHouse/pull/75832) ([Pavel Kruglov](https://github.com/Avogar)).
* Backported in [#76995](https://github.com/ClickHouse/ClickHouse/issues/76995): Fix data loss with async inserts enabled for `INSERT INTO ... FROM FILE ...` with unequal block sizes: if the first block's size was below `async_max_size` but the second block's was above it, the second block was not inserted and its data was left behind in `squashing`. [#76343](https://github.com/ClickHouse/ClickHouse/pull/76343) ([Han Fei](https://github.com/hanfei1991)).
* Backported in [#76401](https://github.com/ClickHouse/ClickHouse/issues/76401): Fix a logical error in index analysis when a condition in `WHERE` contains the `pointInPolygon` function. [#76360](https://github.com/ClickHouse/ClickHouse/pull/76360) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#76482](https://github.com/ClickHouse/ClickHouse/issues/76482): Removed an allocation from the signal handler. [#76446](https://github.com/ClickHouse/ClickHouse/pull/76446) ([Nikita Taranov](https://github.com/nickitat)).
* Backported in [#76966](https://github.com/ClickHouse/ClickHouse/issues/76966): Fix a `THERE_IS_NO_COLUMN` exception when selecting a boolean literal from distributed tables. [#76656](https://github.com/ClickHouse/ClickHouse/pull/76656) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#76856](https://github.com/ClickHouse/ClickHouse/issues/76856): Flush output write buffers before finalizing them. Fixes a `LOGICAL_ERROR` generated during the finalization of some output formats, e.g. `JSONEachRowWithProgressRowOutputFormat`. [#76726](https://github.com/ClickHouse/ClickHouse/pull/76726) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#77254](https://github.com/ClickHouse/ClickHouse/issues/77254): Fix a possible crash caused by bad JSON column rollback on error during async inserts. [#76908](https://github.com/ClickHouse/ClickHouse/pull/76908) ([Pavel Kruglov](https://github.com/Avogar)).
* Backported in [#77060](https://github.com/ClickHouse/ClickHouse/issues/77060): Fix sorting of `BFloat16` values. This closes [#75487](https://github.com/ClickHouse/ClickHouse/issues/75487). This closes [#75669](https://github.com/ClickHouse/ClickHouse/issues/75669). [#77000](https://github.com/ClickHouse/ClickHouse/pull/77000) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#77218](https://github.com/ClickHouse/ClickHouse/issues/77218): Fix a bug with JSON-with-Variant subcolumns by adding a check that skips ephemeral subcolumns in the part consistency check. [#72187](https://github.com/ClickHouse/ClickHouse/issues/72187). [#77034](https://github.com/ClickHouse/ClickHouse/pull/77034) ([Smita Kulkarni](https://github.com/SmitaRKulkarni)).
* Backported in [#77246](https://github.com/ClickHouse/ClickHouse/issues/77246): Fix a crash during Kafka table creation with an exception. [#77121](https://github.com/ClickHouse/ClickHouse/pull/77121) ([Pavel Kruglov](https://github.com/Avogar)).
* Backported in [#77391](https://github.com/ClickHouse/ClickHouse/issues/77391): `SELECT toBFloat16(-0.0) == toBFloat16(0.0)` now correctly returns `true` (previously `false`). This makes the behavior consistent with `Float32` and `Float64`. [#77290](https://github.com/ClickHouse/ClickHouse/pull/77290) ([Shankar Iyer](https://github.com/shankar-iyer)).
* Backported in [#77331](https://github.com/ClickHouse/ClickHouse/issues/77331): Fix comparison between tuples with nullable elements inside and strings. [#77323](https://github.com/ClickHouse/ClickHouse/pull/77323) ([Alexey Katsman](https://github.com/alexkats)).

#### NOT FOR CHANGELOG / INSIGNIFICANT
* Backported in [#76506](https://github.com/ClickHouse/ClickHouse/issues/76506): Add a `prefetch` method to `ReadBufferFromEncryptedFile`. [#76322](https://github.com/ClickHouse/ClickHouse/pull/76322) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#76463](https://github.com/ClickHouse/ClickHouse/issues/76463): Fix `setReadUntilPosition` in `AsynchronousBoundedReadBuffer`. [#76429](https://github.com/ClickHouse/ClickHouse/pull/76429) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#76600](https://github.com/ClickHouse/ClickHouse/issues/76600): Use `MultiRead` when querying `system.distributed_ddl_queue`. [#76575](https://github.com/ClickHouse/ClickHouse/pull/76575) ([Antonio Andelic](https://github.com/antonio2368)).
* Backported in [#76892](https://github.com/ClickHouse/ClickHouse/issues/76892): Add a log message for HTTP Bad Request. [#76594](https://github.com/ClickHouse/ClickHouse/pull/76594) ([Christoph Wurm](https://github.com/cwurm)).
* Backported in [#76821](https://github.com/ClickHouse/ClickHouse/issues/76821): CI: Disable cross-compile for ARM in release and backport builds. [#76808](https://github.com/ClickHouse/ClickHouse/pull/76808) ([Max Kainov](https://github.com/maxknv)).
* Backported in [#76957](https://github.com/ClickHouse/ClickHouse/issues/76957): Allow empty chunks in `FinishSortingTransform`. [#76919](https://github.com/ClickHouse/ClickHouse/pull/76919) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#77043](https://github.com/ClickHouse/ClickHouse/issues/77043): Fix fast test `02783_parsedatetimebesteffort` for leap years. [#76940](https://github.com/ClickHouse/ClickHouse/pull/76940) ([Han Fei](https://github.com/hanfei1991)).
* Backported in [#77080](https://github.com/ClickHouse/ClickHouse/issues/77080): Increase the log level for dictionary loading. [#77052](https://github.com/ClickHouse/ClickHouse/pull/77052) ([Michael Lex](https://github.com/mlex)).
* Backported in [#77265](https://github.com/ClickHouse/ClickHouse/issues/77265): Fix an uninitialized value `CoordinationZnode::last_success_duration` in `RefreshTask`. [#77174](https://github.com/ClickHouse/ClickHouse/pull/77174) ([Tuan Pham Anh](https://github.com/tuanpach)).
```
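The `toBFloat16(-0.0)` fix above follows IEEE 754 semantics: negative and positive zero have distinct encodings but must compare equal. A short Python sketch of that distinction (the `to_bfloat16_bits` helper is illustrative, not a ClickHouse API):

```python
import struct

def to_bfloat16_bits(x: float) -> int:
    """Illustrative bfloat16 conversion: keep the top 16 bits of the float32 encoding."""
    (f32_bits,) = struct.unpack("<I", struct.pack("<f", x))
    return f32_bits >> 16

neg_zero, pos_zero = to_bfloat16_bits(-0.0), to_bfloat16_bits(0.0)

# The encodings differ only in the sign bit ...
assert neg_zero == 0x8000 and pos_zero == 0x0000
# ... yet IEEE 754 comparison treats the two zeros as equal, which is the
# behavior the fix restores for BFloat16:
assert -0.0 == 0.0
```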

docs/en/engines/table-engines/integrations/s3.md
Lines changed: 7 additions & 2 deletions

````diff
@@ -174,6 +174,10 @@ Received exception from server (version 23.4.1):
 Code: 48. DB::Exception: Received from localhost:9000. DB::Exception: Reading from a partitioned S3 storage is not implemented yet. (NOT_IMPLEMENTED)
 ```
 
+## Inserting Data {#inserting-data}
+
+Note that rows can only be inserted into new files. There are no merge cycles or file split operations. Once a file is written, subsequent inserts will fail. To avoid this you can use the `s3_truncate_on_insert` and `s3_create_new_file_on_insert` settings. See more details [here](/integrations/s3#inserting-data).
+
 ## Virtual columns {#virtual-columns}
 
 - `_path` — Path to the file. Type: `LowCardinality(String)`.
@@ -340,12 +344,12 @@ FROM s3(
 );
 ```
 
-:::note
+:::note
 ClickHouse supports three archive formats:
 ZIP
 TAR
 7Z
-While ZIP and TAR archives can be accessed from any supported storage location, 7Z archives can only be read from the local filesystem where ClickHouse is installed.
+While ZIP and TAR archives can be accessed from any supported storage location, 7Z archives can only be read from the local filesystem where ClickHouse is installed.
 :::
@@ -367,3 +371,4 @@ For details on optimizing the performance of the s3 function see [our detailed g
 ## See also {#see-also}
 
 - [s3 table function](../../../sql-reference/table-functions/s3.md)
+- [Integrating S3 with ClickHouse](/integrations/s3)
````

(The `:::note` hunk only removes trailing whitespace.)

docs/en/engines/table-engines/mergetree-family/annindexes.md
Lines changed: 5 additions & 13 deletions

```diff
@@ -120,16 +120,11 @@ additional techniques are recommended to speed up index creation:
 - Index creation can be parallelized. The maximum number of threads can be configured using server setting
   [max_build_vector_similarity_index_thread_pool_size](../../../operations/server-configuration-parameters/settings.md#server_configuration_parameters_max_build_vector_similarity_index_thread_pool_size).
 - Index creation on newly inserted parts may be disabled using setting `materialize_skip_indexes_on_insert`. Search on such parts will fall
-  back to exact search but since inserted parts are typically small compared to the total table size, the performance impact is negligible.
-- ClickHouse merges multiple parts incrementally in the background into bigger parts. These new parts are potentially merged later into even
-  bigger parts. Each merge re-builds the vector similarity index of the output part (as well as other skip indexes) every time from
-  scratch. This potentially wastes work for creating vector similarity indexes. To avoid that, it is possible to suppress the creation of
-  vector similarity indexes during merge using merge tree setting
-  [materialize_skip_indexes_on_merge](../../../operations/settings/merge-tree-settings.md#materialize_skip_indexes_on_merge). This, in
-  conjunction with statement [ALTER TABLE \[...\] MATERIALIZE INDEX
-  \[...\]](../../../sql-reference/statements/alter/skipping-index.md#materialize-index), provides explicit control over the life cycle of
-  vector similarity indexes. For example, index building can be deferred to periods of low load (e.g. weekends) or after a large data
-  ingestion.
+  back to exact search but as inserted parts are typically small compared to the total table size, the performance impact is negligible.
+- As parts are incrementally merged into bigger parts, and these new parts are merged into even bigger parts ("write amplification"),
+  vector similarity indexes are possibly built multiple times for the same vectors. To avoid that, you may suppress merges during insert
+  using the statement [`SYSTEM STOP MERGES`](../../../sql-reference/statements/system.md), and start merges again once all data has been
+  inserted using `SYSTEM START MERGES`.
 
 Vector similarity indexes support this type of query:
 
@@ -146,9 +141,6 @@ To search using a different value of HNSW parameter `hnsw_candidate_list_size_fo
 original [HNSW paper](https://doi.org/10.1109/TPAMI.2018.2889473), run the `SELECT` query with `SETTINGS hnsw_candidate_list_size_for_search
 = <value>`.
 
-Repeated reads from vector similarity indexes benefit from a large skipping index cache. If needed, you can increase the default cache size
-using server setting [skipping_index_cache_size](../../../operations/server-configuration-parameters/settings.md#skipping_index_cache_size).
-
 **Restrictions**: Approximate vector search algorithms require a limit, hence queries without `LIMIT` clause cannot utilize vector
 similarity indexes. The limit must also be smaller than setting `max_limit_for_ann_queries` (default: 100).
```
docs/en/operations/caches.md
Lines changed: 0 additions & 1 deletion

```diff
@@ -12,7 +12,6 @@ Main cache types:
 
 - `mark_cache` — Cache of marks used by table engines of the [MergeTree](../engines/table-engines/mergetree-family/mergetree.md) family.
 - `uncompressed_cache` — Cache of uncompressed data used by table engines of the [MergeTree](../engines/table-engines/mergetree-family/mergetree.md) family.
-- `skipping_index_cache` — Cache of in-memory skipping index granules used by table engines of the [MergeTree](../engines/table-engines/mergetree-family/mergetree.md) family.
 - Operating system page cache (used indirectly, for files with actual data).
 
 Additional cache types:
```

docs/en/operations/cluster-discovery.md
Lines changed: 42 additions & 0 deletions

````diff
@@ -119,6 +119,48 @@ To enable observer mode, include the `<observer/>` tag within the `<discovery>`
 ```
 
 
+### Discovery of clusters {#discovery-of-clusters}
+
+Sometimes you may need to add and remove not only hosts in clusters, but clusters themselves. You can use the `<multicluster_root_path>` node with the root path for several clusters:
+
+```xml
+<remote_servers>
+    <some_unused_name>
+        <discovery>
+            <multicluster_root_path>/clickhouse/discovery</multicluster_root_path>
+            <observer/>
+        </discovery>
+    </some_unused_name>
+</remote_servers>
+```
+
+In this case, when some other host registers itself with the path `/clickhouse/discovery/some_new_cluster`, a cluster with the name `some_new_cluster` will be added.
+
+You can use both features simultaneously: a host can register itself in the cluster `my_cluster` and discover any other clusters:
+
+```xml
+<remote_servers>
+    <my_cluster>
+        <discovery>
+            <path>/clickhouse/discovery/my_cluster</path>
+        </discovery>
+    </my_cluster>
+    <some_unused_name>
+        <discovery>
+            <multicluster_root_path>/clickhouse/discovery</multicluster_root_path>
+            <observer/>
+        </discovery>
+    </some_unused_name>
+</remote_servers>
+```
+
+Limitations:
+- You can't use both `<path>` and `<multicluster_root_path>` in the same `remote_servers` subtree.
+- `<multicluster_root_path>` can only be used together with `<observer/>`.
+- The last part of the path from Keeper is used as the cluster name, while during registration the name is taken from the XML tag.
+
+
 ## Use-Cases and Limitations {#use-cases-and-limitations}
 
 As nodes are added or removed from the specified ZooKeeper path, they are automatically discovered or removed from the cluster without the need for configuration changes or server restarts.
````
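The last limitation in the new section is worth a concrete illustration: discovery derives a cluster's name from the final component of its Keeper path, whereas self-registration uses the XML tag name. A hypothetical helper (not ClickHouse code) showing the path-derived case:

```python
def cluster_name_from_path(zk_path: str) -> str:
    # Discovery uses the last path component under the multicluster root
    # as the cluster name.
    return zk_path.rstrip("/").rsplit("/", 1)[-1]

assert cluster_name_from_path("/clickhouse/discovery/some_new_cluster") == "some_new_cluster"
```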
