forked from ClickHouse/ClickHouse
ClickHouse 25.8 LTS #22
Draft: tilman-aiven wants to merge 62 commits into v25.8.12.129-lts-aiven from v25.8.12.129-lts-aiven-core-replication
Conversation
In containerized/cloud environments, the system hostname (e.g., pod-abc123) is not the network-accessible address. This causes replica communication to fail. Use the explicitly configured interserver_http_host from the config instead of the system hostname. The hostname is part of the Host ID (hostname:port:database_uuid), and the replicated database uses the Host ID for ZooKeeper registration, replica discovery, and DDL coordination. Co-authored-by: Kevin Michel <[email protected]>
Enable internal_replication=true for clusters created programmatically by DatabaseReplicated to improve performance and consistency of Distributed tables over Replicated databases. Unlike statically configured clusters in config.xml, DatabaseReplicated creates clusters dynamically at runtime. This change ensures these programmatically created clusters also benefit from internal replication. Changes: - Add internal_replication field to ClusterConnectionParameters struct - Set internal_replication=true when creating cluster in DatabaseReplicated - Update both Cluster constructors to pass internal_replication to addShard() This ensures that Distributed tables over Replicated databases write to one replica per shard, allowing ReplicatedMergeTree to handle replication asynchronously, instead of writing to all replicas directly. Benefits: - Reduces network traffic (1× instead of N× writes per shard) - Improves performance (one replica processes INSERT, others replicate async) - Better consistency (uses ReplicatedMergeTree replication mechanism) Co-authored-by: Kevin Michel <[email protected]>
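A minimal sketch of the user-facing effect (all names are illustrative; it assumes the cluster that DatabaseReplicated creates is addressable under the database name, as reported in system.clusters):
CREATE DATABASE rdb ENGINE = Replicated('/clickhouse/databases/rdb', '{shard}', '{replica}');
CREATE TABLE rdb.events (id UInt64, ts DateTime) ENGINE = MergeTree ORDER BY id;
-- With internal_replication now enabled for the programmatically created
-- cluster, each INSERT through this Distributed table goes to one replica
-- per shard and ReplicatedMergeTree replicates the part asynchronously.
CREATE TABLE rdb.events_dist AS rdb.events
ENGINE = Distributed(rdb, rdb, events, rand());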
Add support for altering DatabaseReplicated settings at runtime without
requiring database recreation. This is particularly important for
rotating cluster_secret, which previously required dropping and
recreating the entire database.
Changes:
- Override applySettingsChanges() in DatabaseReplicated to handle
MODIFY SETTING commands
- Add applyChange() and has() methods to DatabaseReplicatedSettings
wrapper to expose BaseSettings functionality
- Invalidate cached cluster when cluster_secret is changed to ensure
new authentication credentials are used
Implementation details:
- Thread-safe: protected by DatabaseReplicated::mutex
- Validates setting existence before applying changes
- Automatically resets cluster cache when cluster_secret changes
- Supports all DatabaseReplicatedSettings (max_broken_tables_ratio,
max_replication_lag_to_enqueue, collection_name, etc.)
Usage:
ALTER DATABASE MODIFY SETTING cluster_secret='new_secret'
Co-authored-by: Kevin Michel <[email protected]>
Automatically rewrite non-replicated MergeTree engine names to their Replicated equivalents when creating tables in DatabaseReplicated databases. This ensures all tables in a replicated database are properly replicated, even if users specify ENGINE = MergeTree. Changes: - Add rewriteUnreplicatedMergeTreeEngines() method to StorageFactory - Call rewrite method before engine name extraction in StorageFactory::get() - Rewrite applies only on secondary queries (replica execution) - Supports all MergeTree variants (ReplacingMergeTree, SummingMergeTree, etc.) Implementation details: - Checks if engine name ends with MergeTree (catches all variants) - Verifies engine is not already replicated (starts with Replicated) - Only rewrites when query_kind is SECONDARY_QUERY and database engine is Replicated - Modifies AST in-place before storage engine instantiation This allows users to write ENGINE = MergeTree in replicated databases and automatically get ReplicatedMergeTree behavior without manual engine name specification. Co-authored-by: Kevin Michel <[email protected]> Co-authored-by: Joe Lynch <[email protected]> Co-authored-by: Dmitry Potepalov <[email protected]>
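A hedged example of the resulting behavior (database and table names are illustrative):
-- Declared as plain MergeTree inside a Replicated database...
CREATE TABLE rdb.metrics (d Date, k UInt64, v Float64)
ENGINE = MergeTree
ORDER BY (d, k);
-- ...the engine name is rewritten during secondary-query execution, so the
-- table is instantiated as ReplicatedMergeTree. Variants such as
-- ReplacingMergeTree or SummingMergeTree are rewritten to their Replicated*
-- counterparts as well.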
Tolerate ZooKeeper restarts with increased retries and exponential backoff. When ZooKeeper restarts (especially during version upgrades), it can be unavailable for approximately 6 seconds. ClickHouse previously failed queries if ZooKeeper was not available within ~3 seconds, leading to inconsistent database state because DDL operations are not fully atomic. This change improves ZooKeeper connection resilience by: - Increasing minimum retry attempts from 3 to 6 - Adding exponential backoff between retry attempts (100ms, 200ms, 400ms...) - Capping maximum backoff at 10 seconds to prevent excessive delays The total retry window now covers typical ZooKeeper restart times (~6 seconds), allowing ClickHouse to successfully reconnect after ZooKeeper restarts without requiring manual intervention. This is particularly important during version upgrades when ZooKeeper nodes restart sequentially, as it prevents DDL operations from failing mid-execution and leaving the database in an inconsistent state. Co-authored-by: Kevin Michel <[email protected]>
When ClickHouse restarts, tables in DatabaseReplicated databases are loaded
from metadata files using ATTACH operations. If a table's ZooKeeper path
contains the {shard} macro (e.g., /clickhouse/tables/{uuid}/{shard}),
the macro expansion fails during server startup because shard and replica
information is not populated in the MacroExpansionInfo structure.
The issue occurs because is_replicated_database is only true for
SECONDARY_QUERY operations (queries executed from the replicated queries
log in ZooKeeper). During server restart, ATTACH operations are not
SECONDARY_QUERY operations, so is_replicated_database is false, and the
shard/replica information is not populated, causing the error:
Code: 62. DB::Exception: No macro 'shard' in config while processing
substitutions in '/clickhouse/tables/{uuid}/{shard}'
This fix adds query.attach as an additional condition to populate shard
and replica information during ATTACH operations, ensuring that macro
expansion succeeds during server startup while maintaining the existing
behavior for replication queries.
The change is safe because:
- query.attach is only true during server startup when loading tables
from disk metadata
- getReplicatedDatabaseShardName() and getReplicatedDatabaseReplicaName()
use assert_cast, so they will fail fast if called on a non-replicated
database
- This pattern is already used elsewhere in the codebase (e.g., line 28
for allow_uuid_macro)
This restores functionality that previously worked in ClickHouse 21.x with
the custom {shard_name} macro, which was removed in later versions.
Co-authored-by: Kevin Michel <[email protected]>
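For reference, the failing case is a table whose ZooKeeper path uses the {shard} macro, as in this hedged sketch (table name is illustrative; the path matches the error above):
CREATE TABLE rdb.t (id UInt64)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
ORDER BY id;
-- Before the fix, re-attaching this table at server startup failed with
-- "No macro 'shard' in config"; with query.attach taken into account, the
-- shard and replica names of the Replicated database are used for expansion.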
When adding a new replica to an existing DatabaseReplicated cluster, the recoverLostReplica() function reads table metadata from ZooKeeper and recreates tables. However, the metadata stored in ZooKeeper contains only the CREATE TABLE statement and table settings, not the global settings that were active during original table creation. If a table was created with the DEFLATE_QPL codec (which requires the enable_deflate_qpl_codec global setting), a new replica would fail to create the table during recovery because this setting is not enabled. This fix explicitly enables enable_deflate_qpl_codec in the recovery query context, ensuring that tables using this codec can be successfully recreated on new replicas. Co-authored-by: Kevin Michel <[email protected]>
If a replica is struggling to execute all tasks in its replication queue, the queue can grow to a huge size and becomes even slower to manage as it gets larger (it's a std::list). This queue is also present in ZooKeeper, for each replica. There is also a shared queue, called the log, which is used to communicate changes between replicas. One replica pushes an item onto the log as part of executing a query, then all other replicas read that log item and re-push it into their own queue. Each replica monitors the log size, and replicas are deemed lost when they are too late to consume log items. However, the replicas do not monitor the size of the other replicas' queues. If a replica is pushing lots of items into the log and another replica is able to copy them into its own queue, but too slow to consume the queue items, then we have a problem. At some point, if the imbalance persists, ZooKeeper will crash because it won't have enough memory to hold all the queues. It will be very hard to recover: ZooKeeper itself often corrupts its state when OOMing. Manually cleaning hundreds of thousands of items in the queues to know which ones can be removed and which ones are necessary to keep the replicated data in a consistent state is almost impossible. To fix that, we reuse the infrastructure that delays insert queries when there are too many parts. The same logic is applied when any replica's queue size is too large for the table, or when the grand total of all replicas' queue sizes, over all tables, is too large. We also do the same to fail queries if either counter reaches a second threshold, exactly like it's done with the parts count. In each replica, a thread updates a local copy of the maximum queue size across all replicas. A map is added to the context to keep track of these per-storage maxima. These values are then used to gate the inserts. We do it like that to avoid adding many ZooKeeper queries in the hot path, and we don't need a very accurate queue size to get the feedback loop we need. Changes: - Add ReplicatedMergeTreeQueueSizeThread to periodically monitor queue sizes - Add settings: queue_size_to_delay_insert, queue_size_to_throw_insert, queues_total_size_to_delay_insert, queues_total_size_to_throw_insert - Add metric: ReplicatedQueuesTotalSize to track total queue size across all tables - Extend delayInsertOrThrowIfNeeded() to check queue sizes in addition to parts count - Add Context methods to track per-storage maximum queue sizes - Integrate queue size monitoring into StorageReplicatedMergeTree startup and shutdown lifecycle Co-authored-by: Kevin Michel <[email protected]>
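A hedged sketch of how the new knobs surface (it assumes the total-queue-size gauge is registered as a CurrentMetrics entry and that the per-table thresholds are exposed as MergeTree-level settings alongside parts_to_delay_insert; table name is illustrative):
-- Observe the new gauge for the total queue size across all tables:
SELECT value FROM system.metrics WHERE metric = 'ReplicatedQueuesTotalSize';
-- Hypothetical per-table override of the delay threshold:
ALTER TABLE rdb.events MODIFY SETTING queue_size_to_delay_insert = 500;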
Previously, MOVE PARTITION commands were not replicated through the DatabaseReplicated DDL log, causing data inconsistency when moving partitions between tables across replicas in a DatabaseReplicated cluster. The issue occurred because shouldReplicateQuery() only checked for AlterCommand (metadata changes) but not for PartitionCommand::MOVE_PARTITION. Other partition commands like DROP PARTITION and ATTACH PARTITION are handled by ReplicatedMergeTree's own replication mechanism, but MOVE PARTITION TO TABLE requires database-level coordination to ensure both source and destination tables are updated consistently across all replicas. Changes: - Add PartitionCommands.h include - Extend shouldReplicateQuery() to detect MOVE_PARTITION commands - Return true for MOVE_PARTITION to enable database-level replication This ensures that when a partition is moved between tables, all replicas execute the operation atomically, maintaining data consistency across the cluster. Co-authored-by: Joe Lynch <[email protected]>
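Example of an operation that is now replicated through the database DDL log (table and partition names are illustrative):
ALTER TABLE rdb.events_2024 MOVE PARTITION '2024-01' TO TABLE rdb.events_archive;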
Reduce the default value of the logs_to_keep setting from 1000 to 300 for DatabaseReplicated databases. This reduces ZooKeeper resource consumption by ~70% while maintaining a 6x safety margin over max_replication_lag_to_enqueue (50). Context: Previously, we implemented a server-level setting (replicated_database_logs_to_keep) to centralize control of this value. However, after analysis, we determined that: 1. Customers do not have ALTER_DATABASE_SETTINGS permission, so they cannot modify database settings via ALTER DATABASE MODIFY SETTING 2. The simpler approach of changing the database-level default is sufficient 3. No additional readonly checks are needed since access control already prevents customer modifications This change affects only newly created databases. Existing databases retain their current logs_to_keep value stored in ZooKeeper. The default value of 300 provides an adequate recovery buffer while significantly reducing ZooKeeper memory usage in multi-database managed provider environments. Co-authored-by: Khatskevich <[email protected]>
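A hedged sketch of overriding the new default at creation time, assuming DatabaseReplicated settings can be supplied in the CREATE DATABASE SETTINGS clause (names are illustrative):
CREATE DATABASE rdb
ENGINE = Replicated('/clickhouse/databases/rdb', '{shard}', '{replica}')
SETTINGS logs_to_keep = 1000;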
Main service users are granted SHOW DATABASES access because it is necessary for their operation; it is implicitly granted by ClickHouse when giving access to a database. However, we do not want to give them access to SHOW CREATE DATABASE. This query shows the entire create statement, unredacted. This is a useful feature for superusers, but it can leak credentials to other users. Also, only show the create query in system.tables to users that were able to create that table. Co-authored-by: Kevin Michel <[email protected]>
This change enables S3 object storage to use custom CA certificates for HTTPS connections, which is useful for testing environments or private deployments where certificates are not signed by public CAs. The implementation adds a configuration option that can be specified in S3 disk settings. When provided, this path is used to create a custom SSL context for all HTTPS connections to that S3 endpoint. Key changes: - Added field to S3Settings struct with config loading, serialization, and deserialization support - Extended PocoHTTPClientConfiguration to store and pass ca_path - Updated ClientFactory::createClientConfiguration to accept ca_path - Modified PocoHTTPClient to create SSL context from ca_path when provided - Extended HTTP layer (HTTPCommon, HTTPConnectionPool) to accept and use custom SSL contexts - Updated all makeHTTPSession call sites to match new signature - Preserved ca_path through credentials provider chain Backwards compatibility: - ca_path is optional and defaults to std::nullopt - Serialization includes backwards compatibility handling for old data formats without ca_path - All existing call sites continue to work with default (empty) context parameter Co-authored-by: Kevin Michel <[email protected]>
This change enables Azure Blob Storage to use custom CA certificates for HTTPS connections, which is useful for testing environments or private deployments where certificates are not signed by public CAs. The implementation adds a configuration option that can be specified in Azure Blob Storage disk settings. When provided, this path is used to configure Curl's CAInfo option (Azure SDK uses Curl underneath) for all HTTPS connections to that Azure endpoint. Key changes: - Added field to RequestSettings struct - Updated getRequestSettings() to read ca_path from config - Modified getClientOptions() to set curl_options.CAInfo when curl_ca_path is provided - Endpoint-specific settings automatically supported through existing getRequestSettings() call Implementation details: - Uses Curl's built-in CAInfo option via Azure SDK's CurlTransportOptions - Simpler than S3 implementation as no HTTP layer modifications needed - Azure SDK handles HTTP/HTTPS layer, so we only configure Curl options This commit is based on the original patch: 0049-Custom_Azure_certificate_authority.patch Co-authored-by: Kevin Michel <[email protected]>
This change prevents users from bypassing default profile restrictions by creating new users/roles that use less restrictive profiles (like admin profile). The security issue: Normally, ClickHouse prevents switching to settings profiles that are less restrictive than your current profile by verifying the new profiles against the constraints of the current profile. However, if a user has permission to create new users, they can create a user that directly uses the admin profile, then connect directly with this user. This way they can skip the verification that happens while switching profiles and escape the constraints of the default profile. The solution: Enforce that all created profiles must inherit from the default profile (or one of its descendants), unless the current user has the `allow_non_default_profile` setting enabled. This works because the admin profile is pre-created using users.xml and not via SQL, so it can already exist and be configured with `allow_non_default_profile` set to true, while the pre-created default profile has `allow_non_default_profile` set to false. Key changes: - Added `allow_non_default_profile` setting to control profile creation - Added methods to AccessControl and SettingsProfilesCache to check profile inheritance hierarchy - Added circular dependency detection when creating/altering profiles - Auto-inject default profile as parent if no parent is specified and user doesn't have `allow_non_default_profile` permission - Added validation in SettingsConstraints to ensure parent profiles inherit from default Co-authored-by: Kevin Michel <[email protected]>
This commit adds support for delegating AWS S3 signature generation to
an external HTTP service. This enables use cases where signature
generation needs to be controlled by a proxy or external service for
security, compliance, or monitoring purposes.
The process is accessed over HTTP and the URL to access it can be
configured with the signature_delegation_url parameter in the S3 disk
configuration.
ClickHouse will make a POST request with a JSON body:
{
"canonicalRequest": "PUT\n..."
}
And expects a JSON response with status 200 and this structure:
{
"signature": "01234567890abcdef..."
}
The canonical request matches the format defined by AWS for signatures
and contains all the required information (path, host, operation...) to
let the proxy decide if the request is allowed and can be signed.
Changes:
- Added AWSAuthV4DelegatedSigner class that delegates signature
generation to an external HTTP service
- Updated S3Client to use AWSAuthSignerProvider-based constructor
(compatible with newer AWS SDK)
- Added signature_delegation_url configuration parameter to S3 disk
settings
- Updated AWS SDK submodule to aiven/clickhouse-v25.8.12.129 branch
which includes the necessary SDK changes for delegated signatures
- Updated all S3 client creation sites to pass the new parameter
This commit was adapted from the original patch to work with the newer
AWS SDK version (1.7.321) used in ClickHouse v25.8.12.129, which
requires using AWSAuthSignerProvider instead of the older constructor
signature.
Co-authored-by: Kevin Michel <[email protected]>
The process is accessed over HTTP and the URL to access it can be
configured with the signature_delegation_url parameter in the Azure
disk configuration. ClickHouse will make a POST request with a JSON body:
{
"stringToSign": "..."
}
And expects a JSON response with status 200 and this structure:
{
"signature": "01234567890abcdefg..."
}
The string to sign matches the format defined by Azure for signatures.
The canonical request also contains all the required information
(path, operation...) to let the proxy decide if the request is
allowed and can be signed. It's also enough information to know
the cost of the request.
This commit was applied from the patch file 0067-Delegated_signature_azure.patch
and adapted to work with ClickHouse v25.8.12.129 codebase structure.
Changes include:
- Updated Azure SDK submodule to aiven/azure-sdk-for-cpp fork
with branch aiven/clickhouse-v25.8.12.129 containing the SDK
modifications (virtual GetSignature method in SharedKeyPolicy)
- Added AzureDelegatedKeyPolicy class that extends SharedKeyPolicy
to delegate signature generation via HTTP POST requests
- Added account_name and signature_delegation_url fields to
RequestSettings for configuration
- Added delegated_signature flag to ConnectionParams to control
client creation behavior
- Updated all ConnectionParams creation sites to include the
delegated_signature flag
- Modified getClientOptions to inject AzureDelegatedKeyPolicy into
PerRetryPolicies when signature delegation is enabled
Co-authored-by: Kevin Michel <[email protected]>
This commit enforces SSL/TLS encryption for all MySQL protocol connections to ClickHouse. Previously, clients could connect without SSL and communicate in plaintext. Now, any connection attempt without SSL is rejected with an error message. When a client attempts to connect without SSL: - The connection is immediately rejected - An error packet (MySQL error code 3159) is sent with a clear message - The connection is closed Error message: "SSL support for MySQL TCP protocol is required. If using the MySQL CLI client, please connect with --ssl-mode=REQUIRED." This is a security hardening change that ensures all MySQL protocol traffic is encrypted, protecting credentials and data in transit. This helps meet compliance requirements and prevents man-in-the-middle attacks. This commit was applied from the patch file 0071-MySQL_enforce_SSL.patch Co-authored-by: Joe Lynch <[email protected]>
This commit enforces HTTPS/TLS encryption for all HTTP/HTTPS connections in two ClickHouse features: 1. HTTP Dictionary Sources - dictionaries loaded from remote URLs 2. URL Storage - tables that read/write data from remote URLs Previously, both features allowed unencrypted HTTP connections, which exposed credentials, queries, and data to potential interception. Now, any attempt to use HTTP (non-HTTPS) URLs is rejected with a clear error message. Changes include: - Changed HTTPDictionarySource::Configuration::url from std::string to Poco::URI for better type safety and scheme validation - Added HTTPS validation in HTTPDictionarySource registration: throws UNSUPPORTED_METHOD error if scheme is not "https" - Added HTTPS validation in StorageURL constructor: throws BAD_ARGUMENTS error if scheme is not "https" - Simplified code by removing redundant Poco::URI object creations and using configuration.url directly throughout Security impact: - Prevents unencrypted transmission of dictionary data, table data, credentials, and query parameters - Protects against man-in-the-middle attacks - Helps meet compliance requirements (PCI-DSS, HIPAA, GDPR) - Fails fast with clear error messages if HTTP is used Breaking change: Existing configurations using http:// URLs will fail and must be updated to use https:// URLs. This commit was applied from the patch file 0091-HTTP-dictionary-source.patch Co-authored-by: Joe Lynch <[email protected]>
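Illustrative example of the new behavior (the URL is a placeholder):
-- Accepted: HTTPS URL
CREATE TABLE remote_data (line String)
ENGINE = URL('https://example.com/data.tsv', TabSeparated);
-- Rejected after this change: the same definition with an http:// URL throws
-- BAD_ARGUMENTS (and an HTTP dictionary source throws UNSUPPORTED_METHOD).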
This commit enables a special user (configured via
user_with_indirect_database_creation setting) to create and drop
Replicated databases through SQL with automatic parameter setup and
privilege management.
Main changes:
1. Allow the special user to call CREATE DATABASE and DROP DATABASE; the
query is overridden during the execution phase, ensuring the correct
parameters are passed and running it with elevated privileges.
2. Introduce GRANT DEFAULT REPLICATED DATABASE PRIVILEGES statement,
which grants a default set of privileges for a database. This
maintains a common source of truth while creating databases from
different environments.
3. Add server settings:
- reserved_replicated_database_prefixes: Prohibits database names
with certain prefixes
- user_with_indirect_database_creation: User allowed simplified
database creation
- cluster_database: Reference database for cluster operations
4. Enhance DatabaseReplicated to store shard_macros for reuse when
creating new databases.
5. Enforce ON CLUSTER requirement for non-admin users when dropping
databases to ensure complete removal across the cluster.
Technical details:
- Added createReplicatedDatabaseByClient() to handle automatic database
creation with proper cluster configuration
- Added checkDatabaseNameAllowed() to validate database names
- Modified executeDDLQueryOnCluster() to support skipping distributed
checks for internal operations
- Added setGlobalContext() to Context for executing queries with
elevated privileges
[DDB-1615] [DDB-1839] [DDB-1968]
CHECK TABLE requires an explicit grant since e96e0ae. This commit adds CHECK to the default privileges granted when using GRANT DEFAULT REPLICATED DATABASE PRIVILEGES, ensuring users can run CHECK TABLE on tables in replicated databases without requiring an additional grant. The change adds "CHECK, " to the privilege list in InterpreterGrantQuery::execute() when handling default replicated database privileges. Co-authored-by: Aliaksei Khatskevich <[email protected]>
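A hedged sketch of the statement in use; the exact ON/TO clause shape is an assumption, and the database and user names are illustrative:
-- Grants the curated default privilege set (now including CHECK) on a
-- replicated database to a user.
GRANT DEFAULT REPLICATED DATABASE PRIVILEGES ON mydb.* TO app_user;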
This commit adds comprehensive SSL/TLS configuration capabilities for
PostgreSQL and MySQL database connections in ClickHouse, along with a
security fix for the MariaDB connector.
Changes:
1. MariaDB Connector/C Security Fix:
- Updated submodule to aiven/mariadb-connector-c fork
- Fixed X509_check_host call to include hostname length parameter
- Prevents potential certificate validation bypass vulnerabilities
2. PostgreSQL SSL Configuration:
- Added SSLMode enum (DISABLE, ALLOW, PREFER, REQUIRE, VERIFY_CA, VERIFY_FULL)
- Added server settings:
* postgresql_connection_pool_ssl_mode (default: PREFER)
* postgresql_connection_pool_ssl_root_cert (default: empty)
- Updated PoolWithFailover to accept SSL mode and CA certificate path
- Modified formatConnectionString to include sslmode and sslrootcert parameters
- Integrated SSL settings across all PostgreSQL integration points:
* DatabasePostgreSQL
* DatabaseMaterializedPostgreSQL
* StoragePostgreSQL
* StorageMaterializedPostgreSQL
* TableFunctionPostgreSQL
* PostgreSQLDictionarySource
3. MySQL SSL Configuration:
- Added MySQLSSLMode enum (DISABLE, PREFER, VERIFY_FULL)
- Updated Connection, Pool, and PoolWithFailover classes to accept SSL mode
- Added ssl_mode and ssl_root_cert to StorageMySQL::Configuration
- Enhanced MySQL dictionary source to support ssl_mode in named collections
- Integrated SSL settings in MySQLHelpers and StorageMySQL
Security Benefits:
- Enables encrypted connections to prevent data interception
- Supports certificate validation to prevent man-in-the-middle attacks
- Provides flexible SSL mode selection for different security requirements
- Fixes critical certificate hostname validation bug in MariaDB connector
The changes maintain backward compatibility with default SSL mode set to
PREFER, which attempts SSL but falls back gracefully if unavailable.
Co-authored-by: Joe Lynch <[email protected]>
Remote calls (like `azureBlobStorageCluster`) format the `AST` into a `String` using `formatWithSecretsOneLine`. This function wipes sensitive data, which can lead to incorrect remote calls. This commit makes wiping optional and stops it during remote calls.
Add an extra flag to each user, called "Protected". Users with the "Protected" flag can only be created, altered, removed, or have privileges granted/revoked by users with the extra `PROTECTED_ACCESS_MANAGEMENT` privilege. This privilege is only meant to be granted to our internal admin user, and the "Protected" flag will be given to the main service user, allowing it to exist as an SQL user (which is necessary to have fewer privileges than hardcoded XML users), but not be removable or alterable by cluster users. Changes: - Added `protected_flag` field to User entity and `isProtected()` method to IAccessEntity interface - Added `PROTECTED_ACCESS_MANAGEMENT` access type for controlling operations on protected users - Added `PROTECTED` keyword to SQL parser for CREATE USER statements - Updated all access storage implementations (Disk, Memory, Multiple, Replicated, AccessControl) to support CheckFunc parameter for validation during insert/remove operations - Updated interpreters to enforce PROTECTED_ACCESS_MANAGEMENT privilege: - InterpreterCreateUserQuery: Check when creating/altering protected users - InterpreterDropAccessEntityQuery: Check when dropping protected users - InterpreterMoveAccessEntityQuery: Check when moving protected users - InterpreterGrantQuery: Check when granting/revoking to protected users - Updated User equality comparison to include protected_flag - Updated SHOW CREATE USER to display PROTECTED keyword when applicable API Adaptations: - Vector-based insert/insertOrReplace methods don't support CheckFunc in this ClickHouse version, so manual checks were added before calling these methods - ContextAccess::checkGranteeIsAllowed signature differs from original patch (doesn't accept ContextPtr), so the protected check is enforced at the interpreter level instead Co-authored-by: Kevin Michel <[email protected]>
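A hedged sketch of the SQL surface (the user name and password are illustrative; the position of the PROTECTED keyword in CREATE USER is an assumption):
CREATE USER service_user IDENTIFIED WITH sha256_password BY 'secret' PROTECTED;
-- Only a user holding the PROTECTED_ACCESS_MANAGEMENT privilege may create,
-- alter, drop, or change grants of a protected user; SHOW CREATE USER
-- displays the PROTECTED keyword for such users.
SHOW CREATE USER service_user;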
We need to keep track of existing named collections. In order to do that, we wish to add some metadata to each collection. The metadata keys are added as optional collection parameters for the storages and functions that validate the keys. This change allows `integration_id` and `integration_hash` keys to be present in named collections without triggering validation errors. These metadata keys are whitelisted in the validation function, allowing integrations to track which collections they own or manage without breaking existing validation logic. Changes: - Added whitelist check for `integration_id` and `integration_hash` keys in validateNamedCollection() function - These keys are now silently ignored during validation, allowing them to be stored in named collections without being listed as required or optional keys Co-authored-by: Aris Tritas <[email protected]>
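Illustrative example of a collection carrying the whitelisted metadata keys (collection name and values are placeholders):
CREATE NAMED COLLECTION my_s3_integration AS
    access_key_id = 'AKIAEXAMPLE',
    secret_access_key = 'examplesecret',
    integration_id = 'a1b2c3',
    integration_hash = 'deadbeef';
-- integration_id and integration_hash are ignored by validation instead of
-- being rejected as unknown keys.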
Collaborator
Great job. Thanks for describing the process and progress in detail.
* Allow using DDL created named collection for PostgreSQL dictionary. Currently this fails upstream. * Ensure that TLS works. * Enforce using named collection. This patch simplifies the PostgreSQL dictionary source implementation by removing support for config file-based configuration and enforcing the use of named collections. This improves consistency, security, and enables proper TLS/SSL support for dictionary sources. Changes: - Removed validateConfigKeys() function and all config file parsing logic - Enforced named collection requirement (throws UNSUPPORTED_METHOD if not provided) - Simplified dictionary registration to use StoragePostgreSQL::processNamedCollectionResult() - Added overloaded processNamedCollectionResult() method that accepts additional_allowed_args - Changed pool creation from replicas_by_priority to single common_configuration - Removed replica support from config files (only works with named collections now) Breaking change: Users using config file-based PostgreSQL dictionary configuration must migrate to named collections. The old configuration method is no longer supported. Co-authored-by: Joe Lynch <[email protected]>
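A hedged sketch of a PostgreSQL dictionary defined over a DDL-created named collection (all names and credentials are illustrative; using the `name` key in the source clause to reference the collection is an assumption):
CREATE NAMED COLLECTION my_pg AS
    host = 'pg.example.com', port = 5432,
    user = 'dict_reader', password = 'secret',
    database = 'appdb';
CREATE DICTIONARY ids_dict (id UInt64, value String)
PRIMARY KEY id
SOURCE(POSTGRESQL(name my_pg table 'ids'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);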
This is a convenience patch to avoid polluting logs on developer laptops with GPU drivers that advertise non-working thermal sensors. The patch disables initialization of EDAC (Error Detection And Correction) and hardware monitoring chip sensors by removing the calls to openEDAC() and openSensorsChips() from the AsynchronousMetrics constructor. Co-authored-by: Kevin Michel <[email protected]>
This patch fixes getTCPPortSecure() to read the port from the registered server ports map instead of directly from the configuration file. This ensures it works correctly when the port is obtained from ZooKeeper or other dynamic sources, not just from static configuration. The patch also improves thread safety by adding proper mutex protection for the server_ports map and introduces a new tryGetServerPort() helper method for non-throwing port lookups. Co-authored-by: Kevin Michel <[email protected]>
This patch disables the ThreadFuzzer pthread wrapping feature by always setting THREAD_FUZZER_WRAP_PTHREAD to 0, regardless of platform or sanitizer settings. The pthread wrapping feature has compatibility issues with newer glibc versions (especially glibc 2.36+) and is a testing feature that should not be enabled in production builds. This patch simplifies the code by removing conditional compilation logic and ensuring consistent behavior across all platforms. Co-authored-by: Kevin Michel <[email protected]>
This patch removes the default registration of the /replicas_status HTTP endpoint to reduce the default attack surface and prevent exposure of replication state information by default. The /replicas_status endpoint provides information about the status of replicated MergeTree tables, including replication lag and detailed state information. While useful for monitoring, this information should not be exposed by default for security and privacy reasons. Co-authored-by: Kevin Michel <[email protected]>
Even with low swappiness, sometimes ClickHouse pages go to swap. By itself, that is not a problem; unused pages can move to swap without any issue. However, this interacts badly with the memory tracker of ClickHouse. The memory tracker is in charge of limiting the total memory usage of the server without interrupting the service (unlike the OOM killer or an earlyOOM daemon). Instead, individual queries are delayed or rejected until more memory is available. The tracker normally only looks at the RSS of the process, which excludes the swap. If part of ClickHouse memory is moved to swap, then the tracker stops counting that memory towards the limit and allows more memory allocations. Then a little bit more memory is moved to swap, and this cycle keeps going until all swap is consumed. With a low-swappiness system configuration, it can take days or weeks before ClickHouse consumes all swap. At that point, ClickHouse is still below its configured maximum RAM usage. But if some swapped page suddenly needs to move back to RAM, then we have a real problem: there is not enough room in RAM because all processes were allocated a portion of it, and there is no room in swap because ClickHouse slowly consumed everything. We have to move things around, but there is no room to move anything: it's swap hell. Fix it by taking into account the swapped memory of the process in addition to its RSS. To get this data, ClickHouse needs to be modified to look at `/proc/self/status` instead of `/proc/self/statm` and adapt to the different file format (statm does not contain swap information). As a bonus, the swapped memory is exposed as a metric like the other memory metrics already present. Impact: - Memory tracker now accounts for swap memory, preventing swap drift - Prevents swap exhaustion and "swap hell" scenarios - Memory limits are correctly enforced including swap usage - MemorySwap metric available for monitoring - More accurate memory tracking in systems with swap enabled Co-authored-by: Kevin Michel <[email protected]>
When a replicated table is created and then deleted, the immediate parent ZooKeeper znode may remain empty and not be cleaned up, causing a node leak in ZooKeeper. This can lead to accumulation of orphaned empty znodes over time, polluting the ZooKeeper namespace. The existing `dropAncestorZnodesIfNeeded()` method in `TableZnodeInfo` removes ancestor znodes from the table path up to `path_prefix_for_drop`, but it may not handle the immediate parent znode in all cases. Fix by adding a new method `dropAncestorTableZnodeIfNeeded()` that specifically removes the immediate parent znode of the table path if it becomes empty after table deletion. This complements the existing cleanup logic and ensures no orphaned znodes are left behind. Co-authored-by: Joe Lynch <[email protected]>
The CPU validation check for ARMv8.1+ was requiring both `atomic` and `ssbs` features to be present in /proc/cpuinfo. However, SSBS (Speculative Store Bypass Safe) is optional in ARMv8.0 and only mandatory in ARMv8.5, which means some valid ARMv8.1 CPUs may not have this feature. This caused false rejections during build validation, preventing builds on legitimate ARMv8.1+ CPUs that don't have SSBS. The `atomic` feature (Large System Extensions - LSE) is a sufficient indicator of ARMv8.1+ support and doesn't have the same optionality issues. Remove the SSBS requirement from the validation check, keeping only the `atomic` check. This allows ARMv8.1 CPUs without SSBS to pass validation while still ensuring the build machine can run intermediate binaries (protoc, llvm-tablegen) that require ARMv8.1+ features.
Make IPv6 support in curl configurable via CMake option, enabled by default. Previously, IPv6 support was hardcoded in curl_config.h, making it impossible to disable IPv6 at build time. Add a CMake option `ENABLE_IPV6` that controls IPv6 support in curl, with a default value of 1 (enabled). This maintains the current behavior while providing the flexibility to disable IPv6 for environments that don't support it or have specific requirements.
Add compile-time flags to enable/disable individual dictionary sources, allowing builds to exclude specific sources to reduce dependencies, binary size, and attack surface. This provides flexibility for creating minimal builds or excluding dictionary sources that are not needed in specific deployments. Previously, all dictionary sources were always compiled and registered, regardless of whether they were needed. This patch adds conditional compilation guards around each dictionary source registration, controlled by CMake defines that default to enabled (maintaining backward compatibility). Co-authored-by: Joe Lynch <[email protected]>
When adding a replica to an existing cluster, the replica will add many GET_PART tasks to its replication queue. These tasks are in charge of downloading the data that existed before the creation of the replica. Meanwhile, if the cluster is still receiving updates, new GET_PART tasks will be created to add the extra data. These new parts are usually small, and in a cluster where the inserts are a bit too frequent, we can have a lot of them. However, there is a limit to the number of GET_PART tasks that a replica will simultaneously execute (by default 8). The early GET_PART tasks, for existing data, are often very large and can take a lot of time to execute. If there are more existing parts than the maximum number of simultaneous downloads, then the early parts will take all slots in the pool for download tasks and the small additional GET_PART tasks will have to wait for the very large early parts to be downloaded. The accumulation of many small GET_PART tasks then becomes an issue for ZooKeeper: each of these tasks is a znode, and it's quite easy to end up with >100k pending tasks for a single table. We cannot simply increase the download pool size. A large cluster will always have more existing large parts than the pool size, unless we configure the pool with an unreasonably large size. However, we can split the pool into two parts: one pool for the existing parts that are added early to sync the replica, and another pool for the normal GET_PART tasks that happen because of normal inserts. Co-authored-by: Kevin Michel <[email protected]>
During a maintenance upgrade, we do not wait for the completion of merges and mutations since they are not required to make sure we have all the data from the previous nodes. We do wait for `GET_PART` and similar tasks, since we need the new nodes to get parts from the old nodes. This is implemented using `SYSTEM SYNC REPLICA ... LIGHTWEIGHT`. Some merges and mutations prevent the execution of `GET_PART` tasks that overlap the range of the merge or the mutation. So, even if we are not waiting for `MERGE_PARTS` tasks, these tasks can slow down the completion of a maintenance upgrade. We need to execute `MERGE_PARTS` tasks to avoid having too many parts in the same partition or table. We also need to execute the TTL delete rules and ensure disk usage does not grow too much (TTL deletes are implemented as a subtype of the merge tasks). A possible tradeoff is to only execute merge tasks if they are not too large; the small merges are the ones that keep the number of parts low when there are many small INSERT queries. We can't use the normal merge tree settings like `max_bytes_to_merge_at_min/max_space_in_pool` because they would be applied with `ALTER TABLE`, which would step onto user-managed objects and become persisted as part of the table definition. They are also replicated, so we can't set them only on new nodes. The patch implements per-server overrides that we can use to limit the size of merge and mutate tasks, both when a node creates new tasks and when it decides which task to execute. This is only usable with a `LIGHTWEIGHT` sync: if we don't execute some tasks, we also need to not wait for them, or we would wait forever. `SYNC REPLICA` with the `LIGHTWEIGHT` flag allows us to not wait for these tasks. This commit was applied from the patch file 0078-Global_merge_and_mutate_override.patch Co-authored-by: Kevin Michel <[email protected]>
1. Fix sharding race condition with zero copy. Locks are shared between
parts with the same name on different shards. This can lead to a race
condition when committing the same part on different shards for the
znode `/clickhouse/zero_copy/zero_copy_s3/{uuid}/all_0_0_0`. The fix
is to create this znode before committing the part and using
createIfNotExists style operations. Especially prevalent on SYSTEM
RESTORE REPLICA on the first partition.
2. Remove zero copy replication error when dropping table - occurs when
Astacus has left files in the table directory after a failed restore.
Changes:
- Added zero copy lock znode pre-creation in ReplicatedMergeTreeSink.cpp
before part commit using createAncestors() and createIfNotExists()
- Removed error check in MergeTreeData::dropAllData() that blocked table
drop when directory was not empty
- Added extern declaration for allow_remote_fs_zero_copy_replication in
ReplicatedMergeTreeSink.cpp
The zero copy lock pre-creation prevents race conditions when multiple
shards commit parts with the same name simultaneously. The createIfNotExists()
operation is idempotent, allowing safe concurrent execution across shards.
Removing the strict validation in dropAllData() allows tables to be dropped
even when backup/restore tools leave leftover files, as removeRecursive()
will clean them up anyway.
Co-authored-by: Joe Lynch <[email protected]>
Fix refreshable materialized views where there is a shard macro in the target table
Refreshable materialized views use ZooKeeper coordination paths that are
expanded from server settings like default_replica_path and default_replica_name.
These paths can contain macros such as {shard}, {database}, {table}, and {replica}
that need to be expanded to actual values.
When a refreshable materialized view is created in a DatabaseReplicated database,
the coordination path may contain the {shard} macro. However, the macro expansion
was not including the shard name in MacroExpansionInfo, causing the {shard} macro
to remain unexpanded in the coordination path.
This fix:
- Retrieves the database from DatabaseCatalog
- Checks if it's a DatabaseReplicated database
- If so, sets info.shard to the shard name from the database
- This ensures {shard} macros are properly expanded in coordination paths
Without this fix, refreshable materialized views in DatabaseReplicated databases
would fail to coordinate correctly across replicas when the coordination path
contains shard macros, leading to incorrect ZooKeeper paths and coordination
failures.
Co-authored-by: Joe Lynch <[email protected]>
Allow refreshable materialized views when using ZooKeeper (not only ClickHouse Keeper)
Refreshable materialized views previously required the MULTI_READ feature which
is only available in ClickHouse Keeper, not in standard Apache ZooKeeper. This
prevented users with ZooKeeper clusters from using refreshable materialized views.
The MULTI_READ feature provides atomic operations for reading multiple ZooKeeper
paths simultaneously. However, for refreshable materialized views, this atomicity
is not strictly necessary:
1. Znode creation: The `running` znode uses ephemeral mode, which means only one
replica can create it at a time, providing natural coordination. The other
persistent znodes (coordination path, replicas directory, paused znode) can be
created independently without atomicity concerns.
2. Znode reading: The only multi-read operation reads three paths:
`coordination.path`, `coordination.path + "/running"`, and
`coordination.path + "/paused"`. Minor inconsistencies in these reads are
acceptable:
- `running` znode: Ephemeral, checked separately anyway
- `paused` znode: Timing issues are acceptable (may miss early refresh or
refresh once after pause)
This patch removes the MULTI_READ requirement by:
- Replacing atomic `multi(ops)` with async `asyncTryCreateNoThrow()` calls
- Removing MULTI_READ checks in constructor and readZnodesIfNeeded
- Using the same async pattern already used elsewhere in ClickHouse
(e.g., StorageReplicatedMergeTree)
The asyncTryCreateNoThrow() method works with both ZooKeeper and ClickHouse
Keeper, enabling refreshable materialized views to work with either coordination
service. Each create operation is handled independently, with ZNODEEXISTS errors
gracefully handled (idempotent operations).
Changes:
- Commented out unused `attach` parameter in RefreshTask constructor
- Removed MULTI_READ feature check in constructor (lines 119-121)
- Replaced `multi(ops)` with `asyncTryCreateNoThrow()` futures pattern
- Removed MULTI_READ feature check in readZnodesIfNeeded (lines 945-946)
- Added error handling for individual async create operations
This enables refreshable materialized views to work with standard ZooKeeper
clusters without requiring migration to ClickHouse Keeper, providing better
deployment flexibility.
Co-authored-by: Aliaksei Khatskevich <[email protected]>
Co-authored-by: Joe Lynch <[email protected]>
This commit adds CMake configuration flags to disable specific table engines and table functions at compile time, allowing builds with a reduced feature set. This enables: 1. Security: Disable risky engines/functions (e.g., Executable, URL, File, Remote) 2. Compliance: Remove features that don't meet regulatory requirements 3. Minimal Builds: Create smaller binaries by excluding unused features 4. Cloud Deployments: Optimize for cloud environments The implementation adds two-level checks: - Dependency check: Is the library available? (e.g., USE_MONGODB) - Registration flag: Should the engine/function be registered? (e.g., REGISTER_MONGODB_TABLE_ENGINE) Both conditions must be true for an engine/function to be registered. If the registration flag is not set in CMake, it defaults to enabled (backward compatible). Changes: - Added 33 CMake flags in config.h.in (17 for engines, 16 for functions) - Updated registerStorages.cpp to check registration flags for: Azure Blob Queue, Azure Blob, Executable, File, FileLog, Iceberg, KeeperMap, MongoDB, MySQL, NATS, ODBC, Redis, S3 Queue, S3, URL - Updated registerStorageObjectStorage.cpp for Azure Blob Storage - Updated registerTableFunctions.cpp to check registration flags for: Executable, File, MongoDB, ODBC, Redis, Hive, URL, URL Cluster - Updated TableFunctionRemote.cpp to conditionally register "remote" function Usage: cmake -DREGISTER_MONGODB_TABLE_ENGINE=OFF ... cmake -DREGISTER_URL_FUNCTION=OFF ... cmake -DREGISTER_EXECUTABLE_TABLE_ENGINE=OFF ... Co-authored-by: Kevin Michel <[email protected]> Co-authored-by: Joe Lynch <[email protected]> Co-authored-by: Aliaksei Khatskevich <[email protected]>
Also add registration flags REGISTER_TIMESERIES_TABLE_ENGINE, REGISTER_TIMESERIES_FUNCTION, REGISTER_OBJECT_STORAGE_TABLE_ENGINE, REGISTER_OBJECT_STORAGE_FUNCTION, and REGISTER_DATALAKE_FUNCTION
The constraints were previously only checked when executing the query from the replicated DDL queue. At that point the DDLWorker was using the system profile for settings constraints. This meant that a non-admin user, using the default profile, could use a MergeTree setting that was not allowed by its profile. Fix that by checking the MergeTree settings constraints before the query is enqueued, when we still have the query context attached to the user running the query. We check both Replicated*MergeTree and apparently non-replicated MergeTree because this check happens before we rewrite the query to enforce Replicated*MergeTree engine types. This commit fixes a security vulnerability where users could bypass profile-based settings restrictions by submitting DDL queries to DatabaseReplicated. The fix ensures that: 1. Constraints are checked early (before enqueuing) when user context is still available 2. User's profile restrictions are properly enforced, not system profile 3. Both CREATE TABLE and ALTER TABLE SETTING queries are validated 4. All MergeTree variants are covered (before query rewrite) Co-authored-by: Kevin Michel <[email protected]> Co-authored-by: Aliaksei Khatskevich <[email protected]>
This commit adds a `read_only` setting to KeeperMap storage that prevents write operations (INSERT, UPDATE, DELETE, TRUNCATE, ALTER) when enabled. This is useful for creating read-only views of KeeperMap data or preventing accidental modifications in production environments. The implementation follows the BaseSettings framework pattern used by MergeTreeSettings, ensuring consistency with the ClickHouse codebase. Co-authored-by: Salvatore Mesoraca <[email protected]> Co-authored-by: Aliaksei Khatskevich <[email protected]>
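A hedged sketch (the ZooKeeper path and table name are illustrative; placing the setting in the trailing SETTINGS clause is an assumption):
CREATE TABLE km_config (key String, value String)
ENGINE = KeeperMap('/km_config')
PRIMARY KEY key
SETTINGS read_only = 1;
-- INSERT, UPDATE, DELETE, TRUNCATE, and ALTER against this table are rejected
-- while read_only is enabled.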
This patch adds a new `named_collection` column to the `system.tables` system table, which displays the name of the named collection (if any) that was used to create each table. This enables users to track which named collections are associated with their tables, particularly useful for external storage engines like MySQL, PostgreSQL, S3, and Azure Blob Storage. Co-authored-by: Aliaksei Khatskevich <[email protected]>
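Example query using the new column (the database filter is illustrative):
SELECT name, engine, named_collection
FROM system.tables
WHERE database = currentDatabase();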
The metadata_version.txt file was added to each part folder to help with concurrency issues when mutating the table schema, but it was not included in the frozen files when creating backups. We need to include this file in the frozen shadow/ folder because we use these files to know what to backup. When using object storage, this file is actually a pointer to the file in object storage. We use these pointers to know which files are still referenced by ClickHouse and should not be deleted. By omitting this file, we would not know about it, and delete the metadata_version.txt file in object storage. This was causing latent issues: ClickHouse doesn't immediately notice the missing file, but instead complains loudly when adding a new replica - the new replica tries to download this file when syncing from existing replicas. The fix sets keep_metadata_version = true in ClonePartParams when freezing parts, ensuring the metadata_version.txt file is preserved in frozen backups. Co-authored-by: Kevin Michel <[email protected]>
Removed: - Cloud mode flags and settings - Distributed cache settings - SharedMergeTree settings - Filesystem cache and cache warmer settings - Experimental cloud features - Cloud-specific MergeTree part storage settings - All cloud references from settings history Co-authored-by: Joe Lynch <[email protected]> Co-authored-by: Aliaksei Khatskevich <[email protected]>
Enforce that ReplicatedMergeTree tables must use default ZooKeeper path and replica name from server settings. Disallows any custom values to ensure consistent replication configuration. Co-authored-by: Dmitry Potepalov <[email protected]>
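Illustrative example (the table name is a placeholder):
-- Allowed: rely on default_replica_path / default_replica_name from server settings.
CREATE TABLE t_repl (id UInt64) ENGINE = ReplicatedMergeTree ORDER BY id;
-- Rejected after this change: explicit, non-default path/replica arguments such as
-- ENGINE = ReplicatedMergeTree('/custom/path/t_repl', 'replica1').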
If both the certificate file and key file are defined, even when they are not used, ClickHouse will try to read and parse them. We can't even work around the issue by creating empty files. There were multiple iterations of improvements for this patch. The latest approach to the problem is to make sure there's one successful call to `tryLoad` before calling `tryReloadAll`. During the research it was noticed that `tryLoad` loads exactly one prefix `"openSSL.server."` and there's no place in the code that adds another prefix (at least in our use case). This means that `tryReloadAll` acts effectively the same as `tryLoad`. This commit was applied from the patch file 0064-Lazy_certificates.patch [DDB-1898] Co-authored-by: Kevin Michel <[email protected]> Co-authored-by: Joe Lynch <[email protected]> Co-authored-by: Aris Tritas <[email protected]> Co-authored-by: Tilman Möller <[email protected]>
`alter table` code is detached from `create table` code, which makes it necessary to copy field initialization logic. This commit makes `alter table` produce the same `sorting key` ZooKeeper metadata as `create table`. This commit was applied from the patch file 0087-Fix-alter-order-by.patch Co-authored-by: Aliaksei Khatskevich <[email protected]>
Remote: * Allow use of addresses_expr in named collection configuration. * Enforce `secure` parameter for ClickHouse dictionary source when not using named collections. Local: * New server setting `dictionary_user` to specify which user to run the dictionary query, rather than just the default user (since we don't have a default user) This commit was applied from the patch file 0089-Dictionary-source-ClickHouse.patch Co-authored-by: Joe Lynch <[email protected]>
Patches
Core replication (patch group 1)
(1) Advertise host from config for replicated databases
(2) Enable internal replication for DatabaseReplicated clusters
(5) Enable ALTER DATABASE MODIFY SETTING for Replicated databases
(8) Replace MergeTree with ReplicatedMergeTree in Replicated databases
(14) Tolerate ZooKeeper restart with increased retries and exponential backoff
(15) Fix ClickHouse restart with replicated tables containing {shard} macro
(16) Add missing settings to recoverLostReplica
(23) Fix unbounded replication queue growth
(37) Replicate ALTER TABLE MOVE PARTITION queries through DatabaseReplicated
(55) Change default logs_to_keep from 1000 to 300 for DatabaseReplicated
Security features (patch group 2)
(6) Added support for protected users
(13) Restrict SHOW CREATE DATABASE access
(20) Allow custom CA certificate path for S3 connections
(22) Allow custom CA certificate path for Azure Blob Storage connections
(26) Fix default profile escape vulnerability
(32) Allow delegating S3 signature to a separate process
(33) Allow delegating Azure signature to a separate process
(35) Enforce SSL in the MySQL handler
(45) Enforce HTTPS for URL storage and HTTPDictionarySource
(54) Allow avnadmin to create databases using SQL
(57) Add CHECK TABLE to default privileges
(58) Add SSL/TLS configuration support for PostgreSQL and MySQL connections
(61) Stop wiping secrets from remote calls
S3/Azure
(19) Fix IPv6 S3 object storage host
(21) IPv6 Azure object storage host
(24) Add support for Azure object storage path prefix
(28) Add Backup disk type
(34) Fix uncaught exception if S3 storage fails
(51) Skip attempt to create a container in Azure Blob Storage
(53) Fix use after move in Azure new settings application
Kafka integration
(12) Add Kafka configuration support for SASL and SSL settings
(25) Allow decreasing number of Kafka consumers to zero
(29) Support per-table schema registry with authentication
(30) Add kafka_auto_offset_reset and kafka_date_time_input_format settings
(56) Add extra settings to Kafka Table Engine
PostgreSQL/MySQL
(11) Unlock PostgreSQL database
(36) Add support for integration metadata to named collections validation
(58) Multiple changes in PostgreSQL dictionary
Zookeeper and infrastructure
(3) Ignore unreadable sensors
(4) Fix tcp_port_secure from ZK
(10) Disable thread fuzzer
(17) Disable replicas_status endpoint
(18) Fix swap drift
Dictionaries & Collections
(41) Allow disabling of individual dictionary sources
Performance & Optimization
(38) Add early fetch pool
(39) Add per-server override for max bytes to merge/mutate
(48) Zero copy fixes
(64) bloomfilter: use libdivide to compute the bit location
(67) set enable_job_stack_trace=0 to avoid overhead
Materialized Views
(65) Fix refreshable mat views where there is a shard macro in the target table
(66) Allow refreshable materialized views when using ZooKeeper
Engine features
(7) Added support for disabling various table engines and table functions
(27) Check MergeTree settings constraints before enqueuing DDL queries
(49) Add read-only setting to KeeperMap storage
(63) Add named_collection column to system.tables
Misc
(40) Include metadata_version.txt when freezing
(46) Remove all cloud-specific settings
(9) Disallow replication parameters customization
(31) Fix ClickHouse trying to read non-existent certificate files
(42) Fix alter order by
(43) Changes for ClickHouse dictionary source, both remote and local.
Appendix