Enables a more general join-planning algorithm that can handle more complex join conditions, but it works only with hash join. If hash join is not enabled, the usual join-planning algorithm is used regardless of the value of this setting.
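As a sketch of how this might be used (the setting name `allow_general_join_planning` and the `join_algorithm` value are assumptions based on ClickHouse conventions, not stated above):

```sql
-- Hypothetical setting name; the text above does not name the setting.
SET allow_general_join_planning = 1;

SELECT *
FROM t1
JOIN t2 ON t1.a = t2.a AND t1.b < t2.b   -- a more complex join condition
SETTINGS join_algorithm = 'hash';        -- the setting only takes effect with hash join
```

With any non-hash join algorithm, the query would fall back to the usual join planning.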
Controls how data is split into tasks when executing a cluster table function.
This setting defines the granularity of work distribution across the cluster:

- `file` — each task processes an entire file.
- `bucket` — tasks are created per internal data block within a file (for example, Parquet row groups).

Choosing finer granularity (like `bucket`) can improve parallelism when working with a small number of large files. For instance, if a Parquet file contains multiple row groups, enabling `bucket` granularity allows each group to be processed independently by different workers.
Defines the approximate size of a batch (in bytes) used in distributed processing of tasks in cluster table functions with `bucket` split granularity. The system accumulates data until at least this amount is reached. The actual size may be slightly larger to align with data boundaries.
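To make this concrete, a hedged sketch of enabling row-group-level tasks with an explicit batch size; both setting names below are assumptions, since the text above does not name them, and `s3Cluster` is used as a representative cluster table function:

```sql
-- Both setting names are hypothetical placeholders.
SELECT count()
FROM s3Cluster(
    'default',                                       -- cluster name
    'https://bucket.example.com/data/*.parquet',
    'Parquet')
SETTINGS
    cluster_function_split_granularity = 'bucket',   -- one task per Parquet row group
    cluster_function_split_batch_size = 134217728;   -- accumulate ~128 MiB before dispatching
```

Because batches are aligned to data boundaries (whole row groups), each actual batch may be slightly larger than the configured byte size.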
When creating a `Merge` table without an explicit schema, or when using the `merge` table function, the schema is inferred as a union of the schemas of at most the specified number of matching tables.
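For illustration (a sketch: the setting name `merge_table_max_tables_to_look_for_schema_inference` is an assumption based on ClickHouse naming conventions, not stated above):

```sql
-- Hypothetical setting name: caps how many matching tables are unioned
-- when inferring the schema of a Merge table created without one.
SET merge_table_max_tables_to_look_for_schema_inference = 1000;

-- No explicit schema: it is inferred from tables in db matching '^events_'.
CREATE TABLE events_all ENGINE = Merge('db', '^events_');
```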