diff --git a/fluss-flink/fluss-flink-common/src/main/java/com/alibaba/fluss/flink/FlinkConnectorOptions.java b/fluss-flink/fluss-flink-common/src/main/java/com/alibaba/fluss/flink/FlinkConnectorOptions.java
index 94fccdec7e..c3f913b477 100644
--- a/fluss-flink/fluss-flink-common/src/main/java/com/alibaba/fluss/flink/FlinkConnectorOptions.java
+++ b/fluss-flink/fluss-flink-common/src/main/java/com/alibaba/fluss/flink/FlinkConnectorOptions.java
@@ -44,11 +44,10 @@ public class FlinkConnectorOptions {
                     .noDefaultValue()
                     .withDescription(
                             "Specific the distribution policy of the Fluss table. "
-                                    + "Data will be distributed to each bucket according to the hash value of bucket-key. "
+                                    + "Data will be distributed to each bucket according to the hash value of bucket-key (for primary key tables, it must be a subset of the primary key, excluding partition keys). "
                                     + "If you specify multiple fields, delimiter is ','. "
-                                    + "If the table is with primary key, you can't specific bucket key currently. "
-                                    + "The bucket keys will always be the primary key. "
-                                    + "If the table is not with primary key, you can specific bucket key, and when the bucket key is not specified, "
+                                    + "If the table has a primary key and the bucket key is not specified, the primary key (excluding partition keys) will be used as the bucket key. "
+                                    + "If the table has no primary key and the bucket key is not specified, "
                                     + "the data will be distributed to each bucket randomly.");

     public static final ConfigOption BOOTSTRAP_SERVERS =
diff --git a/website/docs/engine-flink/options.md b/website/docs/engine-flink/options.md
index bfaa1f999a..b92824e170 100644
--- a/website/docs/engine-flink/options.md
+++ b/website/docs/engine-flink/options.md
@@ -79,7 +79,7 @@ ALTER TABLE log_table SET ('table.log.ttl' = '7d');
 | Option | Type | Default | Description |
 |--------|------|---------|-------------|
 | bucket.num | int | The bucket number of Fluss cluster. | The number of buckets of a Fluss table. |
-| bucket.key | String | (none) | Specific the distribution policy of the Fluss table. Data will be distributed to each bucket according to the hash value of bucket-key. If you specify multiple fields, delimiter is ','. If the table is with primary key, you can't specific bucket key currently. The bucket keys will always be the primary key(excluding partition key). If the table is not with primary key, you can specific bucket key, and when the bucket key is not specified, the data will be distributed to each bucket randomly. |
+| bucket.key | String | (none) | Specific the distribution policy of the Fluss table. Data will be distributed to each bucket according to the hash value of bucket-key (for primary key tables, it must be a subset of the primary key, excluding partition keys). If you specify multiple fields, delimiter is ','. If the table has a primary key and the bucket key is not specified, the primary key (excluding partition keys) will be used as the bucket key. If the table has no primary key and the bucket key is not specified, the data will be distributed to each bucket randomly. |
 | table.log.ttl | Duration | 7 days | The time to live for log segments. The configuration controls the maximum time we will retain a log before we will delete old segments to free up space. If set to -1, the log will not be deleted. |
 | table.auto-partition.enabled | Boolean | false | Whether enable auto partition for the table. Disable by default. When auto partition is enabled, the partitions of the table will be created automatically. |
 | table.auto-partition.time-unit | ENUM | DAY | The time granularity for auto created partitions. The default value is 'DAY'. Valid values are 'HOUR', 'DAY', 'MONTH', 'QUARTER', 'YEAR'. If the value is 'HOUR', the partition format for auto created is yyyyMMddHH. If the value is 'DAY', the partition format for auto created is yyyyMMdd. If the value is 'MONTH', the partition format for auto created is yyyyMM. If the value is 'QUARTER', the partition format for auto created is yyyyQ. If the value is 'YEAR', the partition format for auto created is yyyy. |
diff --git a/website/docs/table-design/table-types/pk-table/index.md b/website/docs/table-design/table-types/pk-table/index.md
index 8713e02368..5f7647f729 100644
--- a/website/docs/table-design/table-types/pk-table/index.md
+++ b/website/docs/table-design/table-types/pk-table/index.md
@@ -51,8 +51,8 @@ partition key.
 
 ## Bucket Assigning
 
-For primary key tables, Fluss always determines which bucket the data belongs to based on the hash value of the primary
-key for each record.
+For primary key tables, Fluss always determines which bucket the data belongs to based on the hash value of the bucket
+key (it must be a subset of the primary key, excluding partition keys) for each record. If the bucket key is not specified, the primary key (excluding partition keys) will be used as the bucket key.
 Data with the same hash value will be distributed to the same bucket.
 
 ## Partial Update
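The rule this change documents (default bucket key = primary key minus partition keys, records routed by hash of the bucket-key values) can be sketched as follows. This is an illustrative model only, with hypothetical class and method names; it is not Fluss's actual bucket-assignment code, which uses its own internal hash function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of the bucket-assignment rule described above.
// Not Fluss's real implementation.
public class BucketSketch {

    // Default rule: if no bucket key is given for a primary key table,
    // use the primary key columns minus the partition key columns.
    static List<String> defaultBucketKey(List<String> primaryKey, List<String> partitionKeys) {
        List<String> bucketKey = new ArrayList<>(primaryKey);
        bucketKey.removeAll(partitionKeys);
        return bucketKey;
    }

    // Records with equal bucket-key values always land in the same bucket.
    static int bucketFor(List<Object> bucketKeyValues, int numBuckets) {
        int hash = Objects.hash(bucketKeyValues.toArray());
        return Math.floorMod(hash, numBuckets); // non-negative, in [0, numBuckets)
    }

    public static void main(String[] args) {
        // PRIMARY KEY (user_id, dt), partitioned by dt -> default bucket key is (user_id).
        List<String> bucketKey = defaultBucketKey(List.of("user_id", "dt"), List.of("dt"));
        System.out.println(bucketKey); // prints [user_id]

        int b1 = bucketFor(List.of("user-1"), 4);
        int b2 = bucketFor(List.of("user-1"), 4);
        System.out.println(b1 == b2); // true: same key values -> same bucket
    }
}
```

This also makes clear why the bucket key must exclude partition keys: within a single partition, every record shares the same partition-key values, so hashing them would contribute nothing to distributing data across buckets.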