Commit 9f6fba9

huyuanfeng committed
Add a new NumKeyGroupsOrPartitionsParallelismAdjuster to make the logic clearer
1 parent f4df8be commit 9f6fba9

5 files changed

+206
-193
lines changed

docs/layouts/shortcodes/generated/auto_scaler_configuration.html

Lines changed: 4 additions & 4 deletions
@@ -189,10 +189,10 @@
 <td>Time interval to resend the identical event</td>
 </tr>
 <tr>
-<td><h5>job.autoscaler.scaling.radical.enabled</h5></td>
-<td style="word-wrap: break-word;">false</td>
-<td>Boolean</td>
-<td>If this option is enabled, The determination of parallelism will be more radical, which will maximize resource utilization, but may also cause data skew in some vertex.</td>
+<td><h5>job.autoscaler.scaling.key-group.partitions.adjust.mode</h5></td>
+<td style="word-wrap: break-word;">DEFAULT</td>
+<td><p>Enum</p></td>
+<td>How to adjust the parallelism of Source vertex or upstream shuffle is keyBy<br /><br />Possible values:<ul><li>"DEFAULT": This mode ensures that the parallelism adjustment attempts to evenly distribute data across subtasks. It is particularly effective for source vertices that are aware of partition counts or vertices after 'keyBy' operation. The goal is to have the number of key groups or partitions be divisible by the set parallelism, ensuring even data distribution and reducing data skew.</li><li>"MAXIMIZE_UTILISATION": This model is to maximize resource utilization. In this mode, an attempt is made to set the minimum degree of parallelism that meets the current consumption rate requirements. Unlike the default mode, it is not enforced that the number of key groups or partitions is divisible by the degree of parallelism.</li></ul></td>
 </tr>
 <tr>
 <td><h5>job.autoscaler.stabilization.interval</h5></td>
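
The difference between the two modes described in the new option text can be illustrated with a small sketch. This is not the code of NumKeyGroupsOrPartitionsParallelismAdjuster from this commit; the class name, method name, and parameters below are hypothetical and only model the documented behaviour: DEFAULT looks for a parallelism that divides the number of key groups or partitions, while MAXIMIZE_UTILISATION keeps the minimum parallelism that meets the required rate.

```java
// Hypothetical sketch, not the actual adjuster added in this commit.
final class ParallelismAdjustSketch {

    // Mirrors the two documented values of the new option.
    enum AdjustMode { DEFAULT, MAXIMIZE_UTILISATION }

    /**
     * Assumes all inputs are positive.
     *
     * @param numKeyGroupsOrPartitions key groups (after keyBy) or source partitions
     * @param requiredParallelism      minimum parallelism that meets the processing rate
     * @param maxParallelism           upper bound on the vertex parallelism
     */
    static int adjust(int numKeyGroupsOrPartitions,
                      int requiredParallelism,
                      int maxParallelism,
                      AdjustMode mode) {
        // More subtasks than key groups or partitions would leave some idle.
        int upper = Math.min(maxParallelism, numKeyGroupsOrPartitions);
        int required = Math.max(1, Math.min(requiredParallelism, upper));

        if (mode == AdjustMode.MAXIMIZE_UTILISATION) {
            // Divisibility is not enforced: keep the minimum that meets the rate.
            return required;
        }

        // DEFAULT: smallest parallelism >= required that divides the count evenly.
        for (int p = required; p <= upper; p++) {
            if (numKeyGroupsOrPartitions % p == 0) {
                return p;
            }
        }
        // No divisor within bounds: fall back to the required parallelism.
        return required;
    }

    public static void main(String[] args) {
        System.out.println(adjust(128, 50, 200, AdjustMode.DEFAULT));              // 64
        System.out.println(adjust(128, 50, 200, AdjustMode.MAXIMIZE_UTILISATION)); // 50
    }
}
```

With 128 key groups and a required parallelism of 50, the DEFAULT sketch rounds up to 64 so that every subtask gets exactly two key groups, whereas MAXIMIZE_UTILISATION stays at 50 and accepts that some subtasks handle three key groups and others two.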
