---
title: "Autotuning"
weight: 4
type: docs
aliases:
- /custom-resource/autotuning.html
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Flink Autotuning

Flink Autotuning aims at fully automating the configuration of Apache Flink.

One of the biggest challenges with deploying new Flink pipelines is to write an adequate Flink configuration. The most
important configuration values are:

- memory configuration (heap memory, network memory, managed memory, JVM off-heap, etc.)
- number of task slots

## Memory Autotuning

As a first step, we have tackled the memory configuration which, according to users, is the most frustrating part of
the configuration process. The most important aspect of the memory configuration is the right-sizing of the
various [Flink memory pools](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/mem_tuning/).
These memory pools include heap memory, network memory, managed memory, and JVM off-heap memory. A non-optimal
configuration of these pools can cause application crashes or block large amounts of memory which remain unused.

### How It Works

With Flink Autoscaling and Flink Autotuning, all users need to do is set a max memory size for the TaskManagers, just
like they would normally configure TaskManager memory. Flink Autotuning then automatically adjusts the various memory
pools and brings down the total container memory size. It does that by observing the actual max memory usage on the
TaskManagers or by calculating the exact number of network buffers required for the job topology. The adjustments are
made together with Flink Autoscaling, so there is no extra downtime involved.

It is important to note that adjusting the container memory only works on Kubernetes and that the initially provided
memory settings represent the maximum amount of memory Flink Autotuning will use. You may therefore want to assign
more memory than usual when first enabling Autotuning, since any unneeded memory will be scaled back automatically.
We never go beyond the initial limits to ensure we can safely create TaskManagers without running into pod memory
quotas or limits.
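
For illustration, here is a minimal sketch of how this could look in a `FlinkDeployment` resource managed by the Flink
Kubernetes Operator. The resource name, image, job jar, and memory sizes are placeholders; only the `job.autoscaler.*`
keys are taken from this page, and the TaskManager memory acts as the upper bound described above.

```
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: autotuning-example            # placeholder name
spec:
  image: flink:1.18                   # placeholder image
  flinkVersion: v1_18
  serviceAccount: flink
  flinkConfiguration:
    # Autoscaling must be enabled for Autotuning to take effect
    job.autoscaler.enabled: "true"
    # Apply the tuned memory configuration automatically
    job.autoscaler.memory.tuning.enabled: "true"
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      # Initial memory = the maximum Flink Autotuning will ever use
      memory: "4096m"
      cpu: 2
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar  # placeholder job
    parallelism: 2
    upgradeMode: savepoint
```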

### Getting Started

#### Dry-run Mode

As soon as Flink Autoscaling is enabled, Flink Autotuning will provide recommendations via events
(e.g. Kubernetes events):

```
# Autoscaling needs to be enabled
job.autoscaler.enabled: true
# Disable automatic memory tuning (only get recommendations)
job.autoscaler.memory.tuning.enabled: false
```
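
The recommendations are typically attached as Kubernetes events to the Flink resource, so, assuming cluster access
with `kubectl`, they can be inspected for example with `kubectl describe flinkdeployment <deployment-name>` or
`kubectl get events` (the deployment name here is a placeholder).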

#### Automatic Mode

Automatic memory tuning can be enabled by setting:

```
# Autoscaling needs to be enabled
job.autoscaler.enabled: true
# Turn on Autotuning and apply memory config changes
job.autoscaler.memory.tuning.enabled: true
```

### Advanced Options

#### Maximize Managed Memory

Enabling the following option returns all saved memory as managed memory. This is beneficial
when running with RocksDB to maximize its performance.

```
job.autoscaler.memory.tuning.maximize-managed-memory: true
```
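
For example, a RocksDB job that wants fully automatic tuning with all savings redirected to managed memory would
combine the options shown on this page:

```
job.autoscaler.enabled: true
job.autoscaler.memory.tuning.enabled: true
job.autoscaler.memory.tuning.maximize-managed-memory: true
```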

#### Setting Memory Overhead

Memory Autotuning adds a constant amount of memory overhead to the heap and metaspace sizes it determines, to allow
the memory to grow beyond the measured maximum. The default overhead of 20% can be changed, for example to 50%, by
setting:

```
job.autoscaler.memory.tuning.overhead: 0.5
```
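
As a rough illustration with hypothetical numbers, and assuming the overhead is applied on top of the observed
maximum: with a measured maximum heap usage of 2 GiB and the default overhead of 0.2, the tuned heap size would come
out at roughly 2 GiB * 1.2 = 2.4 GiB.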

## Future Work

### Task Slots Autotuning

The number of task slots is partially taken care of by Flink Autoscaling, which adjusts the task parallelism and hence
changes the total number of slots and the number of TaskManagers.

In future versions of Flink Autotuning, we will try to further optimize the number of task slots depending on the
number of tasks running inside a task slot.

### JobManager Memory Tuning

Currently, only TaskManager memory is adjusted. Tuning the JobManager memory is left for future versions.

### RocksDB Memory Tuning

Currently, if no managed memory is used, e.g. when the heap-based state backend is used, Flink Autotuning sets the
managed memory to zero, which saves a lot of memory. However, if managed memory is used, e.g. via RocksDB, the
configured managed memory is kept constant because Flink currently lacks metrics to accurately measure the usage of
managed memory.

Nevertheless, users already benefit from the resource savings and optimizations for heap, metaspace, and
network memory. RocksDB users can focus solely on configuring managed memory.

We already added an option to return all saved memory as managed memory (see Maximize Managed Memory above). This is
beneficial when running with RocksDB to maximize its in-memory performance.