---
title: "Release Notes - Flink 2.2"
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Release notes - Flink 2.2

These release notes discuss important aspects, such as configuration, behavior, or dependencies,
that changed between Flink 2.1 and Flink 2.2. Please read these notes carefully if you are
planning to upgrade to Flink 2.2.

### Table SQL / API

#### Support VECTOR_SEARCH in Flink SQL

##### [FLINK-38422](https://issues.apache.org/jira/browse/FLINK-38422)

Apache Flink has supported leveraging LLM capabilities through the `ML_PREDICT` function in Flink SQL
since version 2.1, enabling users to perform semantic analysis in a simple and efficient way. This
integration has been validated in scenarios such as log classification and real-time
question-answering systems. However, the previous architecture only allowed Flink to use embedding
models to convert unstructured data (e.g., text, images) into high-dimensional vector features,
which were then persisted to downstream storage systems; it lacked real-time online querying and
similarity analysis capabilities for vector spaces. Flink 2.2 provides the `VECTOR_SEARCH` function
to let users perform streaming vector similarity searches and real-time context retrieval
directly within Flink.

See more details about the capabilities and usage of
Flink's [Vector Search](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/sql/queries/vector-search/).

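As a rough, hypothetical sketch only (all table, column, and expression names below are invented, and the exact argument order and options are defined in the linked Vector Search documentation), a streaming similarity search could look similar to a polymorphic-table-function call:

```sql
-- Hypothetical sketch: retrieve similar entries for each incoming query vector.
-- `product_vectors` is assumed to be a table with an `embedding` vector column;
-- consult the linked documentation for the authoritative syntax.
SELECT *
FROM VECTOR_SEARCH(
  TABLE product_vectors,
  DESCRIPTOR(embedding),  -- column holding the indexed vectors
  query_embedding,        -- the search vector produced upstream
  10                      -- number of nearest neighbors to retrieve
);
```

The point of the feature is that this retrieval runs inside the streaming pipeline itself, rather than requiring vectors to be exported to an external vector database first.
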
#### Realtime AI Function

##### [FLINK-38104](https://issues.apache.org/jira/browse/FLINK-38104)

Apache Flink has supported leveraging LLM capabilities through the `ML_PREDICT` function in Flink SQL
since version 2.1. In Flink 2.2, the Table API also supports model inference operations that allow
you to integrate machine learning models directly into your data processing pipelines.

See more details about the capabilities and usage of
Flink's [Model Inference](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/tableapi/#model-inference).

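For context, the SQL form available since 2.1 looks roughly like the following (table, model, and column names are hypothetical, and `my_model` is assumed to have been registered beforehand, e.g. via a `CREATE MODEL` statement); the new Table API support mirrors this capability programmatically:

```sql
-- Sketch of the ML_PREDICT table function: run inference on each row of
-- `input_table`, feeding the `text_col` column to the registered model.
SELECT *
FROM ML_PREDICT(
  TABLE input_table,
  MODEL my_model,
  DESCRIPTOR(text_col)
);
```

See the linked Model Inference page for the equivalent Table API calls in Flink 2.2.
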
#### Materialized Table

##### [FLINK-38532](https://issues.apache.org/jira/browse/FLINK-38532), [FLINK-38311](https://issues.apache.org/jira/browse/FLINK-38311)

By specifying data freshness and a query when creating a [Materialized Table](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/materialized-table/overview/),
the engine automatically derives the schema for the materialized table and creates a corresponding
data refresh pipeline to achieve the specified freshness.

Starting with Flink 2.2, the FRESHNESS clause is no longer a mandatory part of the CREATE MATERIALIZED TABLE and
CREATE OR ALTER MATERIALIZED TABLE DDL statements. Flink 2.2 introduces a new MaterializedTableEnricher
interface. This provides a formal extension point for customizable default logic, allowing advanced
users and vendors to implement "smart" default behaviors (e.g., inferring freshness from upstream tables).

Besides this, users can use `DISTRIBUTED BY` or `DISTRIBUTED INTO` to apply the bucketing concept
to materialized tables, and `SHOW MATERIALIZED TABLES` to list all materialized tables.

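As a brief sketch (table and column names are illustrative; see the linked Materialized Table documentation for the full DDL), a materialized table with an explicit freshness target can be declared as:

```sql
-- Declare a materialized table refreshed to roughly 3-minute freshness.
-- As of Flink 2.2 the FRESHNESS clause may be omitted, in which case a
-- default is derived (e.g. by a MaterializedTableEnricher implementation).
CREATE MATERIALIZED TABLE dwd_orders
FRESHNESS = INTERVAL '3' MINUTE
AS SELECT order_id, user_id, order_amount
   FROM orders_source;

-- List all materialized tables in the current catalog/database.
SHOW MATERIALIZED TABLES;
```
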
#### SinkUpsertMaterializer V2

##### [FLINK-38459](https://issues.apache.org/jira/browse/FLINK-38459)

SinkUpsertMaterializer is an operator in Flink that reconciles out-of-order changelog events before
sending them to an upsert sink. The performance of this operator degrades exponentially in some cases.
Flink 2.2 introduces a new implementation that is optimized for such cases.

#### Delta Join

##### [FLINK-38495](https://issues.apache.org/jira/browse/FLINK-38495), [FLINK-38511](https://issues.apache.org/jira/browse/FLINK-38511), [FLINK-38556](https://issues.apache.org/jira/browse/FLINK-38556)

In 2.1, Apache Flink introduced a new delta join operator to mitigate the challenges caused by
big state in regular joins. It replaces the large state maintained by regular joins with a
bidirectional lookup-based join that directly reuses data from the source tables.

Flink 2.2 enhances support for converting more SQL patterns into delta joins. Delta joins now
support consuming CDC sources without DELETE operations, and allow projection and filter operations
after the source. Additionally, delta joins include support for caching, which helps reduce requests
to external storage.

See more details about the capabilities and usage of Flink's
[Delta Joins](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/tuning/#delta-joins).

#### SQL Types

##### [FLINK-20539](https://issues.apache.org/jira/browse/FLINK-20539), [FLINK-38181](https://issues.apache.org/jira/browse/FLINK-38181)

Before Flink 2.2, row types defined in SQL, e.g. `SELECT CAST(f AS ROW<i INT NOT NULL>)`, ignored
the `NOT NULL` constraint. This was more aligned with the SQL standard but caused many type
inconsistencies and cryptic error messages when working on nested data. For example, it prevented
using rows in computed columns or join keys. The new behavior takes nullability into consideration.
The config option `table.legacy-nested-row-nullability` allows restoring the old behavior if required,
but it is recommended to update existing queries that previously relied on the ignored constraints.

Casting to the TIME type now considers the correct precision (0-3). Casting incorrect strings to TIME
(e.g. where the hour component is higher than 24) now leads to a runtime exception. Casting between
BINARY and VARBINARY should now correctly consider the target length.

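The behavior changes above can be illustrated as follows (table and column names are hypothetical):

```sql
-- The NOT NULL constraint inside a nested ROW type is now taken into account:
SELECT CAST(f AS ROW<i INT NOT NULL>) FROM t;

-- Casting to TIME now respects the declared precision (0-3):
SELECT CAST('12:34:56.789' AS TIME(3));

-- An out-of-range time string now fails with a runtime exception
-- instead of being silently accepted:
SELECT CAST('25:00:00' AS TIME);
```
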
#### Use UniqueKeys instead of UpsertKeys for state management

##### [FLINK-38209](https://issues.apache.org/jira/browse/FLINK-38209)

This is a considerable optimization and a breaking change for the StreamingMultiJoinOperator.
As noted in the 2.1 release notes, the operator was launched in an experimental state in Flink 2.1,
since we are working on relevant optimizations that could be breaking changes.

### Runtime

#### Balanced Tasks Scheduling

##### [FLINK-31757](https://issues.apache.org/jira/browse/FLINK-31757)

Flink 2.2 introduces a balanced task scheduling strategy to achieve task load balancing across
TaskManagers and reduce job bottlenecks.

See more details about the capabilities and usage of
Flink's [Balanced Tasks Scheduling](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/deployment/tasks-scheduling/balanced_tasks_scheduling/).

#### Enhanced Job History Retention Policies for HistoryServer

##### [FLINK-38229](https://issues.apache.org/jira/browse/FLINK-38229)

Before Flink 2.2, the HistoryServer supported only a quantity-based job archive retention policy,
which is insufficient for scenarios requiring time-based retention or combined rules. Users can now
combine the new configuration option `historyserver.archive.retained-ttl` with
`historyserver.archive.retained-jobs` to cover more scenario requirements.

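A combined policy might look like the following Flink configuration fragment (the values are illustrative; check the configuration reference for the exact duration format accepted by the TTL option):

```yaml
# Keep at most 500 archived jobs ...
historyserver.archive.retained-jobs: 500
# ... and additionally drop any archive older than 7 days.
historyserver.archive.retained-ttl: 7 d
```
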
#### Metrics

##### [FLINK-38158](https://issues.apache.org/jira/browse/FLINK-38158), [FLINK-38353](https://issues.apache.org/jira/browse/FLINK-38353)

Since 2.2.0, users can assign custom metric variables for each operator/transformation used in the
job. Those variables are later converted to tags/labels by the metric reporters, allowing users to
tag/label a specific operator's metrics. For example, you can use this to name and differentiate sources.

Users can now control the level of detail of checkpoint spans via [traces.checkpoint.span-detail-level](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/deployment/config/#traces-checkpoint-span-detail-level).
The highest levels report a tree of spans for each task and subtask. Reported custom spans can now
contain child spans. See more details in [Traces](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/ops/traces/).

#### Introduce Event Reporting

##### [FLINK-37426](https://issues.apache.org/jira/browse/FLINK-37426)

Since 2.1.0, users have been able to report custom events using EventReporters. Since 2.2.0, Flink
also reports some built-in/system events.

### Connectors

#### Introduce RateLimiter for Source

##### [FLINK-38497](https://issues.apache.org/jira/browse/FLINK-38497)

Flink jobs frequently exchange data with external systems, which consumes their network bandwidth
and CPU. When these resources are scarce, pulling data too aggressively can disrupt other workloads.
Flink 2.2 introduces a RateLimiter interface that provides request rate limiting for Scan Sources;
connector developers can integrate with rate limiting frameworks to implement their own read
restriction strategies. This feature is currently only available in the DataStream API.

#### Balanced splits assignment

##### [FLINK-38564](https://issues.apache.org/jira/browse/FLINK-38564)

The SplitEnumerator is responsible for assigning splits, but it previously lacked visibility into
the actual runtime status or distribution of these splits. This made it impossible for the
SplitEnumerator to guarantee that splits were evenly distributed, so data skew was very likely to
occur. From Flink 2.2, the SplitEnumerator has information about the split distribution and provides
the ability to evenly assign splits at runtime.

### Python

#### Support async function in Python DataStream API

##### [FLINK-38190](https://issues.apache.org/jira/browse/FLINK-38190)

In Flink 2.2, we have added support for async functions in the Python DataStream API. This enables
Python users to efficiently query external services in their Flink jobs, e.g. a large LLM that is
typically deployed in a standalone GPU cluster.

Furthermore, we have provided comprehensive support to ensure the stability of external service
access. On one hand, we support limiting the number of concurrent requests sent to the external
service to avoid overwhelming it. On the other hand, we have also added retry support to tolerate
temporary unavailability, which may be caused by network jitter or other transient issues.

### Dependency upgrades

#### Upgrade commons-lang3 to version 3.18.0

##### [FLINK-38193](https://issues.apache.org/jira/browse/FLINK-38193)

Upgrade org.apache.commons:commons-lang3 from 3.12.0 to 3.18.0 to mitigate CVE-2025-48924.

#### Upgrade protobuf-java from 3.x to 4.32.1 with compatibility patch for parquet-protobuf

##### [FLINK-38547](https://issues.apache.org/jira/browse/FLINK-38547)

Flink now uses protobuf-java 4.32.1 (corresponding to Protocol Buffers version 32), upgrading from
protobuf-java 3.21.7 (Protocol Buffers version 21). This major upgrade enables:

- **Protobuf Editions Support**: Full support for the new `edition = "2023"` and `edition = "2024"`
  syntax introduced in Protocol Buffers v27+. Editions provide a unified approach that combines
  proto2 and proto3 functionality with fine-grained feature control.
- **Improved Proto3 Field Presence**: Better handling of optional fields in proto3 without the
  limitations of older protobuf versions, eliminating the need to set `protobuf.read-default-values`
  to `true` for field presence checking.
- **Enhanced Performance**: Leverages performance improvements and bug fixes from 11 Protocol
  Buffers releases (versions 22-32).
- **Modern Protobuf Features**: Access to newer protobuf capabilities including Edition 2024
  features and improved runtime behavior.

Existing proto2 and proto3 `.proto` files will continue to work without changes.