|
| 1 | += Schema Changes and Migration Guide for Iceberg Topics in Redpanda v25.3 |
| 2 | +:description: Information about breaking schema changes for Iceberg topics in Redpanda v25.3, and actions to take when upgrading. |
| 3 | + |
| 4 | +Redpanda v25.3 introduces changes that break table compatibility for Iceberg topics. If you have existing Iceberg topics and want to retain the data in the corresponding Iceberg tables, you must take specific actions while upgrading to v25.3 to ensure that your Iceberg topics and their associated tables continue to function correctly. |
| 5 | + |
| 6 | +== Breaking changes |
| 7 | + |
| 8 | +The following table lists the schema changes introduced in Redpanda v25.3. |
| 9 | + |
| 10 | +|=== |
| 11 | +| Field | Iceberg type translation before v25.3 | Iceberg type translation starting in v25.3 | Impact |
| 12 | + |
| 13 | +| `redpanda.timestamp` column |
| 14 | +| `timestamp` type |
| 15 | +| `timestamptz` (timestamp with time zone) type |
| 16 | +| Affects all tables created by Iceberg topics, including dead-letter queue tables. |
| 17 | + |
| 18 | +| `redpanda.headers.key` column |
| 19 | +| `binary` type |
| 20 | +| `string` type |
| 21 | +| Affects all tables created by Iceberg topics, including dead-letter queue tables. |
| 22 | + |
| 23 | +| Avro optionals (two-field union of `[null, <FIELD>]`) |
| 24 | + |
| 25 | +Example: `"type": ["null", "long"]` |
| 26 | + |
| 27 | +| Single-field struct type |
| 28 | + |
| 29 | +Example: `struct<union_opt_1:bigint>` |
| 30 | + |
| 31 | +| Optional `FIELD` |
| 32 | + |
| 33 | +Example: `bigint` |
| 34 | + |
| 35 | +| Affects tables created by Iceberg topics that use Avro optionals. |
| 36 | + |
| 37 | +| Avro non-optional unions |
| 38 | + |
| 39 | +Example: `"type": ["string", "long"]` |
| 40 | + |
| 41 | +| Column names used a naming convention based on the ordering of the union fields |
| 42 | + |
| 43 | +Example: `struct<union_opt_0:string,union_opt_1:bigint>` |
| 44 | + |
| 45 | +| Column names use the type names |
| 46 | + |
| 47 | +Example: `struct<string:string,long:bigint>` |
| 48 | + |
| 49 | +| Affects tables created by Iceberg topics that use Avro unions. |
| 50 | + |
| 51 | +| Avro and Protobuf enums |
| 52 | +| `integer` type |
| 53 | +| `string` type |
| 54 | +| Affects tables created by Iceberg topics that use Avro or Protobuf enums. |
| 55 | + |
| 56 | +|=== |
| 57 | + |
| 58 | +== Upgrade steps |
| 59 | + |
| 60 | +When upgrading to Redpanda v25.3, you must perform these steps to migrate Iceberg topics to the new schema translation and ensure your topics continue to function correctly. If you skip these steps, data is sent to the dead-letter queue (DLQ) table until you make the Iceberg tables conform to the new schemas (step 4). |
| 61 | + |
| 62 | +. Before upgrading to v25.3, disable Iceberg on all Iceberg topics by setting the `redpanda.iceberg.mode` topic property to `disabled`. This step ensures that no additional Parquet files are written by Iceberg topics. |
| 63 | ++ |
| 64 | +NOTE: Don't set the `iceberg_enabled` cluster property to `false`. Disabling Iceberg at the cluster level would prevent pending Iceberg commits from being finalized post-upgrade. |
| 65 | +. xref:upgrade:rolling-upgrade.adoc#perform-a-rolling-upgrade[Perform a rolling upgrade] to v25.3, restarting the cluster in the process. |
| 66 | +. Repeatedly query the `GetCoordinatorState` Admin API endpoint for the Iceberg topics that you are migrating to the new schema, until the coordinator has no more pending entries for those topics. This step confirms that all Parquet files written pre-upgrade have been committed to the Iceberg tables. |
| 67 | ++ |
| 68 | +[,bash] |
| 69 | +---- |
| 70 | +# Pass the Iceberg topics to migrate as a JSON array in "topics_filter" |
| 71 | +curl -s \ |
| 72 | + --header 'Content-Type: application/json' \ |
| 73 | + --data '{"topics_filter": ["<list-of-topics-to-migrate>"]}' \ |
| 74 | + localhost:9644/redpanda.core.admin.internal.datalake.v1.DatalakeService/GetCoordinatorState | jq |
| 75 | +---- |
| 76 | ++ |
| 77 | +.Sample output |
| 78 | +[,bash,.no-copy] |
| 79 | +---- |
| 80 | +{ |
| 81 | + "state": { |
| 82 | + "topicStates": { |
| 83 | + "topic_to_migrate": { |
| 84 | + "revision": "9", |
| 85 | + "partitionStates": { |
| 86 | + "0": { |
| 87 | + "pendingEntries": [ |
| 88 | + { |
| 89 | + "data": { |
| 90 | + "startOffset": "12", |
| 91 | + "lastOffset": "15", |
| 92 | + "dataFiles": [ |
| 93 | + { |
| 94 | + "remotePath": "redpanda-iceberg-catalog/redpanda/topic_to_migrate/data/0-871734c9-e266-41fa-a34d-2afba2828c0d.parquet", |
| 95 | + "rowCount": "4", |
| 96 | + "fileSizeBytes": "1426", |
| 97 | + "tableSchemaId": 0, |
| 98 | + "partitionSpecId": 0, |
| 99 | + "partitionKey": [] |
| 100 | + } |
| 101 | + ], |
| 102 | + "dlqFiles": [], |
| 103 | + "kafkaProcessedBytes": "289" |
| 104 | + }, |
| 105 | + "addedPendingAt": "6" |
| 106 | + } |
| 107 | + ], |
| 108 | + "lastCommitted": "11" |
| 109 | + } |
| 110 | + }, |
| 111 | + "lifecycleState": "LIFECYCLE_STATE_LIVE", |
| 112 | + "totalKafkaProcessedBytes": "79" |
| 113 | + } |
| 114 | + } |
| 115 | + } |
| 116 | +} |
| 117 | +---- |
| 118 | ++ |
| 119 | +To check for remaining pending files: |
| 120 | ++ |
| 121 | +[,bash] |
| 122 | +---- |
| 123 | +curl -s \ |
| 124 | + --header 'Content-Type: application/json' \ |
| 125 | + --data '{}' \ |
| 126 | + localhost:9644/redpanda.core.admin.internal.datalake.v1.DatalakeService/GetCoordinatorState \ |
| 127 | + | jq '[.state.topicStates[].partitionStates[].pendingEntries | length] | any(. > 0)' |
| 128 | +---- |
| 129 | ++ |
| 130 | +If the query returns `true`, there are pending files and you need to wait longer before proceeding to the next step. |
| 131 | + |
| 132 | +. Migrate Iceberg topics to the new schema translation and ensure that their tables conform to the breaking changes. |
| 133 | ++ |
| 134 | +Run SQL queries to rename the affected columns in each Iceberg table that you want to migrate to the <<breaking-changes,new schema>>. Once the existing columns are renamed, Redpanda automatically adds new columns that use the original names, but with the new types: |
| 135 | ++ |
| 136 | +[,sql] |
| 137 | +---- |
| 138 | +/* |
| 139 | +`redpanda.timestamp` renamed to `redpanda.timestamp_v1` (`timestamp` type), |
| 140 | +new `redpanda.timestamp` (`timestamptz` type) column added |
| 141 | +*/ |
| 142 | +ALTER TABLE redpanda.<name-of-topic-to-migrate> |
| 143 | +RENAME COLUMN redpanda.timestamp TO timestamp_v1; |
| 144 | +
|
| 145 | +/* |
| 146 | +`redpanda.headers.key` renamed to `redpanda.headers.key_v1` (`binary` type), |
| 147 | +new `redpanda.headers.key` (`string` type) column added |
| 148 | +*/ |
| 149 | +ALTER TABLE redpanda.<name-of-topic-to-migrate> |
| 150 | +RENAME COLUMN redpanda.headers.key TO key_v1; |
| 151 | +
|
| 152 | +/* |
| 153 | +Rename any additional affected columns according to the list of |
| 154 | +breaking changes in the first section of this guide. |
| 155 | +*/ |
| 156 | +ALTER TABLE redpanda.<name-of-topic-to-migrate> |
| 157 | +RENAME COLUMN <column1> TO <column1-new-name>; |
| 158 | +---- |
| 159 | ++ |
| 160 | +NOTE: Redpanda does not write new data to the renamed columns. Avoid adding fields to the Kafka schema that collide with the new column names. |
| 161 | ++ |
| 162 | +You can continue to query the older data, but only through the renamed columns. To query both older data and new data that uses the new types, update your queries to account for both the renamed columns and the new columns that use the original names. |
| 163 | ++ |
| 164 | +[,sql] |
| 165 | +---- |
| 166 | +/* |
| 167 | +Adjust the range condition as needed. |
| 168 | +
|
| 169 | +Tip: Using the same time range for both columns helps ensure that you capture |
| 170 | +all data without needing to specify the exact cutoff point for the upgrade. |
| 171 | +*/ |
| 172 | +SELECT count(*) FROM redpanda.<name-of-migrated-topic> |
| 173 | + WHERE redpanda.timestamp >= '2025-01-01 00:00:00' |
| 174 | + OR redpanda.timestamp_v1 >= '2025-01-01 00:00:00'; |
| 175 | +---- |
| 176 | + |
| 177 | +. Re-enable Iceberg on all Iceberg topics in your upgraded cluster. |
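
The waiting in step 3 and the topic property changes in steps 1 and 5 can be scripted. The following is a minimal sketch, not an official tool: the function names, the `ADMIN_URL` variable, and the default polling interval are hypothetical, and it assumes that `curl`, `jq`, and `rpk` are installed and that `rpk` is already configured for your cluster. The request and the `jq` filter are the same ones shown in step 3.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, ADMIN_URL, and the polling interval are assumptions.

ADMIN_URL="${ADMIN_URL:-localhost:9644}"
ENDPOINT="redpanda.core.admin.internal.datalake.v1.DatalakeService/GetCoordinatorState"

# Set redpanda.iceberg.mode on one topic.
# Arguments: topic name, mode (for example, "disabled").
set_iceberg_mode() {
  rpk topic alter-config "$1" --set "redpanda.iceberg.mode=$2"
}

# Fetch the coordinator state for all topics (same request as in step 3).
fetch_coordinator_state() {
  curl -s \
    --header 'Content-Type: application/json' \
    --data '{}' \
    "${ADMIN_URL}/${ENDPOINT}"
}

# Read coordinator state JSON on stdin.
# Print "true" if any partition still has pending entries, otherwise "false".
has_pending_entries() {
  jq '[.state.topicStates[].partitionStates[].pendingEntries | length] | any(. > 0)'
}

# Poll until the check prints "false". Any other result, including an empty
# or unparseable response, is treated as "not done yet".
# Argument: seconds to sleep between checks (default 10).
wait_for_pending_commits() {
  local interval="${1:-10}"
  while [ "$(fetch_coordinator_state | has_pending_entries)" != "false" ]; do
    echo "Pending entries remain; checking again in ${interval}s..." >&2
    sleep "$interval"
  done
  echo "No pending entries remain."
}
```

For example, you might run `set_iceberg_mode <topic> disabled` for each topic before the upgrade, run `wait_for_pending_commits 30` after the rolling upgrade completes, and, once the columns are renamed, call `set_iceberg_mode` again with whichever mode the topic used before the upgrade.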