
Commit 066db4c

kbatuigas and asimms41 committed
Apply suggestions from code review
Co-authored-by: Angela Simms <102690377+asimms41@users.noreply.github.com>
1 parent a7db75e commit 066db4c

File tree

1 file changed (+7, -7 lines changed)

modules/manage/pages/topic-iceberg-integration.adoc

Lines changed: 7 additions & 7 deletions
@@ -55,7 +55,7 @@ In the Redpanda Iceberg integration, the manifest files are in JSON format.
 
 image::shared:iceberg-integration.png[]
 
-When you enable the Iceberg integration for a Redpanda topic, Redpanda brokers store streaming data in the Iceberg-compatible format in Parquet files in object storage, in addition to the log segments uploaded via Tiered Storage. Storing the streaming data in Iceberg tables in the cloud allows you to derive real-time insights through many compatible data lakehouse, data engineering, and business intelligence https://iceberg.apache.org/vendors/[tools^].
+When you enable the Iceberg integration for a Redpanda topic, Redpanda brokers store streaming data in the Iceberg-compatible format in Parquet files in object storage, in addition to the log segments uploaded using Tiered Storage. Storing the streaming data in Iceberg tables in the cloud allows you to derive real-time insights through many compatible data lakehouse, data engineering, and business intelligence https://iceberg.apache.org/vendors/[tools^].
 
 == Enable Iceberg integration
 
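Note on this hunk: the paragraph it touches immediately precedes the `== Enable Iceberg integration` section. As a minimal sketch of the cluster-level switch that section relies on (assuming the `iceberg_enabled` cluster configuration property, which this diff does not show):

```
# Assumed property name: enable the Iceberg feature cluster-wide
# before opting individual topics in.
rpk cluster config set iceberg_enabled true
```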
@@ -89,7 +89,7 @@ new-topic-name OK
 . Enable the integration for the topic by configuring `redpanda.iceberg.mode`. You can choose one of the following modes:
 +
 --
-* `key_value`: Creates an Iceberg table using a simple schema, consisting two columns, one for the record metadata including the key, and another binary column for the record's value.
+* `key_value`: Creates an Iceberg table using a simple schema, consisting of two columns, one for the record metadata including the key, and another binary column for the record's value.
 * `value_schema_id_prefix`: Creates an Iceberg table whose structure matches the Redpanda schema for this topic, with columns corresponding to each field. You must register a schema in the Schema Registry (see next step), and producers must write to the topic using the Schema Registry wire format. Redpanda parses the schema used by the record based on the schema ID encoded in the payload header, and stores the topic values in the corresponding table columns.
 * `disabled` (default): Disables writing to an Iceberg table for this topic.
 --
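A hedged sketch of applying one of these modes, assuming a hypothetical topic named `my-topic` (the property name `redpanda.iceberg.mode` is confirmed by the hunk above):

```
# Opt an existing topic in to schema-based Iceberg tables.
rpk topic alter-config my-topic --set redpanda.iceberg.mode=value_schema_id_prefix

# Or set the mode at topic creation time.
rpk topic create my-topic -c redpanda.iceberg.mode=key_value
```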
@@ -222,7 +222,7 @@ Avro::
 | timestamp | timestamp
 |===
 
-* Different flavors of time (such as time-millis) and timestamp (such as timestamp-millis) types are translated to the same Iceberg `time` and `timestamp` types respectively.
+* Different flavors of time (such as `time-millis`) and timestamp (such as `timestamp-millis`) types are translated to the same Iceberg `time` and `timestamp` types respectively.
 * Avro unions are flattened to Iceberg structs with optional fields:
 ** For example, the union `["int", "long", "float"]` is represented as an Iceberg struct `struct<0 INT NULLABLE, 1 LONG NULLABLE, 2 FLOAT NULLABLE>`.
 ** The union `["int", null, "float"]` is represented as an Iceberg struct `struct<0 INT NULLABLE, 1 FLOAT NULLABLE>`.
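To make the `timestamp-millis` wording concrete, here is a hypothetical single-field Avro schema using that logical type; per the mapping above, the field lands in an Iceberg `timestamp` column. The registration command mirrors the one shown later in this diff:

```
# Hypothetical schema: one timestamp-millis field named "ts".
cat > schema.avsc <<'EOF'
{
  "type": "record",
  "name": "ClickEvent",
  "fields": [
    {"name": "ts", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
EOF
rpk registry schema create ClickEvent-value --schema schema.avsc --type avro
```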
@@ -322,9 +322,9 @@ SELECT * FROM streaming.redpanda.ClickEvent;
 
 Spark can use the REST catalog to automatically discover the topic's Iceberg table.
 
-==== Filesystem-based catalog (`object_storage`)
+==== File system-based catalog (`object_storage`)
 
-If using the `object_storage` catalog type, you must set up the catalog integration in your processing engine accordingly. For example, you can configure Spark to use a filesystem-based catalog with at least the following properties:
+If you are using the `object_storage` catalog type, you must set up the catalog integration in your processing engine accordingly. For example, you can configure Spark to use a file system-based catalog with at least the following properties:
 
 ```
 spark.sql.catalog.streaming.type = hadoop
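The hunk cuts off inside the properties block. A fuller sketch of a file system-based (Hadoop) Iceberg catalog named `streaming` in Spark: the catalog class and property names come from Iceberg's Spark integration, and the warehouse URI is a placeholder assumption:

```
# Hypothetical spark-defaults.conf fragment; adjust the warehouse URI
# to your object storage bucket.
cat >> conf/spark-defaults.conf <<'EOF'
spark.sql.catalog.streaming = org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.streaming.type = hadoop
spark.sql.catalog.streaming.warehouse = s3a://<bucket-name>/redpanda-iceberg-catalog
EOF
```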
@@ -374,7 +374,7 @@ You can register the schema under the `ClickEvent-value` subject:
 rpk registry schema create ClickEvent-value --schema path/to/schema.avsc --type avro
 ----
 
-If you produce to the `ClickEvent` topic in the following format:
+You can then produce to the `ClickEvent` topic using the following format:
 
 [,bash]
 ----
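The produce example itself falls outside this hunk. A non-authoritative sketch, assuming `rpk`'s `--schema-id` produce flag and the single-field `ClickEvent` schema sketched earlier (`ts` as epoch milliseconds):

```
# --schema-id=topic (assumed flag) looks up the ClickEvent-value subject and
# encodes the JSON input in the Schema Registry wire format.
echo '{"ts": 1732566239380}' | rpk topic produce ClickEvent --schema-id=topic
```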
@@ -404,7 +404,7 @@ FROM <catalog-name>.ClickEvent;
 
 You can also forgo using a schema, which means using semi-structured data in Iceberg.
 
-If you produce to the `ClickEvent_key_value` topic in the following format:
+You can produce to the `ClickEvent_key_value` topic using the following format:
 
 [,bash]
 ----
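Again the example body sits outside the hunk. For the schemaless case, a minimal sketch (topic name taken from the surrounding doc; key and value are placeholders):

```
# Raw bytes: the record value lands in the key_value table's binary value column.
echo 'hello world' | rpk topic produce ClickEvent_key_value -k my-key
```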
