DOC-232 TS topics as Iceberg tables #800

kbatuigas · 2024-10-07T18:31:55Z

Description

Cluster config reference entries will be added in this PR #846

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 26 November 2024

Page previews

Topics as Iceberg Tables

Checks

New feature
Content gap
Support Follow-up
Small fix (typos, links, copyedits, etc)

Feediver1 · 2024-10-10T19:38:48Z

@kbatuigas Please update the issue this resolves (above in Description) and add a review deadline. Thx.

modules/manage/pages/topic-iceberg-integration.adoc

netlify · 2024-10-10T20:11:59Z

✅ Deploy Preview for redpanda-docs-preview ready!

Name	Link
🔨 Latest commit	`903785c`
🔍 Latest deploy log	https://app.netlify.com/sites/redpanda-docs-preview/deploys/674dea9fa376df000892a3ff
😎 Deploy Preview	https://deploy-preview-800--redpanda-docs-preview.netlify.app

To edit notification comments on pull requests, go to your Netlify site configuration.

kbatuigas · 2024-10-10T20:18:50Z

This preview currently places the new doc under the Manage > Tiered Storage section (page URL: manage/topic-iceberg-integration). Is a different section, such as Develop, more appropriate?

And should the page URL be changed? For example, the blog post is at blog/apache-iceberg-topics-streaming-data

lf-rep

Hi Kat -- I added my comments in-place, as answers to your questions.

mattschumpert · 2024-10-11T20:43:58Z

@kbatuigas can you please add me as a reviewer? Thanks!

modules/manage/pages/topic-iceberg-integration.adoc

asimms41 · 2024-10-15T09:28:21Z

modules/manage/pages/topic-iceberg-integration.adoc

+In the Redpanda Iceberg integration, the manifest files are in JSON format.
+* Catalog: Contains the current metadata pointer for the table. Clients reading and writing data to the table see the same version of the current state of the table. You'll configure your Iceberg catalog to point to your object storage bucket or container where the Redpanda data in Iceberg format is located. Redpanda uses the https://iceberg.apache.org/concepts/catalog/#catalog-implementations[Iceberg REST catalog^] endpoint to update your catalog when there are changes to the Iceberg data and metadata.
+
+image::shared:iceberg-integration.png[]


I know there's probably not time for this but I could do with some numbering on this diagram.

@mattschumpert does adding numbering in the provided diagram (same as the one used in the blog post from a while back) sound good? Is there anything else we should add to the design request in Monday?

Not sure I understood this. If you want to narrate the diagram in the text and add numbers, fine with me.

makes sense to add numbers

modules/manage/pages/topic-iceberg-integration.adoc

Co-authored-by: Tyler Rockwood <[email protected]>

Co-authored-by: Angela Simms <[email protected]>

Deflaimun · 2024-12-02T16:23:05Z

modules/manage/pages/topic-iceberg-integration.adoc

+
+== Enable Iceberg integration
+
+To create an Iceberg table for a Redpanda topic, you must set the cluster configuration property `iceberg_enabled` to `true`, and also configure the topic property `redpanda.iceberg.mode`. You can choose to provide a schema if you need the Iceberg table to be structured with defined columns.


Should link to the cluster and topic property reference
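
A minimal sketch of the two settings from the paragraph above, written as `rpk` argument lists in Python and printed rather than executed. The topic name `demo-topic` and the helper `iceberg_enable_commands` are hypothetical, `key_value` is the mode named elsewhere in this PR, and the commands assume the usual `rpk cluster config set` and `rpk topic alter-config --set` forms, so check them against the rpk reference before reusing them.

```python
import shlex
from typing import List


def iceberg_enable_commands(topic: str, mode: str = "key_value") -> List[List[str]]:
    """Return the rpk argument lists that turn on Iceberg for one topic."""
    return [
        # Cluster-wide switch: the cluster property must be true first.
        ["rpk", "cluster", "config", "set", "iceberg_enabled", "true"],
        # Per-topic switch: the topic property selects the Iceberg mode.
        ["rpk", "topic", "alter-config", topic,
         "--set", f"redpanda.iceberg.mode={mode}"],
    ]


if __name__ == "__main__":
    # "demo-topic" is a placeholder; nothing is executed, only printed.
    for argv in iceberg_enable_commands("demo-topic"):
        print(shlex.join(argv))
```

Keeping the commands as argument lists avoids shell-quoting surprises if the topic name ever contains unusual characters.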

Deflaimun · 2024-12-02T16:24:37Z

modules/manage/pages/topic-iceberg-integration.adoc

+
+[,bash,]
+----
+rpk topic create <new-topic-name> --partitions 1 --replicas 1


Are the flags for partitions and replicas necessary? If not, I would not add them.

Deflaimun · 2024-12-02T16:25:18Z

modules/manage/pages/topic-iceberg-integration.adoc

+[,bash,role=no-copy]
+----
+TOPIC            STATUS
+new-topic-name   OK


Suggested change

-new-topic-name OK
+<new-topic-name> OK

Whenever you mention the same variable, keep the angle-bracket tags. If you add this, check that the preview renders correctly before merging.

modules/manage/pages/topic-iceberg-integration.adoc

Deflaimun · 2024-12-02T16:41:03Z

modules/manage/pages/topic-iceberg-integration.adoc

+* `rest`: Connect to and update an Iceberg catalog using a REST API. See the https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml[Iceberg REST Catalog API specification].
+* `object_storage`: Write catalog files to the same object storage bucket as the data files. Use the object storage URL with an Iceberg client to access the catalog and data files for your Redpanda Iceberg tables.
+
+Switching catalog types is not supported.


Suggested change

-Switching catalog types is not supported.
+Switching catalog types is not supported.

Switching when? Mid-flight? That's what I assume, but we could clarify.
Consider adding this to the limitations section if that makes sense.

@bharathv would you be able to confirm?

Once a single topic has Iceberg enabled, you cannot change the catalog type. I don't think we enforce this.

Sorry, just checking. Right, we don't enforce it.

Deflaimun · 2024-12-02T16:44:37Z

modules/manage/pages/topic-iceberg-integration.adoc

+
+==== File system-based catalog (`object_storage`)
+
+If you are using the `object_storage` catalog type, you must set up the catalog integration in your processing engine accordingly. For example, you can configure Spark to use a file system-based catalog with at least the following properties:


Should mention that the example is for AWS S3.
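
To make the S3 assumption concrete, here is a sketch of the minimum Spark properties for a file system-based (Hadoop-type) Iceberg catalog, built as a plain Python dict. The catalog name `streaming`, the bucket path, and the helper names are hypothetical; the `s3a://` scheme assumes AWS S3 with the Hadoop S3A connector available, and the page's actual property list may differ.

```python
from typing import Dict, List


def spark_hadoop_catalog_conf(catalog: str, warehouse: str) -> Dict[str, str]:
    """Build Spark properties for an Iceberg catalog of type `hadoop`."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Register the catalog name with Iceberg's Spark catalog class.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # `hadoop` is Iceberg's file system-based catalog type.
        f"{prefix}.type": "hadoop",
        # Location that holds the catalog files and the table data.
        f"{prefix}.warehouse": warehouse,
    }


def to_conf_flags(conf: Dict[str, str]) -> List[str]:
    """Render the properties as `--conf key=value` arguments."""
    flags: List[str] = []
    for key, value in conf.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    # Placeholder bucket and path (AWS S3 assumed); substitute your own.
    conf = spark_hadoop_catalog_conf("streaming", "s3a://example-bucket/redpanda-iceberg")
    print(" ".join(to_conf_flags(conf)))
```

For other object stores, only the warehouse URI scheme and the matching connector would change in this sketch.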

Deflaimun · 2024-12-02T16:47:25Z

modules/manage/pages/topic-iceberg-integration.adoc

+* It is not possible to append topic data to an existing Iceberg table that is not created by Redpanda.
+* If you enable the Iceberg integration on an existing Redpanda topic, Redpanda does not backfill the generated Iceberg table with topic data.
+* JSON schemas are not currently supported. If the topic data is in JSON, use the `key_value` mode to store the JSON in Iceberg, which then can be parsed by most query engines.
+* If you are using Avro or Protobuf data, you must use the Schema Registry wire format, where producers include the magic byte and schema ID in the message payload header. See also: xref:manage:schema-reg/schema-id-validation.adoc[] and the https://www.redpanda.com/blog/schema-registry-kafka-streaming#how-does-serialization-work-with-schema-registry-in-kafka[Understanding Apache Kafka Schema Registry^] blog post.


which Schema Registry? Redpanda or any? Does it make a difference?

@rockwotj could you clarify here as well please?

Only the one built into Redpanda. External registries are not supported.
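
For readers unfamiliar with the wire format, a sketch of the framing this bullet refers to, assuming the common Schema Registry layout of one magic byte (0) followed by a 4-byte big-endian schema ID before the serialized payload. The function names are hypothetical, and Protobuf messages carry additional message-index bytes after the schema ID that this sketch does not handle.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0   # first byte of every framed message
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def encode_wire_format(schema_id: int, payload: bytes) -> bytes:
    """Prefix a serialized payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def decode_wire_format(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message is shorter than the 5-byte header")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]


if __name__ == "__main__":
    framed = encode_wire_format(42, b"avro-bytes")
    print(framed[:HEADER_SIZE].hex())  # header only
    print(decode_wire_format(framed))
```

A producer that sends the bare payload without this header would fail the magic-byte check here, which is the situation the limitation describes.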

Deflaimun · 2024-12-02T16:49:55Z

modules/manage/pages/topic-iceberg-integration.adoc

+
+=== Query topic in key-value mode
+
+You can also forgo using a schema, which means using semi-structured data in Iceberg. 


Suggested change

-You can also forgo using a schema, which means using semi-structured data in Iceberg.
+You can also choose not to use a schema, allowing you to work with semi-structured data in Iceberg.

Consider using a simpler phrase. 'Forgo' is not common among non-native speakers.

Deflaimun

Approved. Check the most important topics as we discussed before merging

Co-authored-by: Paulo Borges <[email protected]>


kbatuigas requested a review from a team as a code owner October 7, 2024 18:31

kbatuigas marked this pull request as draft October 7, 2024 18:32

kbatuigas marked this pull request as ready for review October 10, 2024 19:26

kbatuigas changed the title ~~TS topics as Iceberg tables~~ DOC-232 TS topics as Iceberg tables Oct 10, 2024

kbatuigas commented Oct 10, 2024

modules/manage/pages/topic-iceberg-integration.adoc

kbatuigas requested a review from lf-rep October 10, 2024 20:50

lf-rep approved these changes Oct 11, 2024

kbatuigas commented Oct 14, 2024

modules/manage/pages/topic-iceberg-integration.adoc Outdated

kbatuigas requested a review from mattschumpert October 15, 2024 03:42

asimms41 reviewed Oct 15, 2024

modules/manage/pages/topic-iceberg-integration.adoc Outdated

kbatuigas and others added 10 commits December 2, 2024 11:01

Apply suggestions from review

c478d1b

Flesh out catalog section more

5b7572a

Match nav with page title

2ad812b

Missed an edit per SME review

889ba9e

Apply suggestions from code review

f3be0bf

Co-authored-by: Tyler Rockwood <[email protected]>

Edits per review

a7db75e

Apply suggestions from code review

066db4c

Co-authored-by: Angela Simms <[email protected]>

Update modules/manage/pages/topic-iceberg-integration.adoc

4a777b5

Co-authored-by: Angela Simms <[email protected]>

Edits per PR review

4745b22

Add Iceberg to What's new

1710e31

kbatuigas force-pushed the 2428_ts-topics-iceberg branch from 755c619 to 1710e31 Compare December 2, 2024 16:13

Deflaimun reviewed Dec 2, 2024

modules/manage/pages/topic-iceberg-integration.adoc Outdated

Deflaimun approved these changes Dec 2, 2024

kbatuigas and others added 4 commits December 2, 2024 12:02

Edits per review

14b6ee2

Edits per review

866e439

Edits per review

776a3b4

Apply suggestions from code review

903785c

Co-authored-by: Paulo Borges <[email protected]>

kbatuigas merged commit 38b7b2d into v-WIP/24.3 Dec 2, 2024
7 checks passed

kbatuigas deleted the 2428_ts-topics-iceberg branch December 2, 2024 17:19

Deflaimun added a commit that referenced this pull request Dec 2, 2024

DOC-232 TS topics as Iceberg tables (#800)

be9743f

Co-authored-by: Angela Simms <[email protected]> Co-authored-by: Tyler Rockwood <[email protected]> Co-authored-by: Paulo Borges <[email protected]>


