Conversation

@kbatuigas
Contributor

@kbatuigas kbatuigas commented Mar 25, 2025

Description

Resolves https://redpandadata.atlassian.net/browse/
Review deadline: 3 April

This pull request includes significant changes to the Iceberg documentation, with a focus on restructuring and updating content related to Iceberg table access, catalog integration, and query examples. The most important changes include the addition of new content and reorganization of existing content into partials for better modularity.

This PR reorganizes content so that it can be shared with Cloud docs. The ifdef::env-byoc and ifndef::env-byoc directives indicate whether content should display on a Cloud doc page. The AsciiDoc files for Cloud are added in https://github.com/redpanda-data/cloud-docs/pull/240/files and will contain the env-byoc page attribute, which the directives evaluate to display Cloud-specific content.

Documentation Updates:

  • Iceberg Table Access and Query Examples:

    • Consolidated content into a new partial file so it can be single-sourced for Cloud documentation.
  • Catalog Integration:

    • Moved catalog integration details to a new partial file so they can be single-sourced for Cloud documentation.
  • Branch Update:

    • Changed the branch for cloud-docs to DOC-805-Document-feature-Iceberg-Beta-on-Cloud in the local-antora-playbook.yml file.

Page previews

Self-Managed:
https://deploy-preview-1032--redpanda-docs-preview.netlify.app/25.1/manage/iceberg/topic-iceberg-integration
https://deploy-preview-1032--redpanda-docs-preview.netlify.app/25.1/manage/iceberg/use-iceberg-catalogs
https://deploy-preview-1032--redpanda-docs-preview.netlify.app/25.1/manage/iceberg/query-iceberg-topics

Cloud doc previews available in redpanda-data/cloud-docs#240

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)

@netlify

netlify bot commented Mar 25, 2025

Deploy Preview for redpanda-docs-preview ready!

Name Link
🔨 Latest commit b73da56
🔍 Latest deploy log https://app.netlify.com/sites/redpanda-docs-preview/deploys/67f065672409f90008bb004b
😎 Deploy Preview https://deploy-preview-1032--redpanda-docs-preview.netlify.app

@hyperlint-ai-deprecated
Contributor

hyperlint-ai-deprecated bot commented Mar 25, 2025

PR Change Summary

Introduced the Iceberg integration for Redpanda, enabling cloud storage of topic data in the Iceberg format for improved analytics.

  • Added comprehensive documentation for the Iceberg integration, including concepts, prerequisites, and limitations.
  • Provided detailed instructions for enabling Iceberg integration and configuring topics.
  • Included examples for querying Iceberg tables and managing schema support.

Added Files

  • modules/manage/partials/iceberg/about-iceberg-topics.adoc
  • modules/manage/partials/iceberg/query-iceberg-topics.adoc

How can I customize these reviews?

Check out the Hyperlint AI Reviewer docs for more information on how to customize the review.

If you just want to ignore it on this PR, you can add the hyperlint-ignore label to the PR. Future changes won't trigger a Hyperlint review.

Note specifically for link checks, we only check the first 30 links in a file and we cache the results for several hours (for instance, if you just added a page, you might experience this). Our recommendation is to add hyperlint-ignore to the PR to ignore the link check for this PR.

@kbatuigas kbatuigas requested a review from simon0191 March 25, 2025 16:13
@kbatuigas kbatuigas marked this pull request as ready for review April 1, 2025 14:01
@kbatuigas kbatuigas requested a review from a team as a code owner April 1, 2025 14:01
@kbatuigas kbatuigas force-pushed the 805-single-source-iceberg branch from c390cdd to 53a0809 Compare April 1, 2025 14:04
Member

@simon0191 simon0191 left a comment

LGTM


By default, Iceberg topics use the file-system based catalog (config_ref:iceberg_catalog_type,true,properties/cluster-properties[`iceberg_catalog_type`] cluster configuration set to `object_storage`). Redpanda stores the table metadata in https://iceberg.apache.org/javadoc/1.5.0/org/apache/iceberg/hadoop/HadoopCatalog.html[HadoopCatalog^] format in the same object storage bucket or container as the data files.

If using the `object_storage` catalog type, you provide the object storage URI of the table's metadata.json file to an Iceberg client so it can access the catalog and data files for your Redpanda Iceberg tables.
Contributor

Suggested change
- If using the `object_storage` catalog type, you provide the object storage URI of the table's metadata.json file to an Iceberg client so it can access the catalog and data files for your Redpanda Iceberg tables.
+ If using the `object_storage` catalog type, you provide the object storage URI of the table's `metadata.json` file to an Iceberg client so it can access the catalog and data files for your Redpanda Iceberg tables.

We should probably put a reminder note in here that this `metadata.json` file points only to a specific table snapshot. Due to the limitations of the object storage catalog specification in Apache Iceberg, tables must be updated any time a new snapshot is created using this catalog type (effectively, any time new data is written to the table). For more information, see the Apache Iceberg documentation.

Contributor Author

Updated


=== Specify metadata location

The config_ref:iceberg_catalog_base_location,true,properties/cluster-properties[`iceberg_catalog_base_location`] property stores the base path for the file-system based catalog if using the `object_storage` catalog type. The default value is `redpanda-iceberg-catalog`.
Contributor

Suggested change
- The config_ref:iceberg_catalog_base_location,true,properties/cluster-properties[`iceberg_catalog_base_location`] property stores the base path for the file-system based catalog if using the `object_storage` catalog type. The default value is `redpanda-iceberg-catalog`.
+ The config_ref:iceberg_catalog_base_location,true,properties/cluster-properties[`iceberg_catalog_base_location`] property stores the base path for the file system-based catalog if using the `object_storage` catalog type. The default value is `redpanda-iceberg-catalog`.

This part shouldn't be in the Cloud docs, because the property is read-only there and can't be edited.

Contributor Author

Updated

@micheleRP
Contributor

@kbatuigas the beta badge is still showing in the preview docs!

----
+
The `value_schema_id_prefix` requires that you produce to a topic using the Schema Registry wire format, which includes the magic byte and schema ID in the prefix of the message payload. This allows Redpanda to identify the correct schema version in the Schema Registry for a record. See the https://www.redpanda.com/blog/schema-registry-kafka-streaming#how-does-serialization-work-with-schema-registry-in-kafka[Understanding Apache Kafka Schema Registry^] blog post to learn more about the wire format.
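
For reviewers unfamiliar with the prefix described above, here is a minimal sketch of how a message framed this way could be built and parsed. It assumes the commonly documented Schema Registry framing of a single magic byte with value 0 followed by a 4-byte big-endian schema ID; the function names are hypothetical, the payload is treated as opaque bytes, and this is not a description of Redpanda's implementation.

```python
import struct
from typing import Tuple

# Assumption: magic byte value 0, as in the commonly documented framing.
MAGIC_BYTE = 0
# 1 byte for the magic byte + 4 bytes for the schema ID.
PREFIX_SIZE = 5


def encode_wire_format(schema_id: int, payload: bytes) -> bytes:
    """Prepend the magic byte and a big-endian 4-byte schema ID to the payload."""
    return struct.pack(">BI", MAGIC_BYTE, schema_id) + payload


def decode_wire_format(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, payload)."""
    if len(message) < PREFIX_SIZE:
        raise ValueError("message is too short to carry the 5-byte prefix")
    magic, schema_id = struct.unpack(">BI", message[:PREFIX_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[PREFIX_SIZE:]


if __name__ == "__main__":
    framed = encode_wire_format(42, b"serialized-record")
    print(decode_wire_format(framed))
```

A consumer that reads the schema ID this way can then look up the matching schema version before deserializing the remaining bytes. Messages produced without the prefix fail the length or magic byte check in this sketch.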

Contributor

The link goes to a subsection of the blog, but maybe the whole sentence should change. I searched the blog for "wire format" and "wire" and got nothing.

Contributor Author

Updated: the text still mentions the wire format but no longer links it explicitly to the referenced blog post.

@kbatuigas kbatuigas force-pushed the 805-single-source-iceberg branch from 18cbbb9 to 2c66f3d Compare April 4, 2025 22:05
@micheleRP
Contributor

@kbatuigas the beta badge still appears in these Self-Managed files

@micheleRP
Contributor

@kbatuigas the beta badge still appears in these Self-Managed files

I'll look for that in the next PR!

Contributor

@micheleRP micheleRP left a comment

lgtm

@kbatuigas kbatuigas merged commit 44b720c into beta Apr 4, 2025
5 checks passed
@kbatuigas kbatuigas deleted the 805-single-source-iceberg branch April 4, 2025 23:06
JakeSCahill pushed a commit that referenced this pull request Apr 6, 2025
JakeSCahill pushed a commit that referenced this pull request Apr 6, 2025

5 participants