feat: Add iceberg connector + Tests + Docs #8902

Open
royendo wants to merge 19 commits into main from add-iceberg-connector

@royendo royendo commented Feb 21, 2026

Adds Apache Iceberg as a new connector, leveraging DuckDB's native Iceberg support. Tested with local filesystem and GCS catalogs.

Backend:

  • Register `iceberg` as a DuckDB source connector in `runtime/drivers/duckdb/duckdb.go`
  • Add Iceberg connector schema to `runtime/parser/schema/project.schema.yaml` with catalog type, storage type, and credential properties

Frontend:

  • Multi-step form schema (`iceberg.ts`) for configuring Iceberg connections (catalog URI, storage type, credentials)
  • Apache Iceberg logo SVG icons and connector icon mapping
  • Dynamic field rendering in `JSONSchemaFormRenderer` and `GroupedFieldsRenderer` for grouped/conditional form fields
  • GCS and Azure Blob Storage icon components

Documentation:

  • User guide at `docs/developers/build/connectors/data-source/iceberg.md`
  • Reference docs for Iceberg connector YAML properties

Integration tests:

  • Resolver test (`connector_iceberg.yaml`) validating Iceberg → DuckDB materialization with TPC-H lineitem data (51,793 rows, 16 columns)
[Screenshots: 2026-02-20 at 20:59:40, 20:59:43, and 21:00:00]

Checklist:

  • Covered by tests
  • Ran it and it works as intended
  • Reviewed the diff before requesting a review
  • Checked for unhandled edge cases
  • Linked the issues it closes
  • Checked if the docs need to be updated. If so, create a separate Linear DOCS issue
  • Intend to cherry-pick into the release branch
  • I'm proud of this work!

@royendo royendo requested a review from mindspank February 21, 2026 02:00

royendo commented Feb 21, 2026

Make changes:

Tab 1: Direct Table Path
• Storage dropdown
• Storage connector
• Table path input
• Done

Tab 2: Managed Table (Catalog)
• Catalog dropdown (REST, PG, MySQL)
• Catalog connector
• Storage dropdown
• Storage connector
• Table identifier input (db.table)
• Done

Results look correct:

• 51,793 rows: matches TPC-H lineitem SF0.01 with the l_extendedprice < 10000 rows deleted (per the README, full SF0.01 has ~60K rows)
• 16 columns: standard TPC-H lineitem schema (l_orderkey, l_partkey, ... l_comment)
• Types are reasonable: INTEGER, DECIMAL(15,2), VARCHAR, DATE
• Both models (with and without secrets) return the same 51,793 count
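
A check like the one described above could be kept as a small model so the numbers are reproducible. This is only a sketch: the model name `iceberg_lineitem` is hypothetical and stands in for whichever materialized Iceberg model is being verified. If the results above hold, it should report 51,793 rows, 16 columns, and no remaining rows with l_extendedprice < 10000.

```yaml
# Hypothetical verification model; `iceberg_lineitem` is a placeholder name
# for the materialized Iceberg model, not a file from this PR.
type: model
connector: duckdb
sql: |
  SELECT
    -- total rows in the materialized table
    (SELECT count(*) FROM iceberg_lineitem) AS row_count,
    -- rows that the README says were deleted from the source table
    (SELECT count(*) FILTER (WHERE l_extendedprice < 10000)
       FROM iceberg_lineitem) AS rows_below_10000,
    -- number of columns DuckDB sees for the table
    (SELECT count(*)
       FROM information_schema.columns
      WHERE table_name = 'iceberg_lineitem') AS column_count
```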
@royendo royendo changed the title feat: Add iceberg connector feat: Add iceberg connector + Tests Feb 23, 2026
@royendo royendo changed the title feat: Add iceberg connector + Tests feat: Add iceberg connector + Tests + Docs Feb 24, 2026
@royendo royendo requested review from ericpgreen2 and removed request for mindspank February 27, 2026 15:59
@ericpgreen2 ericpgreen2 requested review from AdityaHegde and removed request for ericpgreen2 March 9, 2026 09:58
royendo and others added 3 commits March 10, 2026 09:59
Resolve conflicts in JSONSchemaFormRenderer and GroupedFieldsRenderer.
Update ConnectionTypeSelector to use v2 runtime client API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@royendo royendo requested a review from AdityaHegde March 10, 2026 15:37
driver:
  type: string
  description: Must be `duckdb`. Iceberg tables are read through DuckDB's native Iceberg extension.
  const: duckdb

@NamanMahor NamanMahor Mar 10, 2026

This might create a problem in the JSON schema, because we now have multiple `const: duckdb` entries in this oneOf (line 104). There should be only one `const: duckdb` branch in a oneOf.
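
To illustrate the concern: `oneOf` requires an instance to match exactly one branch, so two branches that both pin `driver` to `duckdb` can both match the same connector file and fail validation. The fragment below is a minimal sketch, not the actual `project.schema.yaml`; the `before`/`after` keys and the `catalog_type` and `storage_type` property names are hypothetical.

```yaml
# Hypothetical fragment for illustration only.
before:
  oneOf:
    # A file containing just `driver: duckdb` satisfies BOTH branches,
    # so oneOf (exactly one match) rejects it.
    - title: DuckDB
      properties:
        driver:
          const: duckdb
    - title: Iceberg
      properties:
        driver:
          const: duckdb
after:
  oneOf:
    # One possible fix: a single duckdb branch that also carries the
    # Iceberg-specific properties as optional fields.
    - title: DuckDB (including Iceberg reads)
      properties:
        driver:
          const: duckdb
        catalog_type:    # hypothetical property name
          type: string
        storage_type:    # hypothetical property name
          type: string
```

Another option would be to keep two branches and make them mutually exclusive through `required` properties, at the cost of a more complex schema.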

- # Example: Iceberg model reading from GCS
type: model
connector: duckdb
create_secrets_from_connectors: gcs

Rill supports Google JSON credentials, but DuckDB only supports HMAC keys, so I think iceberg_scan only supports HMAC too. It may be better to mention that here.
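
For reference, an HMAC-based GCS connector might look roughly like the sketch below. The file path, the `key_id` and `secret` property names, and the environment variable names are assumptions here and should be checked against the GCS connector reference before documenting them.

```yaml
# Hypothetical connectors/gcs.yaml; property and variable names are assumptions.
type: connector
driver: gcs
# HMAC key pair rather than a Google JSON service-account credential,
# since DuckDB's GCS access is HMAC-based.
key_id: "{{ .env.connector.gcs.key_id }}"
secret: "{{ .env.connector.gcs.secret }}"
```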

materialize: true
sql: |
  SELECT *
  FROM iceberg_scan('gs://rilldata-public/iceberg/lineitem_iceberg',

We are using the public bucket here. This test should use a private bucket so that it actually exercises the secrets. See connector_gcs for a private-bucket example.
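
A private-bucket variant of the model could be sketched as follows. The bucket name and table path are placeholders, not real locations, and the model keys simply mirror the ones already used in this test.

```yaml
# Hypothetical model; `my-private-bucket` and the table path are placeholders.
type: model
connector: duckdb
# Without valid credentials from the gcs connector, this read should fail,
# which is what makes the test meaningful for secrets.
create_secrets_from_connectors: gcs
materialize: true
sql: |
  SELECT *
  FROM iceberg_scan('gs://my-private-bucket/iceberg/lineitem_iceberg')
```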
