| data_cache_storage | The type of storage used for table data cache. Available options: "none" (disables table data cache), "disk" (enables disk cache). Defaults to "none". |
| iceberg_table_meta_count | Controls the number of Iceberg table metadata entries to cache. Set to `0` to disable metadata caching. |
docs/en/guides/51-access-data-lake/02-iceberg.md (221 additions, 0 deletions)
@@ -7,6 +7,202 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
Databend supports integrating an [Apache Iceberg](https://iceberg.apache.org/) catalog, which brings Iceberg's metadata and storage management capabilities into the platform.

## Quick Start with Apache Iceberg

If you want to quickly try out Apache Iceberg and experiment with table operations locally, a [Docker-based starter project](https://github.com/databendlabs/iceberg-quick-start) is available. This setup allows you to:

- Run Spark with Iceberg support
- Use a REST catalog (Iceberg REST Fixture)
- Simulate an S3-compatible object store using MinIO
- Load sample TPC-H data into Iceberg tables for query testing

### Prerequisites

Before you start, make sure Docker and Docker Compose are installed on your system.
WARN[0000] /Users/eric/iceberg-quick-start/docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion
[+] Running 5/5
 ✔ Network iceberg-quick-start_iceberg_net Created 0.0s
 ✔ Container iceberg-rest-test Started 0.4s
 ✔ Container minio Started 0.4s
 ✔ Container mc Started 0.6s
 ✔ Container spark-iceberg S... 0.7s
```

### Load TPC-H Data via Spark Shell

Run the following command to generate and load sample TPC-H data into the Iceberg tables:
This table maps data types between Apache Iceberg and Databend. Please note that Databend does not currently support Iceberg data types that are not listed in the table.
@@ -115,6 +311,31 @@ Switches the current session to the specified catalog.
USE CATALOG <catalog_name>
```

## Caching Iceberg Catalog

Databend offers a Catalog Metadata Cache specifically designed for Iceberg catalogs. The first time a query runs against an Iceberg table, the table's metadata is cached in memory. By default, a cached entry remains valid for 10 minutes, after which it is refreshed asynchronously. This speeds up queries on Iceberg tables by avoiding repeated metadata retrieval.
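
The expiry behavior described above can be modeled with a small Python sketch. This is a conceptual illustration only, not Databend's implementation: the class `TtlMetadataCache`, the `loader` callback, and the injectable `clock` are hypothetical names, and for simplicity an expired entry is reloaded synchronously on the next read, whereas Databend refreshes asynchronously.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlMetadataCache:
    """Toy model of a metadata cache whose entries expire after a TTL."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # hypothetical: fetches metadata from the catalog
        self._ttl = ttl_seconds    # 600 s mirrors the 10-minute default in the text
        self._clock = clock        # injectable so the sketch can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, table: str) -> Any:
        now = self._clock()
        cached = self._entries.get(table)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]       # still valid: no catalog round trip
        value = self._loader(table)  # first access or expired entry: reload
        self._entries[table] = (now, value)
        return value

    def refresh(self, table: str) -> None:
        """Drop one entry so the next read reloads it (akin to REFRESH CACHE)."""
        self._entries.pop(table, None)
```

Within the TTL window, repeated `get` calls for the same table hit the in-memory entry, which is where the query speedup comes from.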

If you need fresh metadata, you can manually refresh the cache using the following commands:

```sql
USE CATALOG iceberg;
ALTER DATABASE tpch REFRESH CACHE; -- Refresh metadata cache for the tpch database
ALTER TABLE tpch.lineitem REFRESH CACHE; -- Refresh metadata cache for the lineitem table
```
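
If you need to refresh several tables at once, the statements can be generated rather than typed by hand. The following is a minimal Python sketch; the helper `refresh_statements` is a hypothetical name, and `orders` is used only as an example of another TPC-H table. Identifiers are inserted verbatim without quoting, so this assumes trusted, simple names.

```python
from typing import Iterable, List


def refresh_statements(catalog: str, database: str, tables: Iterable[str]) -> List[str]:
    """Build REFRESH CACHE statements for selected tables in one database."""
    # Switch to the Iceberg catalog first, as in the example above.
    statements = [f"USE CATALOG {catalog};"]
    for table in tables:
        # Names are not quoted or escaped: assumes trusted identifiers.
        statements.append(f"ALTER TABLE {database}.{table} REFRESH CACHE;")
    return statements


for statement in refresh_statements("iceberg", "tpch", ["lineitem", "orders"]):
    print(statement)
```

The generated statements can then be run through whichever SQL client you already use with Databend.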

If you prefer not to use the metadata cache, you can disable it entirely by setting `iceberg_table_meta_count` to `0` in the [databend-query.toml](https://github.com/databendlabs/databend/blob/main/scripts/distribution/configs/databend-query.toml) configuration file:

```toml
...
# Cache config.
[cache]
...
iceberg_table_meta_count = 0
...
```
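
To see why an entry count of `0` turns caching off, consider this Python sketch of a count-bounded cache. It is an illustration under stated assumptions, not Databend's code: `BoundedMetadataCache` and `loader` are hypothetical names, and the least-recently-used eviction policy is assumed here because the documentation does not say how entries are evicted.

```python
from collections import OrderedDict
from typing import Any, Callable


class BoundedMetadataCache:
    """Toy cache that keeps at most `max_entries` items; 0 means no caching."""

    def __init__(self, loader: Callable[[str], Any], max_entries: int) -> None:
        self._loader = loader            # hypothetical metadata fetcher
        self._max_entries = max_entries  # plays the role of iceberg_table_meta_count
        self._entries: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, table: str) -> Any:
        if self._max_entries == 0:
            return self._loader(table)   # caching disabled: always reload
        if table in self._entries:
            self._entries.move_to_end(table)  # mark as most recently used
            return self._entries[table]
        value = self._loader(table)
        self._entries[table] = value
        if len(self._entries) > self._max_entries:
            # Evict the least recently used entry (assumed policy).
            self._entries.popitem(last=False)
        return value
```

With `max_entries` set to `0`, every `get` goes straight to the loader, which mirrors how the setting above disables metadata caching.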

In addition to metadata caching, Databend also supports table data caching for Iceberg catalog tables, similar to Fuse tables. For more information on data caching, refer to the `[cache] Section` in the [Query Configurations](../10-deploy/04-references/02-node-config/02-query-config.md) reference.

## Iceberg Table Functions

Databend provides the following table functions for querying Iceberg metadata, allowing users to inspect snapshots and manifests efficiently: