Running multiple Iceberg catalogs act as single catalog #25420
-
Environment:
My containerized test deployment consists of 1 Trino coordinator and 1 worker with the Iceberg connector, 1 MinIO service, 1 MySQL instance, and 1 Hive Metastore Service (HMS). I prepared and attached files that can be used to reproduce the problem.
Context:
CREATE SCHEMA iceberg1.testschema
WITH (
"location" = 's3a://testbucketone/testschema'
);
CREATE TABLE iceberg1.testschema.testtbl (
c1,
c2
)
WITH (
format = 'PARQUET'
)
AS VALUES
(2021, 10000),
(2022, 20000);
select * from iceberg1.testschema.testtbl;
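To check for the symptom from the title, something like the following could be run against a second Iceberg catalog. This is only a sketch: `iceberg2` is a hypothetical name for that second catalog (it is not defined above), and it is assumed to point at the same HMS as `iceberg1`. If the two catalogs really behave as one, the schema and table created through `iceberg1` would be expected to show up here as well.

```sql
-- iceberg2 is a hypothetical second Iceberg catalog backed by the same HMS.
-- If the catalogs are not isolated, testschema (created via iceberg1) is listed here too.
SHOW SCHEMAS FROM iceberg2;

-- Likewise, testtbl would appear in the listing even though it was never created via iceberg2.
SHOW TABLES FROM iceberg2.testschema;

-- And the rows inserted through iceberg1 would be readable through iceberg2.
SELECT * FROM iceberg2.testschema.testtbl;
```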
If you need more info, let me know. Thanks in advance.
Replies: 3 comments 4 replies
-
Experiencing the same issue after switching to Trino v474.
-
OK. I found an explicit statement about this in, for example, the MongoDB connector docs (as well as in the MySQL, Kafka, etc. docs). Is it possible to have this feature for the Iceberg connector?
-
During the investigation, I came up with the deployment architectures depicted below. Assume we have 2 or more products (or any other required unit of separation). Depending on whether total data separation is mandatory and whether the data must be accessible in a SQL query run against Trino, we have:
With some modifications, you might find it useful, so I am sharing it. At the moment, for various reasons, options 3 & 4 are the way to go for me. It'd be nice if Trino with Iceberg could support option 1 (a single HMS with multiple Iceberg catalogs), maybe with some additional restrictions (using different buckets or different schema/table names).
You can use a single HMS instance for different Iceberg catalogs in Trino. The following two steps must be taken (as far as I know):
1. Create the catalog entries in the HMS backing database (Stack Overflow reference)
2. Set the property hive.metastore.thrift.catalog-name in the properties file of the connector in Trino
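The two steps above could look roughly like the sketch below. Everything in it apart from `hive.metastore.thrift.catalog-name`, the `iceberg1` catalog, and the `testbucketone` bucket is an assumption for illustration: the HMS catalog names `iceberg1_cat` and `iceberg2_cat`, the `iceberg2` catalog, the `testbuckettwo` bucket, the `thrift://hms:9083` URI, and the free `CTLG_ID` values are all hypothetical. The first block assumes a Hive 3 style metastore schema in MySQL, where catalogs live in a `CTLGS` table; check the actual schema of your HMS version before running anything like it.

```sql
-- Step 1 (run against the MySQL database backing the HMS, not against Trino).
-- Assumes the Hive 3 style CTLGS table and that IDs 2 and 3 are unused
-- (ID 1 is normally the default 'hive' catalog).
INSERT INTO `CTLGS` (`CTLG_ID`, `NAME`, `DESC`, `LOCATION_URI`)
VALUES
  (2, 'iceberg1_cat', 'HMS catalog for Trino catalog iceberg1', 's3a://testbucketone/'),
  (3, 'iceberg2_cat', 'HMS catalog for Trino catalog iceberg2', 's3a://testbuckettwo/');
```

For step 2, the same key/value pairs would go into each catalog's properties file. With dynamic catalog management enabled (`catalog.management=dynamic`), they can also be written as Trino SQL, which is the form shown here; storage settings for MinIO are omitted.

```sql
-- Step 2 (run in Trino): each Trino catalog points at the same HMS
-- but names a different HMS-level catalog.
CREATE CATALOG iceberg1 USING iceberg
WITH (
  "iceberg.catalog.type" = 'hive_metastore',
  "hive.metastore.uri" = 'thrift://hms:9083',
  "hive.metastore.thrift.catalog-name" = 'iceberg1_cat'
);

CREATE CATALOG iceberg2 USING iceberg
WITH (
  "iceberg.catalog.type" = 'hive_metastore',
  "hive.metastore.uri" = 'thrift://hms:9083',
  "hive.metastore.thrift.catalog-name" = 'iceberg2_cat'
);
```

If the separation works as described, a schema created through `iceberg1` should no longer be listed by `SHOW SCHEMAS FROM iceberg2`.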