Description
Specs from rudolfix
(written by zilto, adapted from: #2709 (comment))
Tasks
Create a new ducklake destination; it should inherit a lot of features from the duckdb destination.
Configuration
To allow configuring the catalog (a database) and the storage (a filesystem), DuckLakeCredentials should derive from the duckdb configuration and have the following signature:
@configspec(init=False)
class DuckLakeCredentials(DuckDbBaseCredentials):
    drivername: Final[str] = dataclasses.field(  # type: ignore
        default="ducklake", init=False, repr=False, compare=False
    )
    username: str
    password: TSecretStrValue = None
    database: str  # the name of the ducklake; required by DuckLakeSqlClient
    catalog: ConnectionStringCredentials  # for catalog; like postgres
    storage: FilesystemConfiguration  # for data

Users will be able to configure the catalog and the storage from their config.toml and secrets.toml.
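As a rough illustration of what such a configuration could look like, the sketch below flattens a nested settings dictionary into the double-underscore environment variable names that dlt accepts as an alternative to config.toml and secrets.toml entries. The section path destination.ducklake.credentials, the helper to_env_vars, and all values are assumptions for illustration only, since the destination does not exist yet.

```python
from typing import Any, Dict


def to_env_vars(settings: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Flatten nested settings into dlt-style env var names (SECTION__KEY)."""
    flat: Dict[str, str] = {}
    for key, value in settings.items():
        name = f"{prefix}__{key}".upper() if prefix else key.upper()
        if isinstance(value, dict):
            # Nested sections become further "__"-separated name parts.
            flat.update(to_env_vars(value, name))
        else:
            flat[name] = str(value)
    return flat


# Hypothetical layout: section names and values are placeholders,
# not a released API.
ducklake_settings = {
    "destination": {
        "ducklake": {
            "credentials": {
                "database": "my_lake",
                "catalog": "postgresql://loader:secret@localhost:5432/lake_catalog",
                "storage": {"bucket_url": "s3://my-bucket/lake"},
            }
        }
    }
}

env_vars = to_env_vars(ducklake_settings)
```

The same nesting would map onto TOML tables in config.toml and secrets.toml, with the catalog connection string and storage credentials belonging in secrets.toml.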
Resolve catalog and storage secrets
The duckdb connection needs credentials to the catalog (postgres example) and the storage (supported storage):
- for storage, the duckdb destination already has a feature to get secrets from FilesystemConfiguration (i.e., the mechanism that allows querying S3 with duckdb)
- for catalog, we need to implement the function to get secrets from ConnectionStringCredentials
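The catalog half could be sketched roughly as follows: a helper that turns postgres-style credentials into the `ducklake:postgres:...` target string that DuckLake's ATTACH statement expects. CatalogCredentials is a hypothetical stand-in for the fields of ConnectionStringCredentials, and catalog_attach_target is an illustrative name, not an existing dlt function. A real implementation might prefer a DuckDB secret over embedding the password in the string, and would need to quote values that contain spaces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogCredentials:
    """Hypothetical stand-in for the fields of ConnectionStringCredentials."""

    drivername: str
    database: str
    host: Optional[str] = None
    port: Optional[int] = None
    username: Optional[str] = None
    password: Optional[str] = None


def catalog_attach_target(creds: CatalogCredentials) -> str:
    """Build the 'ducklake:...' target string for a postgres-backed catalog."""
    if creds.drivername not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported catalog driver: {creds.drivername}")
    parts = {
        "dbname": creds.database,
        "host": creds.host,
        "port": creds.port,
        "user": creds.username,
        "password": creds.password,
    }
    # Keep only the values that were provided, as libpq-style key=value pairs.
    conn = " ".join(f"{k}={v}" for k, v in parts.items() if v is not None)
    return f"ducklake:postgres:{conn}"


creds = CatalogCredentials(
    "postgresql", "lake_catalog", host="localhost", username="loader"
)
target = catalog_attach_target(creds)
```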
Configure DuckDB instance to support DuckLake
The class DuckDbBaseCredentials allows setting extensions, pragmas, global config, and local config. This makes it possible to load the ducklake extension, but not to install it.
- DuckLakeCredentials should inherit from DuckDbBaseCredentials, but enforce the ducklake extension to be installed
- Set the current database to the ducklake name (details here).
- A lot of filesystems are supported as duckdb extensions. httpfs supports all S3-compliant APIs. Also, it supports python fsspec. We can progressively add support here
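The steps above can be sketched as an ordered list of SQL statements: install and load the extension, attach the lake, then make it the current database. The helper name ducklake_setup_statements and the example values are assumptions; the lake name is assumed to be a valid unquoted SQL identifier.

```python
from typing import List


def ducklake_setup_statements(
    name: str, attach_target: str, data_path: str
) -> List[str]:
    """Return the SQL needed to install, load, attach and select a DuckLake."""

    def quote(value: str) -> str:
        # Escape single quotes for use inside a SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return [
        "INSTALL ducklake",
        "LOAD ducklake",
        f"ATTACH {quote(attach_target)} AS {name} (DATA_PATH {quote(data_path)})",
        f"USE {name}",
    ]


statements = ducklake_setup_statements(
    "my_lake", "ducklake:metadata.ducklake", "data_files/"
)
# With the duckdb package installed, these could be executed in order
# on a connection, e.g. conn.execute(statement) for each statement.
```

Returning the statements instead of executing them keeps the sketch independent of a live connection; in the destination itself this logic would presumably run when the connection is opened.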
Out-of-scope
- ducklake table maintenance; this should be done by the user directly against their ducklake instance
Future work
- Add ducklake to TTableFormat
- DuckLakeClient should implement SupportsOpenTables to allow users to get an authenticated catalog and table relation from the pipeline/destination, i.e. to do table maintenance. This is how delta and iceberg work.
Original issue
Feature description
we should support ducklake
A user reported that it currently doesn't work with our duckdb destination, and we likely need to make some adjustments.