Commit 6762c43

authored
Ducklake docs (#1052)
1 parent 233c18e commit 6762c43

9 files changed

+94
-0
lines changed

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
1+
---
2+
slug: ducklake
3+
version: v1.516.0
4+
title: Ducklake
5+
tags: ['assets', 'storage']
6+
description: First-class support for Ducklake
7+
features:
8+
- Configure ducklakes in your workspace settings with a SQL catalog database and S3 storage
9+
- Use the windmill database as a catalog
10+
- Ducklakes are assets and can be explored like any database
11+
- Use a ducklake in DuckDB with the `ATTACH 'ducklake://name' AS dl` syntax
12+
- "`ATTACH 'ducklake' AS dl` shorthand for the main ducklake"
13+
docs: /docs/core_concepts/ducklake
14+
---
Four image files (165 KB, 115 KB, 119 KB, 127 KB); previews not shown.
Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
# Ducklake
2+
3+
Ducklake lets you store massive amounts of data in S3 while still querying it efficiently with plain SQL through DuckDB.
4+
5+
<video
6+
className="border-2 rounded-lg object-cover w-full h-full dark:border-gray-800"
7+
autoPlay
8+
controls
9+
id="main-video"
10+
src="/videos/ducklake_demo.mp4"
11+
/>
12+
<br />
13+
14+
[Learn more about Ducklake](https://ducklake.select/)
15+
16+
## Getting started
17+
18+
Prerequisites:
19+
20+
- A workspace storage configured
21+
- A Postgres or MySQL resource (optional for superusers)
22+
23+
Superusers can use the Windmill database as a catalog for Ducklake with no additional configuration.
24+
25+
Go to the workspace settings and configure a Ducklake:
26+
27+
![Ducklake settings](./ducklake_settings.png 'Ducklake settings')
28+
29+
Clicking the "Explore" button opens the database manager. Your ducklake behaves like any other database: you can perform all CRUD operations through the UI or with the SQL REPL. You can also create and delete tables.
30+
31+
![Explore ducklake](./ducklake_db_manager.png 'Explore ducklake')
32+
33+
If you explore your catalog database, you will see that Ducklake created some tables for you:
34+
35+
![Catalog database](./ducklake_catalog_db.png 'Catalog database')
36+
37+
These metadata tables store information about your data and where it is located in S3.
38+
In your workspace storage settings, you can browse the selected workspace storage at the configured location and see your tables and their contents:
39+
40+
![S3 content](./ducklake_s3_content.png 'S3 content')
41+
42+
## Using Ducklake in DuckDB scripts
43+
44+
Ducklakes can be accessed in DuckDB scripts using the `ATTACH` syntax. You can use the Ducklake button in the editor bar for convenience.
45+
46+
In the example below, we pass a list of messages with positive, neutral or negative sentiment.
47+
This list might come from a Python script which queries new reviews from the Google My Business API,
48+
and sends them to an LLM to determine their sentiment.
49+
The messages are then inserted into a Ducklake table, which effectively creates a new Parquet file.
50+
51+
```sql
52+
-- $messages (json[])
53+
54+
ATTACH 'ducklake://main' AS dl;
55+
USE dl;
56+
57+
CREATE TABLE IF NOT EXISTS messages (
58+
content STRING NOT NULL,
59+
author STRING NOT NULL,
60+
date STRING NOT NULL,
61+
sentiment STRING
62+
);
63+
64+
CREATE TEMP TABLE new_messages AS
65+
SELECT
66+
value->>'content' AS content,
67+
value->>'author' AS author,
68+
value->>'date' AS date,
69+
value->>'sentiment' AS sentiment
70+
FROM json_each($messages);
71+
72+
INSERT INTO messages
73+
SELECT * FROM new_messages;
74+
```
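
Once rows have been inserted, the same `ATTACH` pattern can be used to read them back. The query below is a minimal sketch, assuming the `messages` table from the example above already exists in the ducklake named `main`. The aggregation and the `message_count` alias are illustrative only and are not part of the original example.

```sql
-- Attach the ducklake named "main", using the same syntax as the script above.
-- Per the changelog entry, ATTACH 'ducklake' AS dl is a shorthand for the main ducklake.
ATTACH 'ducklake://main' AS dl;
USE dl;

-- Count stored messages per sentiment.
-- Assumes the messages table created by the previous script.
SELECT
  sentiment,
  COUNT(*) AS message_count
FROM messages
GROUP BY sentiment
ORDER BY message_count DESC;
```

Because `sentiment` is nullable in the table definition, messages without a sentiment would appear as a separate `NULL` group in the result.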

docs/core_concepts/index.mdx

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -222,6 +222,11 @@ On top of its editors to build endpoints, flows and apps, Windmill comes with a
222222
description="Visualize your data flow and automatically track where your assets are used"
223223
href="/docs/core_concepts/assets"
224224
/>
225+
<DocCard
226+
title="Ducklake"
227+
description="Store massive amounts of data in Parquet files on S3 while querying it with plain SQL through DuckDB"
228+
href="/docs/core_concepts/ducklake"
229+
/>
225230
</div>
226231

227232
## Hosting & advanced

sidebars.js

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -382,6 +382,7 @@ const sidebars = {
382382
'core_concepts/ai_generation/index',
383383
'core_concepts/workspace_settings/index',
384384
'core_concepts/assets/index',
385+
'core_concepts/ducklake/index',
385386
{
386387
type: 'category',
387388
label: 'Integrations',

static/videos/ducklake_demo.mp4

1.22 MB
Binary file not shown.
