Commit 906be44

docs: Spill Data Storage (#1632)
* Spill Data Storage * updates
1 parent 1606ad2 commit 906be44

2 files changed

+68
-0
lines changed

docs/en/guides/10-deploy/04-references/02-node-config/02-query-config.md

Lines changed: 20 additions & 0 deletions
@@ -256,3 +256,23 @@ The following is a list of the parameters available within the [cache.disk] sect
| --------- | -------------------------------------------------------------------------------------------------------- |
| path | The path where the cache is stored when using disk cache. |
| max_bytes | The maximum amount of cached data in bytes when using disk cache. Defaults to 21474836480 bytes (20 GB). |

## [spill] Section

The following is a list of the parameters available within the [spill] section:

| Parameter | Description |
|--------------------------------------------|---------------------------------------------------------------------------------------------------------------|
| spill_local_disk_path | Specifies the directory path where spilled data will be stored on the local disk. |
| spill_local_disk_reserved_space_percentage | Defines the percentage of disk space that will be reserved and not used for spill. The default value is `30`. |
| spill_local_disk_max_bytes | Sets the maximum number of bytes allowed for spilling data to the local disk. Defaults to unlimited. |

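As a minimal sketch of how these defaults interact, the fragment below sets only the spill directory and leaves the other two parameters out, so their documented defaults should apply. The path is a hypothetical placeholder chosen for illustration, not a value shipped with Databend.

```toml
[spill]
# Hypothetical directory on the query node; replace with a path that exists on your host.
spill_local_disk_path = "/var/lib/databend/spill"

# spill_local_disk_reserved_space_percentage is omitted: per the table above, it defaults to 30.
# spill_local_disk_max_bytes is omitted: per the table above, it defaults to unlimited.
```

Setting only the path keeps the configuration short. Add the other two parameters explicitly when the spill directory shares a disk with other data and you need a tighter bound.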
### [spill.storage] Section

The following is a list of the parameters available within the [spill.storage] section:

| Parameter | Description |
|-----------|--------------------------------------------------------------------|
| type | Specifies the storage type for remote spilling, for example, `s3`. |

To configure the selected storage type, use the parameters described in the [storage Section](#storage-section). For examples, see [Configuring Spill Storage](/guides/data-management/data-recycle#configuring-spill-storage).

docs/en/guides/57-data-management/04-data-recycle.md

Lines changed: 48 additions & 0 deletions
@@ -12,6 +12,54 @@ There are two types of data:

If the data size is significant, you can run several commands ([Enterprise Edition Features](/guides/products/dee/enterprise-features)) to delete these data and free up storage space.

## Spill Data Storage

Self-hosted Databend supports spilling intermediate query results out of memory when memory usage exceeds the available limit. Users can configure where spill data is stored, choosing between local disk storage and a remote S3-compatible bucket.

### Spill Storage Options

Databend provides the following spill storage configurations:

- Local Disk Storage: Spilled data is written to a specified local directory on the query node. Note that local disk storage is supported only for [Window Functions](/sql/sql-functions/window-functions/).
- Remote S3-Compatible Storage: Spilled data is stored in an external bucket.
- Default Storage: If no spill storage is configured, Databend spills data to the default storage bucket along with your table data.

### Spill Priority

Databend chooses the spill destination in the following order:

1. Local disk, if configured.
2. Remote S3-compatible storage, if configured, when local disk space is insufficient.
3. Databend’s default storage bucket, if neither local nor remote S3-compatible spill storage is configured.

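The priority order can be illustrated with a configuration that sets up both destinations at once. This is a sketch that assumes the local and remote fragments documented in this guide can be combined under a single `[spill]` table. The values are the same sample values used in the examples in this guide and are not recommendations.

```toml
[spill]
# Tried first: a local directory on the query node.
spill_local_disk_path = "/data1/databend/databend_spill"
spill_local_disk_reserved_space_percentage = 40
spill_local_disk_max_bytes = 1099511627776

# Used when local disk space is insufficient: a remote S3-compatible bucket.
[spill.storage]
type = "s3"

[spill.storage.s3]
bucket = "databend"
root = "admin"
endpoint_url = "http://127.0.0.1:9900"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
allow_insecure = true
```

If the whole `[spill]` block is left out, the third rule applies and spill data goes to the default storage bucket.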
### Configuring Spill Storage

To configure spill storage, update the [databend-query.toml](https://github.com/databendlabs/databend/blob/main/scripts/distribution/configs/databend-query.toml) configuration file.

This example lets Databend use up to 1 TiB (1,099,511,627,776 bytes) of local disk space for spill operations, while keeping 40% of the disk reserved and unavailable for spill:

```toml
[spill]
spill_local_disk_path = "/data1/databend/databend_spill"
spill_local_disk_reserved_space_percentage = 40
spill_local_disk_max_bytes = 1099511627776
```

This example sets Databend to use MinIO as an S3-compatible storage service for spill operations:

```toml
[spill]
[spill.storage]
type = "s3"
[spill.storage.s3]
bucket = "databend"
root = "admin"
endpoint_url = "http://127.0.0.1:9900"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
allow_insecure = true
```

## Purge Drop Table Data

Deletes data files of all dropped tables, freeing up storage space.
