Commit e807d09

authored
PCX Review
1 parent aa90ed0 commit e807d09

2 files changed: 11 additions, 7 deletions

src/content/docs/r2/data-catalog/about-compaction.mdx

Lines changed: 4 additions & 4 deletions
@@ -30,12 +30,12 @@ You can configure the target file size for compaction. Currently, the minimum is
 Different compute engines have different optimal file sizes, so check their documentation.

 Performance tradeoffs depend on your use case. For example, queries that return small amounts of data may perform better with smaller files, as larger files could result in reading unnecessary data.
-- For workloads that are more latency sensitive, consider a smaller target file size (for example, 64MB - 128MB)
-- For streaming ingest workloads, consider medium file sizes (for example, 128MB - 256MB)
-- For OLAP style queries that need to scan a lot of data, consider larger file sizes (for example, 256MB - 512MB)
+- For workloads that are more latency sensitive, consider a smaller target file size (for example, 64 MB - 128 MB)
+- For streaming ingest workloads, consider medium file sizes (for example, 128 MB - 256 MB)
+- For OLAP style queries that need to scan a lot of data, consider larger file sizes (for example, 256 MB - 512 MB)

 ## Current limitations
-- During open beta, compaction will compact up to 2GB worth of files once per hour for each table.
+- During open beta, compaction will compact up to 2 GB worth of files once per hour for each table.
 - Only data files stored in parquet format are currently supported with compaction.
 - Snapshot expiration and orphan file cleanup is not supported yet.
 - Minimum target file size is 64 MB and maximum is 512 MB.
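
The sizing guidance and limits in this hunk can be turned into a small sanity check before choosing a value. The sketch below is illustrative only: the workload labels, the function names, and the choice of the range midpoint as a starting value are assumptions, not part of the documentation or of any Cloudflare API. The 128 MB default is taken from `manage-catalogs.mdx` in this same commit. The backlog estimate is a best case, since the docs say "up to" 2 GB per hour and new small files may keep arriving.

```python
import math

# Limits quoted in the documentation touched by this commit.
MIN_TARGET_MB = 64
MAX_TARGET_MB = 512
DEFAULT_TARGET_MB = 128
BETA_LIMIT_GB_PER_HOUR = 2  # per table, during open beta

# Hypothetical workload labels mapped to the example ranges from the docs.
SUGGESTED_RANGES_MB = {
    "latency_sensitive": (64, 128),
    "streaming_ingest": (128, 256),
    "olap_scan": (256, 512),
}


def suggest_target_size_mb(workload: str) -> int:
    """Return a starting target size in MB for a workload label.

    Assumption: the midpoint of the documented example range is used as a
    starting point; unknown labels fall back to the documented default.
    """
    if workload not in SUGGESTED_RANGES_MB:
        return DEFAULT_TARGET_MB
    low, high = SUGGESTED_RANGES_MB[workload]
    return (low + high) // 2


def validate_target_size_mb(size_mb: int) -> int:
    """Raise ValueError if size_mb is outside the documented 64-512 MB range."""
    if not MIN_TARGET_MB <= size_mb <= MAX_TARGET_MB:
        raise ValueError(
            f"target size {size_mb} MB is outside "
            f"{MIN_TARGET_MB}-{MAX_TARGET_MB} MB"
        )
    return size_mb


def best_case_hours_to_compact(backlog_gb: float) -> int:
    """Best-case number of hourly runs needed to work through a backlog
    for one table, assuming the full 2 GB is compacted on every run."""
    if backlog_gb <= 0:
        return 0
    return math.ceil(backlog_gb / BETA_LIMIT_GB_PER_HOUR)
```

For example, a table with 10 GB of small files would need at least `best_case_hours_to_compact(10)` hourly runs, and `validate_target_size_mb(suggest_target_size_mb("olap_scan"))` yields a value that can be passed as the target size.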

src/content/docs/r2/data-catalog/manage-catalogs.mdx

Lines changed: 7 additions & 3 deletions
@@ -88,7 +88,7 @@ Compaction improves query performance by combining the many small files created
 <DashButton url="/?to=/:account/r2/overview" />
 2. Select the bucket you want to enable compaction on.
 3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, and click on the **Edit** icon next to the compaction card.
-4. Enable compaction and optionally set a target file size. The default is 128MB.
+4. Enable compaction and optionally set a target file size. The default is 128 MB.
 5. (Optional) Provide a Cloudflare API token for compaction to access and rewrite files in your bucket.
 6. Select **Save**.
 </Steps>
@@ -103,9 +103,13 @@ npx wrangler r2 bucket catalog compaction enable <BUCKET_NAME> --target-size 12
 ```
 </TabItem>
 </Tabs>
-Compaction requires a Cloudflare API token with **both** R2 storage and R2 Data Catalog read/write permissions to act as a service credential. The compaction process uses this token to read files, combine them, and update table metadata. Refer to [Authenticate your Iceberg engine](#authenticate-your-iceberg-engine) for details on creating a token with the required permissions.

-Once enabled, compaction applies retroactively to all existing tables and automatically to newly created tables. During open beta, we currently compact up to 2GB worth of files once per hour for each table.
+:::note[API token permission requirements]
+Compaction requires a Cloudflare API token with both R2 storage and R2 Data Catalog read/write permissions to act as a service credential. The compaction process uses this token to read files, combine them, and update table metadata.
+
+Refer to [Authenticate your Iceberg engine](#authenticate-your-iceberg-engine) for details on creating a token with the required permissions.
+
+Once enabled, compaction applies retroactively to all existing tables and automatically to newly created tables. During open beta, we currently compact up to 2 GB worth of files once per hour for each table.

 ## Disable compaction
 Disabling compaction will prevent the process from running for all tables managed by the catalog. You can re-enable it at any time.
