@@ -0,0 +1,23 @@
---
title: R2 Data Catalog table-level compaction
description: Control compaction settings for individual Iceberg tables
products:
- r2
date: 2025-10-06
hidden: false
---

You can now enable compaction for individual [Apache Iceberg](https://iceberg.apache.org/) tables in [R2 Data Catalog](/r2/data-catalog/), giving you fine-grained control over different workloads.

```bash
# Enable compaction for a specific table (no token required)
npx wrangler r2 bucket catalog compaction enable <BUCKET> <NAMESPACE> <TABLE> --target-size 256
```

This allows you to:

- Apply different target file sizes per table
- Disable compaction for specific tables
- Optimize based on table-specific access patterns

Learn more at [Manage catalogs](/r2/data-catalog/manage-catalogs/).
6 changes: 6 additions & 0 deletions src/content/docs/r2/data-catalog/about-compaction.mdx
@@ -21,6 +21,12 @@ Every write operation in [Apache Iceberg](https://iceberg.apache.org/), no matte

R2 Data Catalog can now [manage compaction](/r2/data-catalog/manage-catalogs) for Apache Iceberg tables stored in R2. When enabled, compaction runs automatically and combines new files that have not been compacted yet.

You can enable compaction at two levels:
- **Catalog-level**: Applies to all tables in your R2 bucket
- **Table-level**: Fine-grained control for specific tables

Table-level settings allow you to customize compaction behavior for different workloads within the same bucket.
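
As a rough sketch of per-table tuning, the script below prints one table-level enable command per table, each with its own target size. The bucket, namespace, and table names are hypothetical placeholders, and `plan_table_compaction` is an illustrative helper, not part of Wrangler. It only prints the commands so that you can review them before running anything.

```bash
# Hypothetical placeholders: replace with your own bucket and namespace.
BUCKET="my-bucket"
NAMESPACE="analytics"

# Print one table-level enable command per "table:target-size-in-MB" pair.
# The function body runs in a subshell, so its variables do not leak.
plan_table_compaction() (
  for pair in "$@"; do
    table="${pair%%:*}"   # text before the first colon
    size="${pair##*:}"    # text after the last colon
    echo "npx wrangler r2 bucket catalog compaction enable $BUCKET $NAMESPACE $table --target-size $size"
  done
)

# Example: a small target size for one table and a large one for another.
plan_table_compaction "events:64" "daily_rollups:512"
```

Run the printed commands individually once the plan looks right.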

Compacted files are prefixed with `compacted-` in the `/data/` directory.

### Choosing the right target file size
34 changes: 28 additions & 6 deletions src/content/docs/r2/data-catalog/manage-catalogs.mdx
@@ -49,10 +49,10 @@ npx wrangler r2 bucket catalog enable <BUCKET_NAME>
```

After enabling, Wrangler will return your catalog URI and warehouse name.

</TabItem>
</Tabs>


## Disable R2 Data Catalog on a bucket

When you disable the catalog on a bucket, it immediately stops serving requests from the catalog interface. Any Iceberg table references stored in that catalog become inaccessible until you re-enable it.
@@ -74,11 +74,14 @@ To disable the catalog on your bucket, run the [`r2 bucket catalog disable comma
```bash
npx wrangler r2 bucket catalog disable <BUCKET_NAME>
```

</TabItem>
</Tabs>

## Enable compaction

Compaction improves query performance by combining the many small files created during data ingestion into fewer, larger files according to the set `target file size`. For more information about compaction and why it's valuable, refer to [About compaction](/r2/data-catalog/about-compaction/).

<Tabs syncKey='CLIvDash'>
<TabItem label='Dashboard'>

@@ -96,23 +99,37 @@ Compaction improves query performance by combining the many small files created
</TabItem>
<TabItem label='Wrangler CLI'>

To enable the compaction on your catalog, run the [`r2 bucket catalog enable command`](/workers/wrangler/commands/#r2-bucket-catalog-compaction-enable):
To enable compaction on your catalog or on a specific table, run the [`r2 bucket catalog compaction enable` command](/workers/wrangler/commands/#r2-bucket-catalog-compaction-enable):

```bash
npx wrangler r2 bucket catalog compaction enable <BUCKET_NAME> --target-size 128 --token <API_TOKEN>
# Enable catalog-level compaction (all tables)
npx wrangler r2 bucket catalog compaction enable <BUCKET_NAME> --target-size 128 --token <API_TOKEN>

# Enable compaction for a specific table
npx wrangler r2 bucket catalog compaction enable <BUCKET_NAME> <NAMESPACE> <TABLE> --target-size 128
```

:::note[Table-level vs Catalog-level compaction]

- **Catalog-level**: Applies to all tables in the bucket; requires an API token as a service credential.
- **Table-level**: Applies to a specific table only.

:::

</TabItem>
</Tabs>

:::note[API token permission requirements]
Catalog-level compaction requires a Cloudflare API token with both R2 storage and R2 Data Catalog read/write permissions to act as a service credential. The compaction process uses this token to read files, combine them, and update table metadata.

Refer to [Authenticate your Iceberg engine](#authenticate-your-iceberg-engine) for details on creating a token with the required permissions.
:::

Once enabled, compaction applies retroactively to all existing tables and automatically to newly created tables. During open beta, we currently compact up to 2 GB worth of files once per hour for each table.
Once enabled, compaction applies retroactively to all existing tables (for catalog-level compaction) or the specified table (for table-level compaction). During open beta, we currently compact up to 2 GB worth of files once per hour for each table.
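
To get a feel for the open beta limit, the sketch below estimates how many hourly runs a backlog of uncompacted files would need. It assumes the 2 GB cap applies to each hourly run and that no new files arrive in the meantime. `hours_to_compact` is a hypothetical helper for illustration only.

```bash
# Assumption: one run per hour per table, each handling at most 2 GB.
hours_to_compact() (
  backlog_gb="$1"
  per_run_gb=2
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (backlog_gb + per_run_gb - 1) / per_run_gb ))
)

hours_to_compact 10   # prints 5
hours_to_compact 7    # prints 4
```

Treat the result as a lower bound, since ongoing ingestion adds new files while earlier ones are being compacted.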

## Disable compaction
Disabling compaction will prevent the process from running for all tables managed by the catalog. You can re-enable it at any time.

Disabling compaction prevents the process from running for all tables (catalog-level) or for a specific table (table-level). You can re-enable it at any time.

<Tabs syncKey='CLIvDash'>
<TabItem label='Dashboard'>
@@ -130,11 +147,16 @@ Disabling compaction will prevent the process from running for all tables manage
</TabItem>
<TabItem label='Wrangler CLI'>

To disable the compaction on your catalog, run the [`r2 bucket catalog disable command`](/workers/wrangler/commands/#r2-bucket-catalog-compaction-disable):
To disable compaction on your catalog or on a specific table, run the [`r2 bucket catalog compaction disable` command](/workers/wrangler/commands/#r2-bucket-catalog-compaction-disable):

```bash
# Disable catalog-level compaction (all tables)
npx wrangler r2 bucket catalog compaction disable <BUCKET_NAME>

# Disable compaction for a specific table
npx wrangler r2 bucket catalog compaction disable <BUCKET_NAME> <NAMESPACE> <TABLE>
```

</TabItem>
</Tabs>

46 changes: 37 additions & 9 deletions src/content/partials/workers/wrangler-commands/r2.mdx
@@ -112,33 +112,61 @@ wrangler r2 bucket catalog get <NAME> [OPTIONS]
depth={3}
/>

Enable compaction on a [R2 Data Catalog](/r2/data-catalog/).
Enable compaction on an [R2 Data Catalog](/r2/data-catalog/) or a specific table.

```txt
wrangler r2 bucket catalog compaction enable <BUCKET> [OPTIONS]
wrangler r2 bucket catalog compaction enable <BUCKET> [NAMESPACE] [TABLE] [OPTIONS]
```

- `BUCKET` <Type text="string" /> <MetaInfo text="required" />
- The name of the bucket to enable R2 Data Catalog compaction for.
- `--token` <Type text="string" /> <MetaInfo text="required" />
- The R2 API token with R2 Data Catalog edit permissions
- `--target-size` <Type text="number" /> <MetaInfo text="required" />
- The target file size (in MB) compaction will attempt to generate. Default: 128.
- `NAMESPACE` <Type text="string" /> <MetaInfo text="optional" />
- The namespace containing the table (for table-level compaction). Must be provided together with `TABLE`.
- `TABLE` <Type text="string" /> <MetaInfo text="optional" />
- The name of the table (for table-level compaction). Must be provided together with `NAMESPACE`.
- `--token` <Type text="string" /> <MetaInfo text="optional" />
- The R2 API token with R2 Data Catalog edit permissions. Required for catalog-level compaction only.
- `--target-size` <Type text="number" /> <MetaInfo text="optional" />
- The target file size (in MB) compaction will attempt to generate. Default: 128. Allowed values: 64, 128, 256, 512.

Examples:

```bash
# Enable catalog-level compaction (requires token)
npx wrangler r2 bucket catalog compaction enable my-bucket --token <TOKEN>

# Enable table-level compaction
npx wrangler r2 bucket catalog compaction enable my-bucket my-namespace my-table --target-size 256
```
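
The argument rules above can be checked before calling Wrangler. The following is a minimal sketch: `build_enable_cmd` is a hypothetical wrapper, not part of Wrangler, and it only prints the command it would run. It assumes that the rules are exactly as listed: `NAMESPACE` and `TABLE` go together, `--target-size` is one of 64, 128, 256, or 512, and `--token` is needed only for catalog-level compaction.

```bash
# Usage: build_enable_cmd BUCKET [NAMESPACE] [TABLE] [TARGET_SIZE] [TOKEN]
# Pass "" for NAMESPACE and TABLE to request catalog-level compaction.
build_enable_cmd() (
  bucket="${1:-}"; namespace="${2:-}"; table="${3:-}"
  size="${4:-128}"; token="${5:-}"

  [ -n "$bucket" ] || { echo "error: BUCKET is required" >&2; exit 1; }

  # "n" or "t" alone means only one of NAMESPACE and TABLE was given.
  case "${namespace:+n}${table:+t}" in
    n|t) echo "error: NAMESPACE and TABLE must be provided together" >&2; exit 1 ;;
  esac

  # Only the documented target sizes are accepted.
  case "$size" in
    64|128|256|512) ;;
    *) echo "error: --target-size must be 64, 128, 256, or 512" >&2; exit 1 ;;
  esac

  if [ -n "$namespace" ]; then
    # Table-level: no token needed.
    echo "npx wrangler r2 bucket catalog compaction enable $bucket $namespace $table --target-size $size"
  else
    # Catalog-level: a token is required.
    [ -n "$token" ] || { echo "error: catalog-level compaction requires --token" >&2; exit 1; }
    echo "npx wrangler r2 bucket catalog compaction enable $bucket --target-size $size --token $token"
  fi
)

build_enable_cmd my-bucket my-namespace my-table 256
```

Printing the command rather than executing it keeps the sketch safe to run anywhere. Avoid echoing a real token into logs or shared terminals.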

<AnchorHeading
title="`catalog compaction disable`"
slug="r2-bucket-catalog-compaction-disable"
depth={3}
/>

Disable compaction on a [R2 Data Catalog](/r2/data-catalog/).
Disable compaction on an [R2 Data Catalog](/r2/data-catalog/) or a specific table.

```txt
wrangler r2 bucket catalog compaction disable <BUCKET> [OPTIONS]
wrangler r2 bucket catalog compaction disable <BUCKET> [NAMESPACE] [TABLE] [OPTIONS]
```

- `BUCKET` <Type text="string" /> <MetaInfo text="required" />
- The name of the bucket to enable R2 Data Catalog compaction for.
- The name of the bucket to disable R2 Data Catalog compaction for.
- `NAMESPACE` <Type text="string" /> <MetaInfo text="optional" />
- The namespace containing the table (for table-level compaction). Must be provided together with `TABLE`.
- `TABLE` <Type text="string" /> <MetaInfo text="optional" />
- The name of the table (for table-level compaction). Must be provided together with `NAMESPACE`.

Examples:

```bash
# Disable catalog-level compaction
npx wrangler r2 bucket catalog compaction disable my-bucket

# Disable table-level compaction
npx wrangler r2 bucket catalog compaction disable my-bucket my-namespace my-table
```

<AnchorHeading title="`cors set`" slug="r2-bucket-cors-set" depth={3} />
