ducklake_merge_adjacent_files very slow #687

@alastairrushworth

Description

What happens?

Apologies, I can't provide a reprex for this. I've got a large (500 GB+) data lake, with Postgres as the metadata catalog, and about 2k partitions. I've tried to manage writes in large chunks to avoid accumulating lots of small parquet files, but it is sometimes hard to avoid.

When I run ducklake_merge_adjacent_files it can take a very long time to finish - at least 3 hours, sometimes much longer. Are there ways to manage this process, or to point the compaction at specific partitions, for example? I tried max_compacted_files => 100 but it didn't appear to speed things up, and I'm a little unclear from the docs what this option does, so I don't know whether it will help in my case.

While this compaction runs, I noticed that both RAM and CPU usage are very low - should I expect the process to run in parallel, respecting the SET threads TO X parameter? Any tips or pointers would be great; I suspect I might have misunderstood how some of this works!
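For reference, the calls I'm running look roughly like this (the attached lake name is a placeholder; my actual catalog is Postgres-backed):

```sql
-- Set the thread count I expected the compaction to respect
SET threads TO 8;

-- Merge small adjacent parquet files in the attached DuckLake catalog;
-- max_compacted_files => 100 is the option I tried, without any
-- apparent speedup
CALL ducklake_merge_adjacent_files('my_lake', max_compacted_files => 100);
```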

To Reproduce

na

OS:

windows

DuckDB Version:

1.4.3

DuckLake Version:

de813ff

DuckDB Client:

python

Hardware:

No response

Full Name:

alastair rushworth

Affiliation:

creative data technologies, ltd

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have not tested with any build

Did you include all relevant data sets for reproducing the issue?

No - I cannot share the data sets because they are confidential

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
