Skip to content

Improve appender speed v2#41

Open
JAicewizard wants to merge 4 commits intoduckdb:mainfrom
JAicewizard:improve_appender_speed_v2
Open

Improve appender speed v2#41
JAicewizard wants to merge 4 commits intoduckdb:mainfrom
JAicewizard:improve_appender_speed_v2

Conversation

@JAicewizard
Copy link

This currently also includes the changes of marcboeker#483, but I will rebase once that is merged.

This adds multiple ways to append, using all the variations of table sources.
Using just row-based is actually slower than the current solution, which may be expected as it operates very similarly but with extra steps. The real performance gains come from the parallel implementations.
The benchmarks don't show much improvement (~25% for both parallel row and parallel chunk), however this is due to the fact that they are both bottlenecked on appending the chunks in DuckDB itself. If instead the bottleneck would be the data ingestion/computation on the user-side (instead of a simple counter), this would be much more favorable to the parallel variations. Currently they barely show up as parallel on my laptop.

Another improvement would be setting entire vectors of data at a time, instead of individual values, this could be done in a future PR.

@JAicewizard
Copy link
Author

Let me know if this is still welcome

@taniabogatsch
Copy link
Collaborator

Hi @JAicewizard - now that the dust has settled a bit on the migration: yes, this is still welcome! Feel free to ping me once there's a review to be left here. :)

@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 84954f6 to 39958b1 Compare December 28, 2025 11:53
@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 39958b1 to 8e9fbd2 Compare December 28, 2025 12:27
@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 8e9fbd2 to 957d854 Compare December 28, 2025 12:47
@JAicewizard
Copy link
Author

@taniabogatsch I really havn't been able to work on much on this over the past couple of months, but this should work now! I also noticed some nice improvements like projected chunks :)

Copy link
Member

@mlafeldt mlafeldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this big contribution!

Here's my initial review.

In addition to the inline remarks, I also noticed that this could use more comprehensive tests, e.g.

  • FillRow() returns error mid-stream
  • FillChunk() returns error
  • AppendDataChunk() fails (StateError)
  • Multiple goroutines fail simultaneously

Some work needed before merge, but the direction is solid.

@JAicewizard
Copy link
Author

Thanks for the review! I see you also worked on getting zig extensions to build nicely, I will definitely try that out as my current setup is quite outdated and annoying to work with!

Besides that, applied all the suggestions locally, still working on getting the error tests in place. With the error workgroup, I don't think testing multiple simultaneous errors is necessary, as the error handling itself is in a package from the Go project and I expect it to function. We could add one, but it would be a maintenance burden (defining a parallel table source + test takes up quite a bit of code) and be flaky at best.

@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 294e8ee to c7a7ee4 Compare January 20, 2026 23:27
@JAicewizard
Copy link
Author

I am not entirely sure how to test "AppendDataChunk() fails (StateError)", so that is not yet tested. Besides that I think adding error tests was a good idea, as it made me realise we could handle it a bit more gracefully.

@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch 2 times, most recently from a5db2c8 to ba322ae Compare January 20, 2026 23:41
@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from ba322ae to 66b94ce Compare January 20, 2026 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants