@pmattione-nvidia
Contributor

@pmattione-nvidia pmattione-nvidia commented Jan 26, 2026

This PR lifts the switch/if statements on the column type out of the decode loop in the Parquet fixed-width decode kernel. Before entering the loop, we now identify the type once and assign it an enum value; inside the loop, if constexpr branches on that enum replace the runtime checks. This improves decode performance in the non-chunked cases; the chunked cases are unchanged or slower.
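As an illustration of the pattern described above, here is a minimal host-side C++17 sketch. It is not the libcudf kernel: the enum `copy_mode`, its values, and the functions `decode_values` and `decode` are hypothetical names chosen for this example, and the three modes are assumed stand-ins for whatever type cases the real kernel distinguishes. The runtime switch runs once, outside the loop, and selects a template instantiation whose loop body is resolved at compile time.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical enum: one value per way a fixed-width value can be decoded.
enum class copy_mode { FIXED_WIDTH_4, FIXED_WIDTH_8, INT32_TO_INT64 };

// The mode is a template parameter, so each instantiation's loop body
// contains only the branch for that mode; no per-element type check remains.
template <copy_mode Mode>
void decode_values(const std::uint8_t* src, std::uint8_t* dst, std::size_t count)
{
  for (std::size_t i = 0; i < count; ++i) {
    if constexpr (Mode == copy_mode::FIXED_WIDTH_4) {
      std::memcpy(dst + i * 4, src + i * 4, 4);
    } else if constexpr (Mode == copy_mode::FIXED_WIDTH_8) {
      std::memcpy(dst + i * 8, src + i * 8, 8);
    } else {
      // Widen a stored 32-bit integer to a 64-bit output value.
      std::int32_t narrow;
      std::memcpy(&narrow, src + i * 4, 4);
      const std::int64_t wide = narrow;
      std::memcpy(dst + i * 8, &wide, 8);
    }
  }
}

// The runtime dispatch happens exactly once, before any looping.
void decode(copy_mode mode, const std::uint8_t* src, std::uint8_t* dst, std::size_t count)
{
  switch (mode) {
    case copy_mode::FIXED_WIDTH_4:
      decode_values<copy_mode::FIXED_WIDTH_4>(src, dst, count);
      break;
    case copy_mode::FIXED_WIDTH_8:
      decode_values<copy_mode::FIXED_WIDTH_8>(src, dst, count);
      break;
    case copy_mode::INT32_TO_INT64:
      decode_values<copy_mode::INT32_TO_INT64>(src, dst, count);
      break;
  }
}
```

The trade-off in this kind of design is one instantiation of the loop per enum value, which can increase code size in exchange for removing the branch from the hot path.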

Benchmark Results

  • Non-chunked int: 6-9% faster
  • Non-chunked float: 2-14% faster
  • Chunked int: 10% slower
  • Chunked float: no change

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@pmattione-nvidia pmattione-nvidia self-assigned this Jan 26, 2026
@pmattione-nvidia pmattione-nvidia added the Performance (Performance related issue), improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) labels Jan 26, 2026
@copy-pr-bot

copy-pr-bot bot commented Jan 26, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf (Affects libcudf (C++/CUDA) code) label Jan 26, 2026