@bleakley
Contributor

No description provided.

@bleakley bleakley requested review from platypii and severo June 20, 2025 23:09
Contributor

@platypii platypii left a comment

Technically this could be made faster if we used parquetReadColumn, which would return the column data as typed arrays instead of transposing into rows and back. It would need to be added to the worker as a new message type, though. This is fine for now, but it's something to think about for the future.
https://github.com/hyparam/hyparquet/blob/v1.16.2/src/read.js#L105
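To illustrate the point about transposing, here is a minimal, self-contained sketch of the two access patterns. It does not call hyparquet; `columnFromRows` and `columnSlice` are hypothetical helper names, and the sample data is made up for illustration.

```javascript
// Hypothetical helpers for illustration only; not part of hyparquet or hightable.

// Row path: data was transposed into row objects, so getting a column back
// means walking every row and allocating a new array.
function columnFromRows(rows, name) {
  return rows.map(row => row[name])
}

// Column path: the values already sit in one typed array, so a row range
// is just a view over the same buffer, with no per-row work and no copy.
function columnSlice(column, start, end) {
  return column.subarray(start, end)
}

// Made-up sample data in both layouts.
const rows = [
  { id: 1, score: 0.5 },
  { id: 2, score: 1.5 },
  { id: 3, score: 2.5 },
]
const scores = new Float64Array([0.5, 1.5, 2.5])

const fromRows = columnFromRows(rows, 'score')
const fromColumn = columnSlice(scores, 1, 3)
```

The row path touches every cell of every row it visits, while the column path stays proportional to the one column requested, which is where the potential speedup would come from.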

Contributor

@severo severo left a comment

I agree with @platypii that this could be more efficient, both by not reading all the columns and by using typed arrays. But since we're reworking the DataFrame (hyparam/hightable#208), we will have to refactor again soon.

@bleakley bleakley merged commit 6b16362 into master Jun 23, 2025
8 checks passed
@bleakley bleakley deleted the getcolumn-parquetdataframe branch June 23, 2025 15:51
4 participants