parquet file parallel read settings #13635
Unanswered
jiangjiguang asked this question in Q&A
Replies: 1 comment
-
We can't create multiple splits from a single row group in the current code. In theory, it might be possible to divide a row group into smaller chunks by looking at column-level page offsets, but that would significantly complicate the reader code.
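To make the limit concrete for the file described in the question, here is a rough sketch of the arithmetic. It assumes each row group fills up to the block size and that at most one split is created per row group; the function names are hypothetical and are not part of the reader code.

```python
import math


def row_group_count(file_size_mb: float, row_group_size_mb: float) -> int:
    """Approximate row group count, assuming each group fills up to the block size."""
    return math.ceil(file_size_mb / row_group_size_mb)


def effective_readers(requested_drivers: int, row_groups: int) -> int:
    """With at most one split per row group, extra drivers have nothing to read."""
    return min(requested_drivers, row_groups)


groups = row_group_count(251.86, 128)
print(groups)                        # 2
print(effective_readers(8, groups))  # 2
```

Under the same assumptions, a file of this size written with a smaller row group size (for example 32 MB) would hold about 8 row groups, so more drivers could be kept busy without any change to the reader.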
-
I have a Parquet file whose size is 251.86 MB, and the block size is 128 MB, so the file has two row groups.
The problem is: no matter how many drivers I set, only two drivers read the file, and each driver reads one row group.
I have read the code and found the problem below:

How can I set up multiple drivers to read one row group?