Need Advice on How to Handle Async Reader in PyO3 Python bindings #4717
Closed
abstractqqq started this conversation in General
Replies: 1 comment 5 replies
-
I think you probably want to have something like |
-
Hello, I am trying to build a reader that automatically decompresses data, chunks it in a certain way, and outputs the bytes to Python. This works perfectly for local files, and I can easily get a 5x speedup. But I am running into a lot of technical difficulty with cloud files, which I am reading through object_store/aws-sdk-s3. The problem is that I inevitably need something like the following:
because I need to keep the state of the reader in order for read_one_chunk() to work. The whole point of my project is to stream the data, because it can be extremely large. All Rust cloud data store crates provide async APIs, and they often return an "impl AsyncBufRead" instead of a concrete type for data streams. I have done enough reading and googling to know that PyO3 won't be able to make the struct above into a pyclass. I have also thought about creating the reader on the Python side and passing it to Rust, but it looks like passing a file handle may slow down the process a lot because the bytes need to be copied...
Any advice on how I should proceed? Thank you in advance.
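For readers with the same problem, one possible direction is type erasure: a `#[pyclass]` struct cannot have generic parameters and (unless marked `unsendable`) must be `Send`, so the reader can be stored as a boxed trait object instead of an `impl Trait` type. The sketch below uses only the standard library and the synchronous `Read` trait so that it compiles on its own; `ChunkReader` and its fields are hypothetical names, not the struct from this post. For the async case the same shape would presumably hold a `Pin<Box<dyn AsyncBufRead + Send>>` and drive it from a runtime's `block_on`, but that part is an untested assumption.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical wrapper: the concrete reader type is erased behind a trait
/// object, so the struct itself has no type parameters.
pub struct ChunkReader {
    inner: Box<dyn Read + Send>,
    chunk_size: usize,
}

impl ChunkReader {
    /// Accepts any owned reader and boxes it, hiding its concrete type.
    pub fn new(reader: impl Read + Send + 'static, chunk_size: usize) -> Self {
        Self { inner: Box::new(reader), chunk_size }
    }

    /// Returns the next chunk of up to `chunk_size` bytes, or `None` at end of stream.
    /// The reader state lives in `self`, so repeated calls continue where the last one stopped.
    pub fn read_one_chunk(&mut self) -> io::Result<Option<Vec<u8>>> {
        let mut buf = vec![0u8; self.chunk_size];
        let mut filled = 0;
        // A single read() may return fewer bytes than requested, so loop
        // until the chunk is full or the stream ends.
        while filled < buf.len() {
            let n = self.inner.read(&mut buf[filled..])?;
            if n == 0 {
                break;
            }
            filled += n;
        }
        if filled == 0 {
            return Ok(None);
        }
        buf.truncate(filled);
        Ok(Some(buf))
    }
}

fn main() -> io::Result<()> {
    // An in-memory cursor stands in for a decompressing or cloud-backed reader.
    let mut reader = ChunkReader::new(Cursor::new(b"hello world".to_vec()), 4);
    while let Some(chunk) = reader.read_one_chunk()? {
        println!("{}", String::from_utf8_lossy(&chunk));
    }
    Ok(())
}
```

Boxing costs one dynamic dispatch per read call, which is usually small next to the I/O and decompression work done per chunk.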