Open
Labels
- enhancement: The issue is a request for improvement or a new feature
- status-triage_done: Initial triage done, will be further handled by the driver team
Description
What is the current behavior?
The current implementation of ArrowBatch is not serializable because it holds members such as snowflakeChunkDownloader. This prevents reads from being distributed across multiple workers.
What is the desired behavior?
An implementation of ArrowBatch that can be serialized and deserialized. Ideally this would be done by implementing the json.Marshaler and json.Unmarshaler interfaces.
How would this improve gosnowflake?
End users will have more flexibility when consuming result sets, and distributing the reads can yield significant performance benefits. Concretely, the lack of this feature has also been blocking implementation of the ExecutePartitions method in the Snowflake ADBC driver.
References, Other Background
- Relevant comment on original distributed reads PR proposing this same refactor: Arrow Record Distributed Result Batches #544 (review)
- Docs describing serializable batch distribution for python implementation: https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-distributed-fetch#serializing-result-batches
- Perhaps this could be added to the set of features for the V2 release: SNOW-1881542 Upcoming Gosnowflake v2 changes #1586