Open
Description
I'm writing a service that downloads n files concurrently, then writes them to a single Avro file.
- Is there a recommended way to serialize data in parallel? I see `Writer` calls `maybe_write_header` in all public append APIs, along with `into_inner`, which makes it hard to just get the raw bytes of serialized rows without the header attached. It doesn't look like the raw `Serializer` impl in `ser.rs` is public either. What is the recommended way to split serialization work across cores?
- Schema validation per appended value is expensive. It would be really nice to have compile flags around it so it can be stripped out for production, or to have a sampling rate attached to it to retain some runtime safety.
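For illustration, here is a minimal sketch of the pattern the first question is after: each worker encodes its own chunk of rows into a header-free byte buffer, and a single thread would later frame those buffers into the output file. It uses only the Rust standard library and does not call this crate's API. `encode_long` and `serialize_chunks` are hypothetical names, and rows are assumed to be plain Avro `long` values (zigzag varint encoded) purely to keep the example small.

```rust
use std::thread;

/// Hypothetical stand-in for per-row encoding (not part of the crate).
/// Avro's binary encoding writes a `long` as a zigzag varint.
fn encode_long(n: i64, out: &mut Vec<u8>) {
    // Zigzag maps small negative and positive values to small unsigned values.
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    // Emit 7 bits at a time, low bits first; the high bit marks "more bytes follow".
    while z >= 0x80 {
        out.push((z as u8 & 0x7f) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Hypothetical helper: splits `rows` into roughly equal chunks, encodes each
/// chunk on its own thread, and returns one `(row_count, bytes)` pair per
/// chunk in input order. No file header is written anywhere in this step.
fn serialize_chunks(rows: &[i64], workers: usize) -> Vec<(usize, Vec<u8>)> {
    let workers = workers.max(1);
    // `chunks` panics on a length of 0, so clamp to at least 1.
    let chunk_len = ((rows.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = rows
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    let mut buf = Vec::new();
                    for &row in chunk {
                        encode_long(row, &mut buf);
                    }
                    (chunk.len(), buf)
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

Each returned pair carries the two things a single writer thread needs to frame a data block: the row count and the encoded bytes. Whether the crate can accept pre-encoded bytes like this is exactly the open question above.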