[GO] parquet: dict spaced eof exception #619

@Sunyue

Description

Describe the bug, including details regarding any error messages, version, and platform.

Hi, I am reading Parquet files from S3 using the parquet reader. I provide a reader whose ReadAt(buf []byte, off int64) is implemented with a Range header of the form bytes=<off>-<off+len(buf)-1>. When BufferedStreamEnabled is enabled, reading certain columns either fails with the error parquet: dict spaced eof exception or panics. When BufferedStreamEnabled is disabled, reading the same Parquet file works fine.
I can't see where the error is. Could you kindly help with this? Thanks!

The panic looks like this:

panic: snappy: corrupt input

goroutine 87202 [running]:
github.com/apache/arrow-go/v18/parquet/compress.snappyCodec.Decode(...)

Component(s)

Parquet

Labels

Type: bug
