"KeyError: key 7 not found" when loading parquet files generated with parquet and arrow crates #189

@RomanHargrave

Stack trace below. I have not yet done a deep dive on this, but I do know that pyarrow.parquet reads the same file and returns its contents without errors.

Problematic file: test.pqt.gz

julia> read_parquet("../test.pqt")
ERROR: KeyError: key 7 not found
Stacktrace:
  [1] getindex(h::Dict{Int64, Thrift.ThriftMetaAttribs}, key::Int64)
    @ Base ./dict.jl:477
  [2] read_container(p::Thrift.TCompactProtocol, val::Parquet.PAR2.Statistics)
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:181
  [3] read_container(p::Thrift.TCompactProtocol, ::Type{Parquet.PAR2.Statistics})
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:168
  [4] read_container(p::Thrift.TCompactProtocol, val::Parquet.PAR2.ColumnMetaData)
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:190
  [5] read_container(p::Thrift.TCompactProtocol, ::Type{Parquet.PAR2.ColumnMetaData})
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:168
  [6] read_container(p::Thrift.TCompactProtocol, val::Parquet.PAR2.ColumnChunk)
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:190
  [7] read
    @ ~/.julia/packages/Thrift/TPU8Q/src/base.jl:169 [inlined]
  [8] read
    @ ~/.julia/packages/Thrift/TPU8Q/src/base.jl:167 [inlined]
  [9] read_container(p::Thrift.TCompactProtocol, val::Vector{Parquet.PAR2.ColumnChunk})
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:373
 [10] read_container(p::Thrift.TCompactProtocol, ::Type{Vector{Parquet.PAR2.ColumnChunk}})
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:168
 [11] read_container(p::Thrift.TCompactProtocol, val::Parquet.PAR2.RowGroup)
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:190
 [12] read
    @ ~/.julia/packages/Thrift/TPU8Q/src/base.jl:169 [inlined]
 [13] read
    @ ~/.julia/packages/Thrift/TPU8Q/src/base.jl:167 [inlined]
 [14] read_container(p::Thrift.TCompactProtocol, val::Vector{Parquet.PAR2.RowGroup})
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:373
 [15] read_container(p::Thrift.TCompactProtocol, ::Type{Vector{Parquet.PAR2.RowGroup}})
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:168
 [16] read_container(p::Thrift.TCompactProtocol, val::Parquet.PAR2.FileMetaData)
    @ Thrift ~/.julia/packages/Thrift/TPU8Q/src/base.jl:190
 [17] read
    @ ~/.julia/packages/Thrift/TPU8Q/src/base.jl:169 [inlined]
 [18] read
    @ ~/.julia/packages/Thrift/TPU8Q/src/base.jl:167 [inlined]
 [19] read_thrift
    @ ~/.julia/packages/Parquet/ASpqL/src/reader.jl:402 [inlined]
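
For anyone triaging: frame [1] is a plain `getindex` on a `Dict{Int64, Thrift.ThriftMetaAttribs}`, so the error looks like a field-id lookup hitting an id (7) that the table for `Parquet.PAR2.Statistics` does not contain. Below is a minimal sketch of that failure mode, not the actual Thrift.jl code. The names `known_fields`, `strict_lookup`, and `tolerant_lookup` are hypothetical, and the table contents are illustrative only.

```julia
# Hypothetical stand-in for a per-struct field table. In the stack trace the
# real table is a Dict{Int64, Thrift.ThriftMetaAttribs}; Symbols are used here
# only to keep the sketch self-contained.
known_fields = Dict{Int64, Symbol}(
    1 => :max,
    2 => :min,
    3 => :null_count,
)

# Strict lookup: indexing with an id that is missing from the table throws
# a KeyError, the same kind of failure as frame [1] (getindex on a Dict).
function strict_lookup(fields::Dict{Int64, Symbol}, id::Integer)
    return fields[id]
end

# Tolerant lookup: `get` with a default returns `nothing` for an unknown id,
# so a caller could choose to skip that field instead of aborting.
function tolerant_lookup(fields::Dict{Int64, Symbol}, id::Integer)
    return get(fields, id, nothing)
end

# Capture the exception instead of letting it propagate.
err = try
    strict_lookup(known_fields, 7)
catch e
    e
end

println(err isa KeyError)
println(tolerant_lookup(known_fields, 7) === nothing)
```

Whether skipping unknown field ids is the right fix for the reader is an assumption on my part. The sketch only shows why a strict table lookup aborts on an id it has never seen.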
