I'm using Arrow v2.7.2 with DataFrames v1.6.1 on Julia 1.10, and am running into an issue that seems to stem from Arrow.jl deserializing my Vector{Vector{T}} columns as Vector{SubArray{...}}:
julia> using Arrow, DataFrames
julia> df = DataFrame(foo=Vector{Int}[]);
julia> push!(df, [[1,2,3]])
1×1 DataFrame
 Row │ foo
     │ Array…
─────┼───────────
   1 │ [1, 2, 3]
julia> Arrow.write("/tmp/test.arrow", df)
"/tmp/test.arrow"
julia> df2 = copy(DataFrame(Arrow.Table("/tmp/test.arrow")));
julia> typeof(df2.foo)
Vector{SubArray{Int64, 1, Primitive{Int64, Vector{Int64}}, Tuple{UnitRange{Int64}}, true}} (alias for Array{SubArray{Int64, 1, Arrow.Primitive{Int64, Array{Int64, 1}}, Tuple{UnitRange{Int64}}, true}, 1})
This breaks certain push! calls on the DataFrame. I haven't been able to reproduce the failure in isolation, but the error looks as follows:
MethodError: Cannot `convert` an object of type Vector{Int64} to an object of type SubArray{Int64, 1, Arrow.Primitive{Int64, Vector{Int64}}, Tuple{UnitRange{Int64}}, true}
Stacktrace:
[1] push!(a::Vector{SubArray{Int64, 1, Arrow.Primitive{Int64, Vector{Int64}}, Tuple{UnitRange{Int64}}, true}}, item::Vector{Int64})
@ Base ./array.jl:1118
[2] _row_inserter!(df::DataFrame, loc::Int64, row::Tuple{String, Vector{Int64}, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, String, Bool, Bool, Bool, Vector{Int64}, Vector{Int64}, Vector{Int64}, String, String, Float64}, mode::Val{:push}, promote::Bool)
@ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/dataframe/insertion.jl:663
[3] push!(df::DataFrame, row::Tuple{String, Vector{Int64}, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, String, Bool, Bool, Bool, Vector{Int64}, Vector{Int64}, Vector{Int64}, String, String, Float64})
@ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/dataframe/insertion.jl:457
It's possible I'm doing something wrong; first time Arrow.jl user here.
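The underlying mechanism can be sketched without Arrow or DataFrames, on the assumption that the SubArray element type is what matters. In this Base-only sketch, a plain `view` stands in for the elements Arrow returns, and the names `backing`, `col`, and `fixed` are purely illustrative. It shows the `convert` failure from the stack trace, and that materializing each element with `collect` produces a column that accepts plain vectors again.

```julia
# Base-only sketch: a plain view stands in for the SubArray elements
# that Arrow produces (assumption: only the element type matters here).
backing = [1, 2, 3]
col = [view(backing, 1:3)]   # a Vector whose elements are SubArrays

# Pushing a plain Vector{Int} fails: Base has no `convert` method
# that turns a Vector into a SubArray.
err = try
    push!(col, [4, 5, 6])
    nothing
catch e
    e
end
@assert err isa MethodError

# Materializing each element yields a Vector{Vector{Int64}},
# which accepts new plain vectors.
fixed = collect.(col)
push!(fixed, [4, 5, 6])
@assert fixed == [[1, 2, 3], [4, 5, 6]]
```

Applied to the example above, the equivalent step would presumably be `df2.foo = collect.(df2.foo)`, though that line is untested against Arrow's own types.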