Commit d89c14b

GH-48478: [Ruby] Fix Ruby list inference for nested non-negative integer arrays (#48584)
### Rationale for this change

When building an `Arrow::Table` from a Ruby Hash passed to `Arrow::Table.new`, nested `Integer` arrays are incorrectly inferred as `string` (utf8) if all values are non-negative. This behavior is unexpected; nested integer arrays should be consistently represented as a list type (e.g., `list<item: uint*>` or `list<item: int*>`) rather than falling back to UTF-8 strings.

### What changes are included in this PR?

This PR modifies the logic in `detect_builder_info()`, specifically the `when ::Array` block, to correctly identify nested non-negative integer arrays as list arrays. The change ensures that if `sub_builder_info` contains a valid `:builder`, it will be used even if `sub_builder_info` does not yet indicate that the type has been "detected."

### Are these changes tested?

Yes. (`ruby ruby/red-arrow/test/run-test.rb`)

### Are there any user-facing changes?

Yes.

GitHub Issue: Closes #48478
* GitHub Issue: #48478

Authored-by: hypsakata <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
1 parent 7be3c13 commit d89c14b

2 files changed: +45 -3 lines changed

ruby/red-arrow/lib/arrow/array-builder.rb

Lines changed: 5 additions & 3 deletions
@@ -155,12 +155,14 @@ def detect_builder_info(value, builder_info)
             sub_builder_info = detect_builder_info(sub_value, sub_builder_info)
             break if sub_builder_info and sub_builder_info[:detected]
           end
-          if sub_builder_info and sub_builder_info[:detected]
-            sub_value_data_type = sub_builder_info[:builder].value_data_type
+          if sub_builder_info
+            sub_builder = sub_builder_info[:builder]
+            return builder_info unless sub_builder
+            sub_value_data_type = sub_builder.value_data_type
             field = Field.new("item", sub_value_data_type)
             {
               builder: ListArrayBuilder.new(ListDataType.new(field)),
-              detected: true,
+              detected: sub_builder_info[:detected],
             }
           else
             builder_info
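
To make the effect of this hunk easier to follow, here is a minimal pure-Ruby sketch of the inference loop before and after the change. It is a simplified model, not red-arrow's actual code: the names `detect_info` and `infer`, the `fixed:` switch, and the use of symbols in place of Arrow builder objects are all hypothetical. It also assumes, based on the commit description, that a non-negative integer yields a builder candidate without setting `:detected`, that a negative integer sets `:detected`, and that an undetected result falls back to a string type.

```ruby
# Simplified model of the `when ::Array` branch (hypothetical names throughout).
# Symbols stand in for Arrow builder objects.
def detect_info(value, info, fixed:)
  case value
  when nil
    info
  when String
    {builder: :string, detected: true}
  when Integer
    # Assumption: only a negative value settles the element type;
    # a non-negative value offers a builder but leaves :detected unset.
    value.negative? ? {builder: :int, detected: true} : {builder: :uint}
  when Array
    sub_info = nil
    value.each do |sub_value|
      sub_info = detect_info(sub_value, sub_info, fixed: fixed)
      break if sub_info && sub_info[:detected]
    end
    if fixed
      # After the change: use the sub-builder whenever one exists and
      # propagate its :detected flag instead of forcing it to true.
      return info unless sub_info && sub_info[:builder]
      {builder: [:list, sub_info[:builder]], detected: sub_info[:detected]}
    elsif sub_info && sub_info[:detected]
      # Before the change: only fully detected element types became lists.
      {builder: [:list, sub_info[:builder]], detected: true}
    else
      info
    end
  end
end

def infer(values, fixed:)
  info = nil
  values.each do |value|
    info = detect_info(value, info, fixed: fixed)
    break if info && info[:detected]
  end
  # Assumed fallback: with no builder found, the values are treated as strings.
  info ? info[:builder] : :string
end

p infer([[0, 1, 2], [3, 4]], fixed: false)  # prints :string
p infer([[0, 1, 2], [3, 4]], fixed: true)   # prints [:list, :uint]
p infer([[0, -1, 2], [3, 4]], fixed: true)  # prints [:list, :int]
```

In this model the old condition discards the `:uint` candidate because `:detected` was never set, which reproduces the string fallback described in the rationale. The new branch keeps the candidate and passes the flag through unchanged.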

ruby/red-arrow/test/test-array-builder.rb

Lines changed: 40 additions & 0 deletions
@@ -146,6 +146,46 @@ def assert_build(builder_class, raw_array)
                        ["Apache Arrow"],
                      ])
     end
+
+    test("list<uint>s") do
+      values = [
+        [0, 1, 2],
+        [3, 4],
+      ]
+      array = Arrow::Array.new(values)
+      data_type = Arrow::ListDataType.new(Arrow::UInt8DataType.new)
+      assert_equal({
+                     data_type: data_type,
+                     values: [
+                       [0, 1, 2],
+                       [3, 4],
+                     ],
+                   },
+                   {
+                     data_type: array.value_data_type,
+                     values: array.to_a,
+                   })
+    end
+
+    test("list<int>s") do
+      values = [
+        [0, -1, 2],
+        [3, 4],
+      ]
+      array = Arrow::Array.new(values)
+      data_type = Arrow::ListDataType.new(Arrow::Int8DataType.new)
+      assert_equal({
+                     data_type: data_type,
+                     values: [
+                       [0, -1, 2],
+                       [3, 4],
+                     ],
+                   },
+                   {
+                     data_type: array.value_data_type,
+                     values: array.to_a,
+                   })
+    end
   end
 
   sub_test_case("specific builder") do
