@brancz
Contributor

@brancz brancz commented Jan 14, 2026

Which issue does this PR close?

Closes #9174

What changes are included in this PR?

Implementation and tests. It's mostly copied from List.

Are these changes tested?

Yes, see unit tests.

Are there any user-facing changes?

No, purely additive.

@alamb @Jefffrey

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jan 14, 2026
Contributor

@friendlymatthew friendlymatthew left a comment

Hi, do you mind rebasing? I think CI is failing because this needs to be rebased on the latest master: the base branch is missing the encoded_len fn that was added recently.

Otherwise, the implementation makes sense to me

@brancz brancz force-pushed the row-list-view branch 2 times, most recently from cd2a465 to 29f13ef Compare January 15, 2026 09:13
@brancz
Contributor Author

brancz commented Jan 15, 2026

Very confused because both this PR and #9175 are based off of latest main, and the tip of the other PR seems to work fine.

@brancz
Contributor Author

brancz commented Jan 16, 2026

Figured it out: I did actually need to change something after the rebase here. I also refactored the use in both list types.

list_size += 1;
}
}
O::from_usize(child_count).expect("overflow");
Contributor

Is this to force a panic?

Contributor Author

Yes, same as the other assertion. This is consistent with what we do for regular lists as well.

Contributor

Personally I feel it makes more sense to return an error here since the function already supports that
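
The two options in this thread can be sketched with the standard library alone. This is not the PR's code: `i32` stands in for the offset type `O`, and `OffsetOverflow`, `offset_or_panic` and `checked_offset` are hypothetical names used only for illustration.

```rust
use std::fmt;

/// Hypothetical error type standing in for the crate's own error enum.
#[derive(Debug, PartialEq)]
struct OffsetOverflow(usize);

impl fmt::Display for OffsetOverflow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "child count {} does not fit in the offset type", self.0)
    }
}

impl std::error::Error for OffsetOverflow {}

/// Panicking variant, analogous to `O::from_usize(child_count).expect("overflow")`.
fn offset_or_panic(child_count: usize) -> i32 {
    i32::try_from(child_count).expect("overflow")
}

/// Fallible variant: reports the overflow to the caller instead of panicking.
fn checked_offset(child_count: usize) -> Result<i32, OffsetOverflow> {
    i32::try_from(child_count).map_err(|_| OffsetOverflow(child_count))
}
```

A caller inside a function that already returns a `Result` could write `checked_offset(child_count)?`, which is the shape the reviewer is asking for.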

.collect();

let child = unsafe { converter.convert_raw(&mut child_rows, validate_utf8) }?;
assert_eq!(child.len(), 1);
Contributor

Should we return an error here since the function returns a result anyway?

Contributor Author

We perform the exact same assertion for regular lists, so I did this for consistency. It also seems like a good idea, since something went pretty spectacularly wrong if this isn't true.


/// Computes the minimum offset and maximum end (offset + size) for a ListView array.
/// Returns (min_offset, max_end) which can be used to slice the values array.
fn compute_list_view_bounds<O: OffsetSizeTrait>(array: &GenericListViewArray<O>) -> (usize, usize) {
Contributor

This function seems oddly placed; should be lower down instead of in the middle of the mod declarations?

Contributor Author

agreed, will move

Contributor Author

Moved it further down in the file, but I'm not sure I love that either. Maybe move it to the list file? What do you think?

Contributor

We can always put it down near row_lengths where it won't intrude in the middle of Codec here

_ => unreachable!(),
};

let null_buffer = NullBuffer::new(BooleanBuffer::new(nulls.into(), 0, rows.len()));
Member

Maybe use new_unchecked as you already have the null count and we want to avoid calculating it twice

Suggested change
- let null_buffer = NullBuffer::new(BooleanBuffer::new(nulls.into(), 0, rows.len()));
+ let null_buffer = NullBuffer::new_unchecked(BooleanBuffer::new(nulls.into(), 0, rows.len()), null_count);
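
The idea behind this suggestion is that the null count is already known once the bitmap has been built, so there is no need to scan the bits a second time. Below is a std-only sketch of counting nulls in the same pass that packs the bitmap. `pack_validity` is a hypothetical helper and does not use arrow's `NullBuffer`. As far as I can tell, `NullBuffer::new_unchecked` is an `unsafe` fn in arrow-rs, so the real call site would likely need an `unsafe` block and a guarantee that the count matches the bitmap.

```rust
/// Packs validity flags into an LSB-first bitmap (bit set = valid) and
/// counts the nulls in the same pass. Hypothetical helper for illustration.
fn pack_validity(valid: &[bool]) -> (Vec<u8>, usize) {
    let mut bits = vec![0u8; (valid.len() + 7) / 8];
    let mut null_count = 0;
    for (i, &is_valid) in valid.iter().enumerate() {
        if is_valid {
            // Set bit `i` of the bitmap.
            bits[i / 8] |= 1 << (i % 8);
        } else {
            null_count += 1;
        }
    }
    (bits, null_count)
}
```

The returned count can then be handed to whatever constructor accepts a precomputed null count, instead of being derived again from the bitmap.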


if size > 0 {
min_offset = min_offset.min(offset);
max_end = max_end.max(end);
Member

Maybe you can break out of the loop early once you have reached the widest possible bounds (a minimum offset of 0 and the largest possible end).
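
A rough sketch of the bounds scan with that early exit, written against plain slices rather than `GenericListViewArray`. `list_view_bounds` is a hypothetical name. Two things here are assumptions: that `values_len` (the child array length) is the largest possible end, and that `(0, 0)` is returned when every list is empty.

```rust
/// Returns (min_offset, max_end) over all non-empty lists.
/// Assumption: returns (0, 0) when every list is empty.
fn list_view_bounds(offsets: &[usize], sizes: &[usize], values_len: usize) -> (usize, usize) {
    let mut min_offset = usize::MAX;
    let mut max_end = 0;
    for (&offset, &size) in offsets.iter().zip(sizes) {
        if size > 0 {
            min_offset = min_offset.min(offset);
            max_end = max_end.max(offset + size);
            // The range already spans the whole child array, so no later
            // list can widen it; stop scanning.
            if min_offset == 0 && max_end == values_len {
                break;
            }
        }
    }
    if min_offset == usize::MAX {
        (0, 0)
    } else {
        (min_offset, max_end)
    }
}
```

The early exit only helps when some prefix of the lists already covers the full child array; otherwise the loop still visits every list.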

Comment on lines +3349 to +3357
#[test]
fn test_list_view() {
test_single_list_view::<i32>();
}

#[test]
fn test_large_list_view() {
test_single_list_view::<i64>();
}
Member

Please add nested tests like the regular list

test_nested_list::<i64>();
}

fn test_single_list_view<O: OffsetSizeTrait>() {
Member

Please add more tests that take advantage of the fact that this is a view, namely:

  • both lists point to the same values.
  • unordered offsets (one list starts at offset x and a later list starts at offset y, where y is before x).
  • list 1's items cover list 2's items and a little more (e.g. list 1 has offset 10 and size 5, and list 2 has offset 12 and size 2).
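
The three layouts above can be sketched as plain (offsets, sizes) pairs over a shared values buffer. This is a std-only illustration, not the PR's test code. `materialize` and `view_cases` are hypothetical names, and the concrete offsets in the first two cases are made up for the example.

```rust
/// Materializes each list of a list-view layout: list `i` is
/// `values[offsets[i]..offsets[i] + sizes[i]]`.
fn materialize(values: &[i32], offsets: &[usize], sizes: &[usize]) -> Vec<Vec<i32>> {
    offsets
        .iter()
        .zip(sizes)
        .map(|(&o, &s)| values[o..o + s].to_vec())
        .collect()
}

/// The three requested layouts as (offsets, sizes) pairs.
fn view_cases() -> Vec<(Vec<usize>, Vec<usize>)> {
    vec![
        // Both lists point to the same values.
        (vec![3, 3], vec![2, 2]),
        // Unordered offsets: the second list starts before the first.
        (vec![12, 4], vec![3, 2]),
        // List 1 covers list 2 and a little more.
        (vec![10, 12], vec![5, 2]),
    ]
}
```

Each pair could be turned into a list view array and round-tripped through the row converter, with `materialize` giving the expected logical contents to compare against.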

ListArray::new(field, offsets, values, Some(nulls))
}

fn generate_column(len: usize) -> ArrayRef {
Member

Please add list view and large list view here as well, similar to how list and large list are handled.
Don't forget to increase the random range so that it covers the new cases.

Development

Successfully merging this pull request may close these issues.

Support formatting ListView

4 participants