Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 12 additions & 3 deletions datafusion/functions/src/unicode/left.rs
Original file line number Diff line number Diff line change
Expand Up @@ -139,14 +139,23 @@ fn left_impl<'a, T: OffsetSizeTrait, V: ArrayAccessor<Item = &'a str>>(
n_array: &Int64Array,
) -> Result<ArrayRef> {
let iter = ArrayIter::new(string_array);
let mut chars_buf = Vec::new();
let result = iter
.zip(n_array.iter())
.map(|(string, n)| match (string, n) {
(Some(string), Some(n)) => match n.cmp(&0) {
Ordering::Less => {
let len = string.chars().count() as i64;
Some(if n.abs() < len {
string.chars().take((len + n) as usize).collect::<String>()
// Collect chars once and reuse for both count and take
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably use https://doc.rust-lang.org/std/primitive.str.html#method.char_indices here to avoid copying the strings into chars_buf entirely -- you could just find the correct offset and then copy the relevant bytes

If you wanted to get super fancy, you could add specializations for:

  1. Scalar n (no need to expand it out to an array)
  2. StringViewArray (no need to copy strings at all, you could just adjust the views)

However, this code seems better than what is on main! So we can merge it too and keep iterating

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, let me merge it first and probably improve it later.

chars_buf.clear();
chars_buf.extend(string.chars());
let len = chars_buf.len() as i64;

// For negative n, take (len + n) chars if n > -len (avoiding abs() which panics on i64::MIN)
Some(if n > -len {
chars_buf
.iter()
.take((len + n) as usize)
.collect::<String>()
} else {
"".to_string()
})
Expand Down