Skip to content

Conversation

@sarutak
Copy link
Member

@sarutak sarutak commented Dec 6, 2025

What changes were proposed in this pull request?

This PR aims to check offset and length on copying in UTF8String#reverse.
For details, see https://lists.apache.org/thread/d9pvkh3jbsq8lc33v75kmwq5wg57422h (Only PMC members can read with login).
To avoid performance regression, this PR choose to check offset and length rather than validate the input UTF-8 string.

Why are the changes needed?

For safety.

Does this PR introduce any user-facing change?

Yes, but doesn't break compatibility.

How was this patch tested?

Example queries mentioned in this thread works even though the results are broken.
All the operation defined in UTF8String are expected to work correctly with valid UTF-8 strings so the broken results with invalid UTF-8 strings should be reasonable.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Dec 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant