Skip to content

Conversation

drewdaemon
Copy link
Contributor

@drewdaemon drewdaemon commented Sep 20, 2024

Part of #98545

Adds a REVERSE function to the ES|QL language.

REVERSE reverses the characters in a string. If you want to know more, read the docs!

Screenshot 2024-09-27 at 10 23 47 AM

@elasticsearchmachine elasticsearchmachine added external-contributor Pull request authored by a developer outside the Elasticsearch team v9.0.0 labels Sep 20, 2024
@drewdaemon drewdaemon mentioned this pull request Sep 20, 2024
@drewdaemon drewdaemon added >enhancement :Analytics/ES|QL AKA ESQL ES|QL-ui Impacts ES|QL UI and removed external-contributor Pull request authored by a developer outside the Elasticsearch team labels Sep 23, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @drewdaemon, I've created a changelog YAML for you.

Comment on lines 1224 to 1226
message:keyword | message_reversed:keyword
😂😁😅😐🥺😢😭 | 😭😢🥺😐😅😁😂
// end::reverseEmoji-result[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - small comment, I would use different emoticons to better signal visually the reversal: 🏡🚌✈ becomes ✈ 🚌🏡

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally add also a string that combines and intertwines regular chars + emoticons + extended ones (こんにちは) and check they are properly reversed (namely that the encoding is preserved).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - small comment, I would use different emoticons to better signal visually the reversal

I think of the current ones like a sort of easter egg in the docs... you have to squint a bit and then you're rewarded with "getting it."

That said, I defer to you — LMK

cc @leemthompo in case he has input

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Costin's right that it would be easier with a bit more visual variance in the emojis but we're knee deep in nit territory here 😆

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually Liam this is the most important conversation going on in this PR right now 😆

Okay, I'll take another stab at emoji art 🖼️

Copy link
Contributor Author

@drewdaemon drewdaemon Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My battle for these emojis is met with opposition at every turn... first it was our handling of unicode graphemes and now it's a custom font on the docs website... will I prevail?....

Screen.Recording.2024-09-26.at.4.01.39.PM.mov

@drewdaemon drewdaemon marked this pull request as ready for review September 24, 2024 01:38
@drewdaemon
Copy link
Contributor Author

@nik9000 @ivancea I think I have addressed all your feedback. Would you take another look?

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

def(Locate.class, Locate::new, "locate"),
def(Repeat.class, Repeat::new, "repeat"),
def(Space.class, Space::new, "space") },
def(Trim.class, Trim::new, "trim") },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sorting!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All JetBrains :)

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor/optional drive-by comments only.

return reversed.toString();
}

private static boolean isOneByteUTF8(BytesRef ref) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also be isASCII().

Copy link
Contributor Author

@drewdaemon drewdaemon Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood ASCII vs unicode to be distinct from the UTF encoding so I tried to be specific.

Let me know if I misunderstood!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used words like "low ascii" or "one bytes utf-8" for the same thing. Either is fine by me. But in grand java tradition - what you've written is quite descriptive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe isUTF8ASCII or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find isOneByteUTF8 clearer than other alternatives.
UTF8 encoding code points corresponding to ASCII on one byte is known, but maybe not "well enough" known. Let's leave it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think isOneByteUTF8 the best, yeah.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool cool

Comment on lines 130 to 138
private static void reverseArray(byte[] array, int start, int end) {
while (start < end) {
byte temp = array[start];
array[start] = array[end];
array[end] = temp;
start++;
end--;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to add this back to org.elasticsearch.common.util.ArrayUtils.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6c09607

Comment on lines 142 to 145
if (isOneByteUTF8(val)) {
// this is the fast path. we know we can just reverse the bytes.
BytesRef reversed = BytesRef.deepCopyOf(val);
reverseArray(reversed.bytes, reversed.offset, reversed.offset + reversed.length - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scans an ASCII text for three times (even though not necessarily byte-by-byte).
I guess the first pass could attempt a reverse already, though that would imply allocation for non-ASCII text.
I guess it's fine as is, probably not making a practical difference either way, can be optimised if ever needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. At least we're in O(n).

can be optimised if ever needed

Agreed

@drewdaemon
Copy link
Contributor Author

drewdaemon commented Sep 30, 2024

Hmmm, these failing tests appear to be a Catch-22 centered on this CSV test.

When I run ./gradlew :x-pack:plugin:esql:test --tests "org.elasticsearch.xpack.esql.CsvTests" it expects both the first and second REVERSE calls the in the query to produce multi-field warnings. But when I run ./gradlew :x-pack:plugin:esql:qa:server:single-node:javaRestTest it expects only the second call in the query the produce the warnings.

I'm not really sure what to do about this. Could someone take a look?

@nik9000
Copy link
Member

nik9000 commented Sep 30, 2024

Catch-22 centered

That's the best catch.

I'll look.

@drewdaemon
Copy link
Contributor Author

Okay, now it looks like the warning returned in the multi-node scenario is not identifying the line and column number correctly:

Line -1:-1: evaluation of [] failed, treating result as null. Only first 20 failures recorded.

In the CSV test, the warning is Line 1:53: evaluation of [REVERSE(job_positions)] failed, treating result as null. Only first 20 failures recorded. which is correct.

@nik9000
Copy link
Member

nik9000 commented Sep 30, 2024

Ah! That's you actually! I'll link.


@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeNamedWriteable(field());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add source.writeTo(out) and read it with this(Source.readFrom((PlanStreamInput) in), in.readNamedWriteable(Expression.class). Then you'll need to flip the test - it is setup with that alwaysempty thing.

After that the source should appear.

This is a left over from when serializing Source was really expensive. Back then we didn't serialize it. We should serialize it because it's really cheap now - just a few bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized UnaryScalarFunction already takes care of this, so I simplified the constructor in ad7c68f

@drewdaemon drewdaemon merged commit 147461f into elastic:main Oct 4, 2024
16 checks passed
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

drewdaemon added a commit to drewdaemon/elasticsearch that referenced this pull request Oct 4, 2024
Adds a REVERSE string function
elasticsearchmachine pushed a commit that referenced this pull request Oct 4, 2024
matthewabbott pushed a commit to matthewabbott/elasticsearch that referenced this pull request Oct 10, 2024
Adds a REVERSE string function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement ES|QL-ui Impacts ES|QL UI Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.16.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants