-
Notifications
You must be signed in to change notification settings - Fork 111
Support isDistinctFrom and isNotDistinctFrom operators #3357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
} | ||
} else if (comparand == null) { | ||
return 1; | ||
return toComparable(fieldValue).compareTo(toComparable(comparand)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original line 237~244 is wrong? If any of the left or right is null, the comparison result should be unknown? Also in the original code, we call: if(value == null) return null
before we call this compare
method, so the block of if(fieldValue == null)
was never reached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method currently implements full collation for use in contexts where NULL
is to be considered just another value. For example, that is how index entries sort with NULLS FIRST
. It is still reasonable to use such a function in contexts that implement tri-valued logic by guarding calls, as many callers do today.
It is equally reasonable to have this method only implement comparison between non-null values and to expect all callers to deal with nulls, either sorting them first or producing unknown, as is appropriate to their context. And both contexts are appropriate in the larger system, which must deal sometimes with the "query" view that has unknown and other times with the "storage" view that does not. It might be nice to have a clearer separation so that it is unambiguous which view applies, perhaps by using entirely different currency and not just bare Object
.
What this change does is something in between, which doesn't seem right. Specifically, if this does not handle null
at all, then it ought not to have @Nullable
. The signature should also be changed. I am surprised that the IDE / style checker(s) doesn't flag this for a possible NPE. But otherwise, it's okay provided all callers now check both sides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I changed the method to compare between non-null values.
@@ -707,28 +713,44 @@ public static Type invertComparisonType(@Nonnull final Comparisons.Type type) { | |||
@Nullable | |||
@SpotBugsSuppressWarnings("NP_BOOLEAN_RETURN_NULL") | |||
public static Boolean evalComparison(@Nonnull Type type, @Nullable Object value, @Nullable Object comparand) { | |||
if (value == null) { | |||
return null; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as I mentioned in another comment, because of this is called first, we'll never reach the if(fieldValue == null)
block in method compare
. I removed this block here, and added to cases below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right now, if type = TEXT_CONTAINS_ALL, or anything that is not in the switch
block, and value == null
, this method returns null. Is this the expected behavior? I think we wanted to throw exception in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, throwing for unknown type before checking either side does seem better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 after removing the if (value == null) return null
block, unknown type throws exception regardless of input value being null.
...-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/values/RelOpValue.java
Outdated
Show resolved
Hide resolved
...-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/values/RelOpValue.java
Show resolved
Hide resolved
@@ -417,7 +417,7 @@ protected void testParameterComparison(String name, String field, Object val1, C | |||
final Bindings bindings = Bindings.newBuilder() | |||
.set("fooParam", val2) | |||
.build(); | |||
if (val1 != null && val2 != null && (type == Comparisons.Type.IN && !(val2 instanceof List) || type.name().startsWith("TEXT_"))) { | |||
if (val1 != null && val2 != null && (type == Comparisons.Type.IN && !(val2 instanceof List)) || type.name().startsWith("TEXT_")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was a bug. Why it worked before: when type = TEXT_CONTAINS_ALL, it gets into assertEquals
and eventually, Comparisons.evalComparison
, and because value == null
, it returns null.
...re/src/test/java/com/apple/foundationdb/relational/recordlayer/query/StandardQueryTests.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like there should be some changes added to ExpressionVisitor.java
that I do not see in this PR, but I am not sure.
is there any specific queries that you are concerned about? I think this is a feature on the "operator" level, not "expression" level. |
...-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/values/RelOpValue.java
Outdated
Show resolved
Hide resolved
...-core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/values/RelOpValue.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only looked it very much on the surface. Please asked Mike to give you a review on the Comparisons part as that's shared between the planners and runtimes.
} | ||
} else if (comparand == null) { | ||
return 1; | ||
return toComparable(fieldValue).compareTo(toComparable(comparand)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method currently implements full collation for use in contexts where NULL
is to be considered just another value. For example, that is how index entries sort with NULLS FIRST
. It is still reasonable to use such a function in contexts that implement tri-valued logic by guarding calls, as many callers do today.
It is equally reasonable to have this method only implement comparison between non-null values and to expect all callers to deal with nulls, either sorting them first or producing unknown, as is appropriate to their context. And both contexts are appropriate in the larger system, which must deal sometimes with the "query" view that has unknown and other times with the "storage" view that does not. It might be nice to have a clearer separation so that it is unambiguous which view applies, perhaps by using entirely different currency and not just bare Object
.
What this change does is something in between, which doesn't seem right. Specifically, if this does not handle null
at all, then it ought not to have @Nullable
. The signature should also be changed. I am surprised that the IDE / style checker(s) doesn't flag this for a possible NPE. But otherwise, it's okay provided all callers now check both sides.
private static Boolean compareEquals(Object value, Object comparand) { | ||
if (value == null || comparand == null) { | ||
return null; | ||
private static boolean compareNotDistinctFrom(Object value, Object comparand) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have @Nullable
explicitly to be clear what the expectations are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
@@ -285,6 +285,9 @@ private static Boolean compareStartsWith(@Nullable Object value, @Nullable Objec | |||
@Nullable | |||
@SpotBugsSuppressWarnings("NP_BOOLEAN_RETURN_NULL") | |||
private static Boolean compareLike(@Nullable Object value, @Nullable Object pattern) { | |||
if (value == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this want to similarly handle LIKE ?
with a NULL
binding, rather than getting Illegal pattern
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked that in sql, where x like a
returns null when x or a is null, so I think we should follow that here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the if(value == null)
block from here though, because LikeOperatorValue.likeOperation
already handles nulls operations.
@@ -707,28 +713,44 @@ public static Type invertComparisonType(@Nonnull final Comparisons.Type type) { | |||
@Nullable | |||
@SpotBugsSuppressWarnings("NP_BOOLEAN_RETURN_NULL") | |||
public static Boolean evalComparison(@Nonnull Type type, @Nullable Object value, @Nullable Object comparand) { | |||
if (value == null) { | |||
return null; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, throwing for unknown type before checking either side does seem better.
} else { | ||
return toComparable(fieldValue).compareTo(toComparable(comparand)); | ||
return toClassWithRealEquals(value).equals(toClassWithRealEquals(comparand)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make toClassWithRealEquals
also @Nonnull
and remove the null
check in it, to further ensure that all callers are similarly consistent with the rest of this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* The {@code isDistinctFrom} function. | ||
*/ | ||
@AutoService(BuiltInFunction.class) | ||
public static class IsDistinctFromFn extends BuiltInFunction<Value> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add unit tests for this function, as well as NotDistinctFromFn
? You can extend the unit tests in BooleanValueTest
and add tests for these functions with variations of argument types so you can test the underlying physical operators.
a734369
to
6b82cb2
Compare
return null; | ||
if (type == Comparisons.Type.IS_DISTINCT_FROM) { | ||
return arg1 != null || arg2 != null; | ||
} else if (type == Comparisons.Type.NOT_DISTINCT_FROM) { | ||
return arg1 == null && arg2 == null; | ||
} else { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is interesting that we have this early treatment of null
values here instead of delegating it to the actual underlying physical operator.
I opened an issue to fix this, as well as introducing NULL
type code handling (instead of the deprecated UNKNOWN
).
#3526
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I'm not sure about this as well, but putting it in the operators will result in a lot of duplicated code. Fixing it in a separate PR sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@@ -894,6 +929,119 @@ private enum BinaryPhysicalOperator { | |||
GT_IDN(Comparisons.Type.GREATER_THAN, Type.TypeCode.UUID, Type.TypeCode.NULL, (l, r) -> null), | |||
GTE_NID(Comparisons.Type.GREATER_THAN_OR_EQUALS, Type.TypeCode.NULL, Type.TypeCode.UUID, (l, r) -> null), | |||
GTE_IDN(Comparisons.Type.GREATER_THAN_OR_EQUALS, Type.TypeCode.UUID, Type.TypeCode.NULL, (l, r) -> null), | |||
|
|||
IS_DISTINCT_FROM_BU(Comparisons.Type.IS_DISTINCT_FROM, Type.TypeCode.BOOLEAN, Type.TypeCode.NULL, (l, r) -> Objects.nonNull(l)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you adjust the naming for these?
I think you're using the letter U
from when those were unknown, but looking at the rest of the file, you'd want to use N
for nulls.
For example IS_DISTINCT_FROM_BU
-> IS_DISTINCT_FROM_BN
This resolves #3504.