-
Notifications
You must be signed in to change notification settings - Fork 11
feat: add support for self-joins with sub-query #220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...nt-store/src/main/java/org/hypertrace/core/documentstore/expression/impl/JoinExpression.java
Outdated
Show resolved
Hide resolved
...nt-store/src/main/java/org/hypertrace/core/documentstore/expression/impl/JoinExpression.java
Outdated
Show resolved
Hide resolved
...nt-store/src/main/java/org/hypertrace/core/documentstore/expression/impl/JoinExpression.java
Outdated
Show resolved
Hide resolved
.../src/main/java/org/hypertrace/core/documentstore/expression/impl/SubQueryFromExpression.java
Outdated
Show resolved
Hide resolved
...ore/src/main/java/org/hypertrace/core/documentstore/expression/impl/TableFromExpression.java
Outdated
Show resolved
Hide resolved
...tore/src/test/java/org/hypertrace/core/documentstore/expression/impl/JoinExpressionTest.java
Outdated
Show resolved
Hide resolved
...main/java/org/hypertrace/core/documentstore/expression/impl/AliasedIdentifierExpression.java
Outdated
Show resolved
Hide resolved
.../test/java/org/hypertrace/core/documentstore/expression/impl/SubQueryJoinExpressionTest.java
Outdated
Show resolved
Hide resolved
.../test/java/org/hypertrace/core/documentstore/expression/impl/SubQueryJoinExpressionTest.java
Outdated
Show resolved
Hide resolved
| {"date":"2014-03-01T09:00:00Z","item":"Mirror","quantity":1}, | ||
| {"date":"2014-04-04T11:21:39.736Z","item":"Shampoo","quantity":20}, | ||
| {"date":"2016-02-06T20:20:13Z","item":"Soap","quantity":5} | ||
| ] No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Empty line at the end of this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it
| [ | ||
| {"date":"2015-09-10T08:43:00Z","item":"Comb","quantity":10}, | ||
| {"date":"2014-03-01T09:00:00Z","item":"Mirror","quantity":1}, | ||
| {"date":"2014-04-04T11:21:39.736Z","item":"Shampoo","quantity":20}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can we format this file as JSON?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it
| */ | ||
| @Value | ||
| public class AliasedIdentifierExpression extends IdentifierExpression { | ||
| String contextAlias; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just alias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alias sounds like we are giving the field name as alias (like in SelectionSpec. If you look at the example above, this is not the alias for the field, but the alias for the context in which we are referring the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
|
|
||
| @Override | ||
| public String toString() { | ||
| return String.format("JOIN (%s) AS %s ON %s", subQuery, subQueryAlias, joinCondition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
| return String.format("JOIN (%s) AS %s ON %s", subQuery, subQueryAlias, joinCondition); | |
| return String.format("JOIN (%s) AS %s ON (%s)", subQuery, subQueryAlias, joinCondition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it
| List.of( | ||
| query -> singleton(getFilterClause(query, Query::getFilter)), | ||
| MongoFromTypeExpressionParser::getFromClauses, | ||
| query -> new MongoFromTypeExpressionParser(this).getFromClauses(query), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any way to avoid passing down the this pointer? It seems to be going away from the standard way.
| public BasicDBObject visit(LogicalExpression expression) { | ||
| BasicDBObject letClause = new BasicDBObject(); | ||
| for (FilterTypeExpression operand : expression.getOperands()) { | ||
| letClause.putAll((Map<String, Object>) operand.accept(this)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to be casting BasicDBObject into a Map. This could be confusing (probably compiler might be generating a warning).
Can we either convert everything to BasicDBObject or everything to Map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the return type to map
|
|
||
| @Override | ||
| public BasicDBObject visit(ConstantExpression expression) { | ||
| return new BasicDBObject(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this deliberate?
If this is not expected here, why not extend the MongoSelectTypeExpressionParser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is deliberate as this visitor is expected to go through SelectTypeExpressions and only return the fields that have to be declared in the let clause. So any other field other than AliasedIdentifierExpression has to be ignored but it should not throw an error in that case and for the case of AliasedIdentifierExpression it should return the field name that we need to add to the let clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. This may be a common use-case. So, can we create a common base class that returns default (empty map/BasicDBObject) for every implementation and then override it?
That way
- The default returning shared base-class can be re-used
- Implementation here would look concise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it
| AliasedIdentifierExpression aliasedExpression, String subQueryAlias) { | ||
| BasicDBObject letClause = new BasicDBObject(); | ||
| if (aliasedExpression.getContextAlias().equals(subQueryAlias)) { | ||
| letClause.put(aliasedExpression.getName(), "$" + aliasedExpression.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the AliasedExpression contains a nested field like "a.b".
Does it need escaping/encoding?
It would be beneficial to add an integration for the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field name of the aliasedExpression is given by the user and if we can make an assumption that the user won't give a dot in the field name then we should be fine as we are only using the field name here and not the context alias.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused. Does it allow me to pick the value of "a.b" and alias it as "c"?
In other words, are you saying that the alias can't contain a dot, but the identifier name can?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, my bad, I misunderstood your previous comment. I went through it again and it made sense this time. I've updated the implementation to encode the variable name so that it can handle nested fields also. I've also added an integration test for it.
| * Subquery join expression is not supported by default. Override this method to support it. | ||
| */ | ||
| default <T> T visit(SubQueryJoinExpression subQueryJoinExpression) { | ||
| throw new UnsupportedOperationException("This operation is not supported"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically, we should force all the implementors of visitor interface to implement all the methods.
Whether to throw exception or not should be the implementor's responsibility.
We probably created the base class
MongoSelectTypeExpressionParser for a similar purpose (without having a default implementation in the interface).
The reason being, it may make sense to implement this in other implementations (other than MongoDB).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it
| <T> T visit(final IdentifierExpression expression); | ||
|
|
||
| default <T> T visit(final AliasedIdentifierExpression expression) { | ||
| throw new UnsupportedOperationException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Not a typical visitor flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated it
| */ | ||
| @Value | ||
| public class AliasedIdentifierExpression extends IdentifierExpression { | ||
| String contextAlias; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
| return key.replace("\\u002e", FIELD_SEPARATOR).replace("\\u0024", PREFIX).replace("\\\\", "\\"); | ||
| } | ||
|
|
||
| public static String encodeVariableName(final String variableName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the existing "encode" method in MongoUtils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Encode method in MongoUtils also adds backslashes. Backslashes are allowed in keys but not in variable names, as mentioned here - https://www.mongodb.com/docs/manual/reference/aggregation-variables/.
So I had to introduce a new method to encode variable names.
| public Map<String, Object> visit(KeyExpression expression) { | ||
| return Collections.emptyMap(); | ||
| } | ||
|
|
||
| @Override | ||
| public Map<String, Object> visit(ArrayRelationalFilterExpression expression) { | ||
| return Collections.emptyMap(); | ||
| } | ||
|
|
||
| @Override | ||
| public Map<String, Object> visit(DocumentArrayFilterExpression expression) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to override these, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that it might be confusing because I put the inner class in between the outer class methods.
Did you think that these are methods of the inner class? If so, please note that these are outer class methods and since it implements filter type expression it has to implement these methods as well.
To avoid confusion, I've moved the inner class below the outer class methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah. That was the confusion. Makes sense to keep the inner classes towards the end.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #220 +/- ##
============================================
- Coverage 79.73% 79.61% -0.12%
- Complexity 1043 1048 +5
============================================
Files 199 204 +5
Lines 4945 5083 +138
Branches 410 416 +6
============================================
+ Hits 3943 4047 +104
- Misses 711 740 +29
- Partials 291 296 +5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description
This PR has changes to add support for self-joins with sub-query (represented by
SubQueryJoinExpression) in MongoDB.Note that postgres impl of
FromTypeExpressionVisitorfor visitingSubQueryJoinExpressionthrows anUnsupportedOperationException. Currently, only the mongo implementation supports this operation.Testing
Added an integration test for testing the mongo impl of self-join with sub-query. It also demonstrates its sample usage