Skip to content

add support for ordering by expressions #94

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 9, 2024

Conversation

WaVEV
Copy link
Collaborator

@WaVEV WaVEV commented Aug 4, 2024

The compiler can now handle aggregated fields in ordering, and several crashes have been fixed.

@timgraham timgraham changed the title Fix ordering add support for ordering by expressions Aug 8, 2024
@timgraham timgraham force-pushed the fix-ordering branch 2 times, most recently from a84bf11 to 1988672 Compare August 8, 2024 20:12
else:
# The expression must be added to extra_fields with an alias.
field_name = f"__order{next(idx)}"
fields[field_name] = order.expression
Copy link
Collaborator

@timgraham timgraham Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: understand why this changes from extra_fields to fields in the second commit and update comment two lines above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason is:
The refs are actually references to specific fields selected in a query. These references compile to a position number within the as_sql method. To address this, I decided to move the refs to the post-project fields. Additionally, some of these refs are calculated during the project stage (such as annotations), so the refs are processed after the project stage. But the other projected fields that aren't refs have all the info to make the full expression.

Btw, 'refs' also has all the info to be compiled well, like an annotation with name (something that is not __annotation but isn't a column), it has the reference but also the full expression with it. So if we want to delete the extra field we could, the downside will be a field will be calculated twice or more.

Copy link
Collaborator

@timgraham timgraham Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the explanation, it's still difficult to understand. Now in this method, extra_fields is only used for nulls first/last, but then outside this method, all ordering_fields are added to extra_fields when columns is None (I guess because they won't be added to $project in that case). edit: I tried to clarify this with a comment.

idx = itertools.count(start=1)
for order in self.order_by_expressions or []:
if isinstance(order.expression, Ref):
# TODO change?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: understand this to know whether it makes a difference if we use the commented version or the uncommented version.

Copy link
Collaborator Author

@WaVEV WaVEV Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no differences. is okey to just use refs, but maybe we broke an abstraction? but in all part of the code we did (if this is the case)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 mmh now I see the idea, and you are right, we could ignore the refs projection, we don't need them, because they are already managed in $project. So we can remove the refs from the extra fields

@@ -51,28 +50,11 @@ def __init__(self, compiler):
self.lookup_pipeline = None
self.project_fields = None
self.aggregation_pipeline = compiler.aggregation_pipeline
self.extra_fields = None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think now we can pass the compiler alone and the query will know how to fill in all the fields from the compiler.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an idea for a later cleanup, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, to avoid doing

query.something = value
query.anotherthing = other_value
...

@WaVEV WaVEV marked this pull request as ready for review August 9, 2024 05:00
@@ -47,7 +47,7 @@ def count(self, compiler, connection, resolve_inner_expression=False, **extra_co
source_expressions = node.get_source_expressions()
filter_ = deepcopy(self.filter)
filter_.add(
WhereNode([Exact(source_expressions[0], Value(None))], negated=True),
WhereNode([IsNull(source_expressions[0], True)], negated=True),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw your comment about this in the previous PR, but does IT belong in this one? Maybe it affects a Django test we haven't triaged yet?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it doesn't affect any current test, but I don't want to forget this tiny change. This IsNull is needed when the source_expression[0] is a foreign field. maybe we should create another ticket in order to make it more traceable ?

Copy link
Collaborator

@timgraham timgraham Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks. We could merge it now. Just put it in a separate PR with the explanation, please. I'll remove it here when I squash the final edits. edit: I created #98.

@timgraham timgraham merged commit a58e5e0 into mongodb:main Aug 9, 2024
3 checks passed
@WaVEV WaVEV deleted the fix-ordering branch August 26, 2024 15:43
@WaVEV WaVEV restored the fix-ordering branch August 26, 2024 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants