Skip to content

conform SQLCompiler.execute_sql() and results_iter() to Django's API #70

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 10, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 25 additions & 42 deletions django_mongodb/compiler.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
from django.core.exceptions import EmptyResultSet, FieldDoesNotExist, FullResultSet
from django.db import DatabaseError, IntegrityError, NotSupportedError
from django.db.models import NOT_PROVIDED, Count, Expression
from django.db.models import Count, Expression
from django.db.models.aggregates import Aggregate
from django.db.models.constants import LOOKUP_SEP
from django.db.models.sql import compiler
Expand All @@ -23,14 +23,21 @@ def execute_sql(
# QuerySet.count()
if self.query.annotations == {"__count": Count("*")}:
return [self.get_count()]
# Specify columns if there are any annotations so that annotations are
# computed via $project.
columns = self.get_columns() if self.query.annotations else None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we lost an optimization here where $project is omitted unless needed (i.e. there are annotations).

For example, the first query in basic.tests.ModelLookupTest.test_equal_lookup went from:

basic_article.aggregate([{'$match': {'$expr': {'$eq': ['$_id', ObjectId('668da533d38a1c8d3ee7708b')]}}}, {'$limit': 21}]);`

to

basic_article.aggregate([{'$match': {'$expr': {'$eq': ['$_id', ObjectId('668da57447faa4a8f994ebd1')]}}}, {'$project': {'_id': 1, 'headline': 1, 'pub_date': 1}}, {'$limit': 21}]);

We could fix it by making self.get_columns() if self.query.annotations else None to build_query() but there might be a better place to put it.

Adding a test to the Django fork for this at some point would probably be a good idea.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I added this new layer thinking that is not a big deal. If I change the code a bit, the old behavior is preserved. So, I will change this code in order to keep the previous behavior.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how much it matters performance-wise, but particularly when debugging queries, I think it's cleaner not to include $project if it's not needed, as long as the solution for doing so doesn't get super complicated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not. As you mention above, we need to pass a None to build query if not self.query.annotation. Running the test, will update soon.


columns = self.get_columns()
try:
query = self.build_query(columns)
query = self.build_query(
# Avoid $project (columns=None) if unneeded.
columns if self.query.annotations or not self.query.default_cols else None
)
except EmptyResultSet:
return None
return query.fetch()
return iter([]) if result_type == MULTI else None

return (
(self._make_result(row, columns) for row in query.fetch())
if result_type == MULTI
else self._make_result(next(query.fetch()), columns)
)

def results_iter(
self,
Expand All @@ -43,37 +50,23 @@ def results_iter(
Return an iterator over the results from executing query given
to this compiler. Called by QuerySet methods.
"""
columns = self.get_columns()

if results is None:
# QuerySet.values() or values_list()
try:
results = self.build_query(columns).fetch()
except EmptyResultSet:
results = []
results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size)

converters = self.get_converters(columns)
for entity in results:
yield self._make_result(entity, columns, converters, tuple_expected=tuple_expected)
fields = [s[0] for s in self.select[0 : self.col_count]]
converters = self.get_converters(fields)
rows = results
if converters:
rows = self.apply_converters(rows, converters)
if tuple_expected:
rows = map(tuple, rows)
return rows

def has_results(self):
return bool(self.get_count(check_exists=True))

def get_converters(self, expressions):
converters = {}
for name_expr in expressions:
try:
name, expr = name_expr
except TypeError:
# e.g., Count("*")
continue
backend_converters = self.connection.ops.get_db_converters(expr)
field_converters = expr.get_db_converters(self.connection)
if backend_converters or field_converters:
converters[name] = backend_converters + field_converters
return converters

def _make_result(self, entity, columns, converters, tuple_expected=False):
def _make_result(self, entity, columns):
"""
Decode values for the given fields from the database entity.

Expand All @@ -82,7 +75,6 @@ def _make_result(self, entity, columns, converters, tuple_expected=False):
"""
result = []
for name, col in columns:
field = col.field
column_alias = getattr(col, "alias", None)
obj = (
# Use the related object...
Expand All @@ -91,16 +83,7 @@ def _make_result(self, entity, columns, converters, tuple_expected=False):
if column_alias is not None and column_alias != self.collection_name
else entity
)
value = obj.get(name, NOT_PROVIDED)
if value is NOT_PROVIDED:
value = field.get_default()
elif converters:
# Decode values using Django's database converters API.
for converter in converters.get(name, ()):
value = converter(value, col, self.connection)
result.append(value)
if tuple_expected:
result = tuple(result)
result.append(obj.get(name, col.field.get_default()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you added the get_default() bit back in a separate commit. Were there any test failures before that? The code came from the mongo engine fork and I've been skeptical of it. It seems it could be related to missing columns due to schema changes, but I'm not sure it's the behavior we want. Just wondering if you have any analysis to add.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there was. A JSONField test, don't remember the name (not_provided wasn't a json). But basically I copied the logic from Django. It is not that deep like types changes or so.
Currently we are flagging those fields with NOT_PROVIDED, then we check if it not provided we assign the default, other case we apply converters. you can see the code here:
https://github.com/WaVEV/django-mongodb/blob/main/django_mongodb/compiler.py#L94
Now a default value is assigned and then don't care about the converters. The only thing to point out is: the code is running converters over default values. But Django does the same if I am not mistaken.

return result

def check_query(self):
Expand Down