8 changes: 4 additions & 4 deletions django_mongodb_backend/aggregates.py
@@ -73,7 +73,7 @@ def stddev_variance(self, compiler, connection, **extra_context):


 def register_aggregates():
-    Aggregate.as_mql = aggregate
-    Count.as_mql = count
-    StdDev.as_mql = stddev_variance
-    Variance.as_mql = stddev_variance
+    Aggregate.as_mql_expr = aggregate
+    Count.as_mql_expr = count
+    StdDev.as_mql_expr = stddev_variance
+    Variance.as_mql_expr = stddev_variance
Comment on lines +76 to +79
Collaborator

I see we have as_mql_expr(), as_mql_path(), and as_mql(..., as_path=...). If this is the way we keep it, it would be good to explain in the design document which objects (aggregate, func, expression, etc.) get which.

I wonder about renaming as_mql_expr() or as_mql_path() to as_mql() (i.e. treating one of the paths as the default). Do you think it would be more or less confusing?

Collaborator Author

Yes, that was the idea. I’ll explain it in the docs, and we might also consider renaming some methods. The core concept is:

  • Every expression has an as_mql method.
  • In some cases, it’s simpler to implement as_mql directly, so those methods don’t follow the common expression flow.
  • For other expressions, as_mql is a composite function that delegates to as_path or as_expr when applied.
  • The base_expression.as_mql method controls when these are called and performs boilerplate checks to prevent nesting an expr inside another expr (a MongoDB 6 restriction).

In short: every object has as_mql. Some also define as_path and as_expr. The base_expression coordinates how these methods are used, except for cases where as_mql is defined directly.
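
A minimal sketch of the dispatch described above, assuming simplified signatures (the real methods also take compiler and connection). BaseExpressionSketch, GreaterThan, and ExprOnly are hypothetical names used only for illustration; the branching follows the base_expression function in this PR's expressions/builtins.py.

```python
class BaseExpressionSketch:
    """Hypothetical stand-in for an expression; not the backend's real class."""

    can_use_path = False

    def as_mql(self, as_path=False):
        # Use the path form only when requested, implemented, and allowed.
        if as_path and hasattr(self, "as_mql_path") and self.can_use_path:
            return self.as_mql_path()
        expr = self.as_mql_expr()
        # In a path context, an aggregation expression is wrapped in $expr.
        return {"$expr": expr} if as_path else expr


class GreaterThan(BaseExpressionSketch):
    """Hypothetical lookup that defines both forms."""

    can_use_path = True

    def __init__(self, field, value):
        self.field = field
        self.value = value

    def as_mql_expr(self):
        return {"$gt": [f"${self.field}", self.value]}

    def as_mql_path(self):
        return {self.field: {"$gt": self.value}}


class ExprOnly(BaseExpressionSketch):
    """Hypothetical expression that only defines the expr form."""

    def as_mql_expr(self):
        return {"$literal": True}
```

Under these assumptions, GreaterThan("age", 18).as_mql(as_path=True) yields the path form, while ExprOnly().as_mql(as_path=True) falls back to the expression form wrapped in $expr.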

Collaborator Author (@WaVEV, Sep 27, 2025)

Doc here: link

Contributor

@timgraham I actually like the decoupling of as_mql and as_mql_(path | expr). I view it as: you need to define at least two, as_mql and as_mql_expr.

Then, if you want the more optimized function, you define as_mql_path. It feels less confusing to me that way.

Collaborator Author (@WaVEV, Oct 6, 2025)

🤔 An expression needs at least one of the methods; as_mql_expr alone is enough (not optimal, but functional). The aggregation module is a good example, because as_mql is defined in the BaseExpression class. With this approach, most expressions don't need their own as_mql (and if as_mql is defined, the expr and path variants aren't needed).
EDIT: Sorry, I rushed my answer before reading the whole comment; you got the point. 😬

69 changes: 60 additions & 9 deletions django_mongodb_backend/base.py
@@ -2,7 +2,8 @@
 import logging
 import os

-from django.core.exceptions import ImproperlyConfigured
+from bson import Decimal128
+from django.core.exceptions import EmptyResultSet, FullResultSet, ImproperlyConfigured
 from django.db import DEFAULT_DB_ALIAS
 from django.db.backends.base.base import BaseDatabaseWrapper
 from django.db.backends.utils import debug_transaction
@@ -20,7 +21,7 @@
 from .features import DatabaseFeatures
 from .introspection import DatabaseIntrospection
 from .operations import DatabaseOperations
-from .query_utils import regex_match
+from .query_utils import regex_expr, regex_match
 from .schema import DatabaseSchemaEditor
 from .utils import OperationDebugWrapper
 from .validation import DatabaseValidation
@@ -108,7 +109,12 @@ def _isnull_operator(a, b):
         }
         return is_null if b else {"$not": is_null}

-    mongo_operators = {
+    def _isnull_operator_match(a, b):
+        if b:
+            return {"$or": [{a: {"$exists": False}}, {a: None}]}
Contributor

What does b represent in this case? Not negating?

Collaborator Author

🤔 Well, it is a bit confusing here, but this function checks for nullability: if b is True, we check for null. Maybe the parameters should be named field and null.

Contributor

field and is_null are good clarifiers
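
A sketch of how the helper might read with the suggested parameter names. The function name isnull_match is an assumption made for this example; the returned filters are copied from _isnull_operator_match in the diff.

```python
def isnull_match(field, is_null):
    """Build a $match-style filter for an isnull lookup on `field`."""
    if is_null:
        # Match documents where the field is missing or explicitly null.
        return {"$or": [{field: {"$exists": False}}, {field: None}]}
    # Match documents where the field exists and is not null.
    return {"$and": [{field: {"$exists": True}}, {field: {"$ne": None}}]}
```

With these names, a call such as isnull_match("name", True) reads as "name is null" without having to look up what the second positional argument means.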

+        return {"$and": [{a: {"$exists": True}}, {a: {"$ne": None}}]}
+
+    mongo_expr_operators = {
         "exact": lambda a, b: {"$eq": [a, b]},
         "gt": lambda a, b: {"$gt": [a, b]},
         "gte": lambda a, b: {"$gte": [a, b]},
@@ -118,19 +124,64 @@ def _isnull_operator(a, b):
         "lte": lambda a, b: {
             "$and": [{"$lte": [a, b]}, DatabaseWrapper._isnull_operator(a, False)]
         },
-        "in": lambda a, b: {"$in": [a, b]},
+        "in": lambda a, b: {"$in": (a, b)},
         "isnull": _isnull_operator,
         "range": lambda a, b: {
             "$and": [
                 {"$or": [DatabaseWrapper._isnull_operator(b[0], True), {"$gte": [a, b[0]]}]},
                 {"$or": [DatabaseWrapper._isnull_operator(b[1], True), {"$lte": [a, b[1]]}]},
             ]
         },
-        "iexact": lambda a, b: regex_match(a, ("^", b, {"$literal": "$"}), insensitive=True),
-        "startswith": lambda a, b: regex_match(a, ("^", b)),
-        "istartswith": lambda a, b: regex_match(a, ("^", b), insensitive=True),
-        "endswith": lambda a, b: regex_match(a, (b, {"$literal": "$"})),
-        "iendswith": lambda a, b: regex_match(a, (b, {"$literal": "$"}), insensitive=True),
+        "iexact": lambda a, b: regex_expr(a, ("^", b, {"$literal": "$"}), insensitive=True),
+        "startswith": lambda a, b: regex_expr(a, ("^", b)),
+        "istartswith": lambda a, b: regex_expr(a, ("^", b), insensitive=True),
+        "endswith": lambda a, b: regex_expr(a, (b, {"$literal": "$"})),
+        "iendswith": lambda a, b: regex_expr(a, (b, {"$literal": "$"}), insensitive=True),
+        "contains": lambda a, b: regex_expr(a, b),
+        "icontains": lambda a, b: regex_expr(a, b, insensitive=True),
+        "regex": lambda a, b: regex_expr(a, b),
+        "iregex": lambda a, b: regex_expr(a, b, insensitive=True),
+    }
+
+    def range_match(a, b):
+        conditions = []
+        start, end = b
+        if start is not None:
+            conditions.append({a: {"$gte": b[0]}})
+        if end is not None:
+            conditions.append({a: {"$lte": b[1]}})
+        if not conditions:
+            raise FullResultSet
+        if start is not None and end is not None:
+            if isinstance(start, Decimal128):
+                start = start.to_decimal()
Collaborator Author

Potential overflow risk here?

Collaborator

Is there a failing test without the to_decimal() calls? I'm not sure why there would be a mix of Decimal/Decimal128 here.

Contributor

This is a BSON type held at the database and returned locally. I think it should be good.

Collaborator Author

Yes, there is: model_fields_.test_polymorphic_embedded_model.QueryingTests.test_range. Maybe this case shouldn't be handled here, and we should let the query optimizer remove those False cases instead.

Collaborator

The root issue (which could be clarified by a comment) is that Decimal128s aren't comparable using less than / greater than:

>>> Decimal128("3") < Decimal128("2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'Decimal128' and 'Decimal128'

Perhaps this is a reasonable feature request that would allow obsoleting this logic in the future?
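
A possible shape for the comparison step, assuming only that the value exposes to_decimal() as bson.Decimal128 does. FakeDecimal128, comparable, and is_empty_range are hypothetical helpers, written so the sketch runs without pymongo installed; they are not part of the PR.

```python
from decimal import Decimal


class FakeDecimal128:
    """Stand-in for bson.Decimal128, which does not define ordering."""

    def __init__(self, text):
        self._value = Decimal(text)

    def to_decimal(self):
        return self._value


def comparable(value):
    """Return value.to_decimal() when that method exists, else the value itself."""
    to_decimal = getattr(value, "to_decimal", None)
    return to_decimal() if callable(to_decimal) else value


def is_empty_range(start, end):
    """Report whether a closed range [start, end] can never match."""
    if start is None or end is None:
        # An open-ended range is never empty.
        return False
    return comparable(start) > comparable(end)
```

Normalizing both bounds before comparing sidesteps the TypeError shown above while leaving plain Decimal, int, or date bounds untouched.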

+            if isinstance(end, Decimal128):
+                end = end.to_decimal()
+            if start > end:
+                raise EmptyResultSet
+        return {"$and": conditions}
+
+    # match, path, find? don't know which name use.
+    mongo_match_operators = {
+        "exact": lambda a, b: {a: b},
+        "gt": lambda a, b: {a: {"$gt": b}},
+        "gte": lambda a, b: {a: {"$gte": b}},
+        # MongoDB considers null less than zero. Exclude null values to match
+        # SQL behavior.
+        "lt": lambda a, b: {
+            "$and": [{a: {"$lt": b}}, DatabaseWrapper._isnull_operator_match(a, False)]
+        },
+        "lte": lambda a, b: {
+            "$and": [{a: {"$lte": b}}, DatabaseWrapper._isnull_operator_match(a, False)]
+        },
+        "in": lambda a, b: {a: {"$in": tuple(b)}},
+        "isnull": _isnull_operator_match,
+        "range": range_match,
+        "iexact": lambda a, b: regex_match(a, f"^{b}$", insensitive=True),
+        "startswith": lambda a, b: regex_match(a, f"^{b}"),
+        "istartswith": lambda a, b: regex_match(a, f"^{b}", insensitive=True),
+        "endswith": lambda a, b: regex_match(a, f"{b}$"),
+        "iendswith": lambda a, b: regex_match(a, f"{b}$", insensitive=True),
         "contains": lambda a, b: regex_match(a, b),
         "icontains": lambda a, b: regex_match(a, b, insensitive=True),
         "regex": lambda a, b: regex_match(a, b),
8 changes: 4 additions & 4 deletions django_mongodb_backend/compiler.py
@@ -327,14 +327,14 @@ def pre_sql_setup(self, with_col_aliases=False):
             pipeline = self._build_aggregation_pipeline(ids, group)
             if self.having:
                 having = self.having.replace_expressions(all_replacements).as_mql(
-                    self, self.connection
+                    self, self.connection, as_path=True
                 )
                 # Add HAVING subqueries.
                 for query in self.subqueries or ():
                     pipeline.extend(query.get_pipeline())
                 # Remove the added subqueries.
                 self.subqueries = []
-                pipeline.append({"$match": {"$expr": having}})
+                pipeline.append({"$match": having})
             self.aggregation_pipeline = pipeline
         self.annotations = {
             target: expr.replace_expressions(all_replacements)
@@ -481,11 +481,11 @@ def build_query(self, columns=None):
         query.lookup_pipeline = self.get_lookup_pipeline()
         where = self.get_where()
         try:
-            expr = where.as_mql(self, self.connection) if where else {}
+            match = where.as_mql(self, self.connection, as_path=True) if where else {}
Contributor

Since we enforce as_path=True here, can we set that as the default on WhereNode.as_mql(..., as_path=True)?

Collaborator Author

Yes, I will refactor. The name may also change to as_expr=False, but it is essentially the same.

         except FullResultSet:
             query.match_mql = {}
         else:
-            query.match_mql = {"$expr": expr}
+            query.match_mql = match
         if extra_fields:
             query.extra_fields = self.get_project_fields(extra_fields, force_expression=True)
         query.subqueries = self.subqueries
60 changes: 42 additions & 18 deletions django_mongodb_backend/expressions/builtins.py
@@ -6,6 +6,7 @@
 from django.core.exceptions import EmptyResultSet, FullResultSet
 from django.db import NotSupportedError
 from django.db.models.expressions import (
+    BaseExpression,
     Case,
     Col,
     ColPairs,
@@ -28,6 +29,14 @@
 from django_mongodb_backend.query_utils import process_lhs


+def base_expression(self, compiler, connection, as_path=False, **extra):
+    if as_path and hasattr(self, "as_mql_path") and getattr(self, "can_use_path", False):
+        return self.as_mql_path(compiler, connection, **extra)
+
+    expr = self.as_mql_expr(compiler, connection, **extra)
+    return {"$expr": expr} if as_path else expr
+
+
 def case(self, compiler, connection):
     case_parts = []
     for case in self.cases:
@@ -76,11 +85,11 @@ def col(self, compiler, connection, as_path=False):  # noqa: ARG001
     return f"{prefix}{self.target.column}"


-def col_pairs(self, compiler, connection):
+def col_pairs(self, compiler, connection, as_path=False):
     cols = self.get_cols()
     if len(cols) > 1:
         raise NotSupportedError("ColPairs is not supported.")
-    return cols[0].as_mql(compiler, connection)
+    return cols[0].as_mql(compiler, connection, as_path=as_path)


 def combined_expression(self, compiler, connection):
@@ -103,7 +112,7 @@ def order_by(self, compiler, connection):
     return self.expression.as_mql(compiler, connection)


-def query(self, compiler, connection, get_wrapping_pipeline=None):
+def query(self, compiler, connection, get_wrapping_pipeline=None, as_path=False):
     subquery_compiler = self.get_compiler(connection=connection)
     subquery_compiler.pre_sql_setup(with_col_aliases=False)
     field_name, expr = subquery_compiler.columns[0]
@@ -145,14 +154,16 @@ def query(self, compiler, connection, get_wrapping_pipeline=None):
     # Erase project_fields since the required value is projected above.
     subquery.project_fields = None
     compiler.subqueries.append(subquery)
+    if as_path:
+        return f"{table_output}.{field_name}"
     return f"${table_output}.{field_name}"


 def raw_sql(self, compiler, connection):  # noqa: ARG001
     raise NotSupportedError("RawSQL is not supported on MongoDB.")


-def ref(self, compiler, connection):  # noqa: ARG001
+def ref(self, compiler, connection, as_path=False):  # noqa: ARG001
     prefix = (
         f"{self.source.alias}."
         if isinstance(self.source, Col) and self.source.alias != compiler.collection_name
@@ -162,32 +173,41 @@ def ref(self, compiler, connection):  # noqa: ARG001
         refs, _ = compiler.columns[self.ordinal - 1]
     else:
         refs = self.refs
-    return f"${prefix}{refs}"
+    if not as_path:
+        prefix = f"${prefix}"
+    return f"{prefix}{refs}"


+@property
+def ref_is_simple_column(self):
+    return self.source.is_simple_column
+
+
 def star(self, compiler, connection):  # noqa: ARG001
     return {"$literal": True}


 def subquery(self, compiler, connection, get_wrapping_pipeline=None):
-    return self.query.as_mql(compiler, connection, get_wrapping_pipeline=get_wrapping_pipeline)
+    return self.query.as_mql(
+        compiler, connection, get_wrapping_pipeline=get_wrapping_pipeline, as_path=False
+    )


 def exists(self, compiler, connection, get_wrapping_pipeline=None):
     try:
         lhs_mql = subquery(self, compiler, connection, get_wrapping_pipeline=get_wrapping_pipeline)
     except EmptyResultSet:
         return Value(False).as_mql(compiler, connection)
-    return connection.mongo_operators["isnull"](lhs_mql, False)
+    return connection.mongo_expr_operators["isnull"](lhs_mql, False)


 def when(self, compiler, connection):
     return self.condition.as_mql(compiler, connection)


-def value(self, compiler, connection):  # noqa: ARG001
+def value(self, compiler, connection, as_path=False):  # noqa: ARG001
     value = self.value
-    if isinstance(value, (list, int)):
+    if isinstance(value, (list, int)) and not as_path:
         # Wrap lists & numbers in $literal to prevent ambiguity when Value
         # appears in $project.
         return {"$literal": value}
@@ -210,20 +230,24 @@ def value(self, compiler, connection):  # noqa: ARG001


 def register_expressions():
-    Case.as_mql = case
+    BaseExpression.as_mql = base_expression
+    BaseExpression.is_simple_column = False
+    Case.as_mql_expr = case
     Col.as_mql = col
+    Col.is_simple_column = True
     ColPairs.as_mql = col_pairs
-    CombinedExpression.as_mql = combined_expression
-    Exists.as_mql = exists
+    CombinedExpression.as_mql_expr = combined_expression
+    Exists.as_mql_expr = exists
     ExpressionList.as_mql = process_lhs
-    ExpressionWrapper.as_mql = expression_wrapper
-    NegatedExpression.as_mql = negated_expression
-    OrderBy.as_mql = order_by
+    ExpressionWrapper.as_mql_expr = expression_wrapper
+    NegatedExpression.as_mql_expr = negated_expression
+    OrderBy.as_mql_expr = order_by
     Query.as_mql = query
     RawSQL.as_mql = raw_sql
     Ref.as_mql = ref
+    Ref.is_simple_column = ref_is_simple_column
     ResolvedOuterRef.as_mql = ResolvedOuterRef.as_sql
-    Star.as_mql = star
-    Subquery.as_mql = subquery
-    When.as_mql = when
+    Star.as_mql_expr = star
+    Subquery.as_mql_expr = subquery
+    When.as_mql_expr = when
     Value.as_mql = value
11 changes: 8 additions & 3 deletions django_mongodb_backend/expressions/search.py
@@ -933,11 +933,16 @@ def __str__(self):
     def __repr__(self):
         return f"SearchText({self.lhs}, {self.rhs})"

-    def as_mql(self, compiler, connection):
-        lhs_mql = process_lhs(self, compiler, connection)
-        value = process_rhs(self, compiler, connection)
+    def as_mql_expr(self, compiler, connection):
+        lhs_mql = process_lhs(self, compiler, connection, as_path=False)
+        value = process_rhs(self, compiler, connection, as_path=False)
         return {"$gte": [lhs_mql, value]}
+
+    def as_mql_path(self, compiler, connection):
+        lhs_mql = process_lhs(self, compiler, connection, as_path=True)
+        value = process_rhs(self, compiler, connection, as_path=True)
+        return {lhs_mql: {"$gte": value}}

Comment on lines 936 to 945
Contributor

Why is there a $gte query in a search.text lookup?

Collaborator Author

To convert a score function into a filter I decided to express the following proposition: score_func(...) > 0.
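
For illustration, the two filter shapes from this hunk and where each can sit in a $match stage. The function names and the "score" field are hypothetical; the $expr wrapping mirrors what base_expression does when as_path=True falls back to the expression form.

```python
def score_filter_expr(lhs, value):
    """Aggregation-expression form: operands are listed under the operator."""
    return {"$gte": [lhs, value]}


def score_filter_path(lhs, value):
    """Query (path) form: the field name is the key."""
    return {lhs: {"$gte": value}}


def match_stage(condition, is_path):
    """Build a $match stage; expression-form conditions need $expr."""
    return {"$match": condition if is_path else {"$expr": condition}}
```

Under these assumptions, match_stage(score_filter_path("score", 0), is_path=True) produces a plain query filter, while the expression form references the field as "$score" and is wrapped in $expr.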


CharField.register_lookup(SearchTextLookup)
TextField.register_lookup(SearchTextLookup)
3 changes: 0 additions & 3 deletions django_mongodb_backend/features.py
@@ -90,9 +90,6 @@ class DatabaseFeatures(GISFeatures, BaseDatabaseFeatures):
         "auth_tests.test_views.LoginTest.test_login_session_without_hash_session_key",
         # GenericRelation.value_to_string() assumes integer pk.
         "contenttypes_tests.test_fields.GenericRelationTests.test_value_to_string",
-        # icontains doesn't work on ArrayField:
-        # Unsupported conversion from array to string in $convert
-        "model_fields_.test_arrayfield.QueryingTests.test_icontains",
         # ArrayField's contained_by lookup crashes with Exists: "both operands "
         # of $setIsSubset must be arrays. Second argument is of type: null"
         # https://jira.mongodb.org/browse/SERVER-99186