Fix EmbeddedModelField crash when retrieving model where field name isn't present in db data #407

ddrondo · 2025-09-19T12:33:54Z

If an embedded field is empty, retrieving the model raises an error about the absence of this field.

timgraham · 2025-09-19T12:40:59Z

Please give complete steps to reproduce the issue, including the traceback. Logically, it seems incorrect that adding blank=True to a field should disable its converters. Perhaps if a converter crashes on a blank value, that converter needs to be fixed.

ddrondo · 2025-09-19T12:54:48Z

 "route_fares": {
    "tent": {
    },
  },

Not all documents contain a fixed format, sometimes there is no need to specify a field. Django does not know how to support this, because it was created for relational databases. This implementation can partially eliminate the need for this. I agree that blank is not suitable for this idea. Perhaps you need to come up with another parameter for this solution, for example, void=True

timgraham · 2025-09-19T17:15:37Z

What does your model look like?

ddrondo · 2025-09-19T18:12:07Z

class RouteFares(EmbeddedModel):
    tent = EmbeddedModelField(RouteFareSize, null=True, blank=True)
    wagon = EmbeddedModelField(RouteFareSize, null=True, blank=True)
    open = EmbeddedModelField(RouteFareSize, null=True, blank=True)
    ref = EmbeddedModelField(RouteFareSize, null=True, blank=True)
    isotherm = EmbeddedModelField(RouteFareSize, null=True, blank=True)


class Tariff(models.Model):
    ...
    route_fares = EmbeddedModelField(RouteFares)

timgraham · 2025-09-20T00:19:40Z

> Not all documents contain a fixed format, sometimes there is no need to specify a field. Django does not know how to support this, because it was created for relational databases.

Got it. We've been considering whether or not to try to support this sort of sparse data that wasn't written by Django. We've identified some other problems in passing (#275, #401) but it's unclear that supporting this use case is worth it. Writing rigorous tests for sparse data would be a large effort.

ddrondo · 2025-09-21T09:36:57Z

Definitely needed. MongoDB is a non-relational database for a reason.

timgraham · 2025-09-23T22:30:10Z

I'm guessing the exception is similar to this:

>>> Tariff.objects.all()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/tim/code/django/django/db/models/query.py", line 360, in __repr__
    data = list(self[: REPR_OUTPUT_SIZE + 1])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/django/db/models/query.py", line 384, in __iter__
    self._fetch_all()
  File "/home/tim/code/django/django/db/models/query.py", line 1949, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/django/db/models/query.py", line 123, in __iter__
    for row in compiler.results_iter(results):
  File "/home/tim/code/django/django/db/models/sql/compiler.py", line 1541, in apply_converters
    value = converter(value, expression, connection)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb_backend/operations.py", line 192, in convert_embeddedmodelfield_value
    value[field.attname] = converter(value[field.attname], field_expr, connection)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb_backend/operations.py", line 192, in convert_embeddedmodelfield_value
    value[field.attname] = converter(value[field.attname], field_expr, connection)
                                     ~~~~~^^^^^^^^^^^^^^^
KeyError: 'num'

(where "num" is a field of RouteFareSize).

Thus a suitable patch might be:

diff --git a/django_mongodb_backend/operations.py b/django_mongodb_backend/operations.py
index 4b494c3..6d4f033 100644
--- a/django_mongodb_backend/operations.py
+++ b/django_mongodb_backend/operations.py
@@ -189,7 +189,8 @@ class DatabaseOperations(GISOperations, BaseDatabaseOperations):
                     field_expr
                 ) + field_expr.get_db_converters(connection)
                 for converter in converters:
-                    value[field.attname] = converter(value[field.attname], field_expr, connection)
+                    if field.attname in value:
+                        value[field.attname] = converter(value[field.attname], field_expr, connection)
         return value
 
     def convert_jsonfield_value(self, value, expression, connection):
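The effect of the guard can be illustrated without Django or MongoDB. The following is only a minimal sketch using plain dicts; the function names and the `converters_by_field` mapping are hypothetical stand-ins and are not the backend's actual API.

```python
def apply_converters_unguarded(value, converters_by_field):
    """Mimics the old behavior: index the dict unconditionally."""
    for attname, converters in converters_by_field.items():
        for converter in converters:
            # Raises KeyError when attname is absent from the stored data.
            value[attname] = converter(value[attname])
    return value


def apply_converters_guarded(value, converters_by_field):
    """Mimics the patched behavior: skip fields missing from the data."""
    for attname, converters in converters_by_field.items():
        for converter in converters:
            if attname in value:
                value[attname] = converter(value[attname])
    return value


# Hypothetical converter table: the "num" field is converted with int().
converters_by_field = {"num": [int]}

# A sparse embedded document in which "num" was never written.
sparse_document = {}

try:
    apply_converters_unguarded(dict(sparse_document), converters_by_field)
except KeyError as exc:
    print(f"unguarded: KeyError {exc}")

print("guarded:", apply_converters_guarded(dict(sparse_document), converters_by_field))
```

Fields that are present are still converted in the guarded version, so data written by Django is unaffected.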

I gather that you're trying to build a Django application with some data that Django didn't write. Is it otherwise going okay?

ddrondo · 2025-09-24T07:12:38Z

That's right, it's a type of error. In principle, this option can also work, but I would move the condition outside the loop, as there will be unnecessary empty iterations in the loop. However, I don't think this is a good solution, as not everyone expects this behavior, and many follow a strict pattern, which would prevent the error.
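A rough sketch of what moving the condition outside the loop could look like, using plain Python rather than the backend's real code. The `fields` argument (pairs of attribute name and converter list) and the function name are assumptions made for illustration.

```python
def convert_embedded_value(value, fields):
    """Run converters only for fields that are present in the stored data.

    `fields` is an iterable of (attname, converters) pairs, a hypothetical
    simplification of what the backend derives from the embedded model.
    """
    for attname, converters in fields:
        # Membership is tested once per field, before the converter loop,
        # so a missing field costs one dict lookup and no loop iterations.
        if attname not in value:
            continue
        for converter in converters:
            value[attname] = converter(value[attname])
    return value


calls = []


def tracking_upper(raw):
    """Example converter that records each call and upper-cases its input."""
    calls.append(raw)
    return raw.upper()


fields = [("name", [tracking_upper]), ("code", [tracking_upper])]
result = convert_embedded_value({"name": "tent"}, fields)
print(result)  # {'name': 'TENT'}
print(calls)   # ['tent']
```

The result is the same as checking inside the loop; the only difference is that the membership test runs once per field instead of once per converter.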

timgraham · 2025-09-24T11:30:53Z

All the data that Django writes expects converters to run. If converters don't run, field data will be of an unexpected type (e.g. DateField values will be datetime rather than date, DecimalField will be Decimal128 rather than Decimal). This is why your proposed solution of checking blank is incorrect: converters won't run for fields with blank=True, even if the field has data.
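To make the DateField case concrete, here is a small standalone sketch. It assumes the stored value is a `datetime.datetime` and uses a hypothetical `convert_datefield_value` stand-in and a hypothetical `skip_converters_when_blank` switch; neither is the backend's real code.

```python
import datetime


def convert_datefield_value(value):
    """Hypothetical stand-in for a DateField converter: datetime -> date."""
    if isinstance(value, datetime.datetime):
        return value.date()
    return value


def load_field(raw_value, blank, skip_converters_when_blank):
    """Load one field, optionally skipping converters when blank=True."""
    if blank and skip_converters_when_blank:
        # The blank-based approach: the raw stored value leaks through
        # even though the field actually has data.
        return raw_value
    return convert_datefield_value(raw_value)


raw = datetime.datetime(2025, 9, 24)

skipped = load_field(raw, blank=True, skip_converters_when_blank=True)
converted = load_field(raw, blank=True, skip_converters_when_blank=False)

print(type(skipped).__name__)    # datetime
print(type(converted).__name__)  # date
```

Skipping based on blank changes the Python type of populated fields, whereas skipping based on key presence only affects fields that have no data at all.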

ddrondo · 2025-09-24T11:54:05Z

I agree that using blank=True for this is incorrect, but your solution is also incorrect. I think the best solution would be to subclass Field (from django.db.models.fields import Field) and add a new attribute to solve the problem.

timgraham · 2025-09-24T13:29:22Z

As I explained, by disabling an embedded model field's converters, you're going to break those fields. For example, if you want a version of DecimalField that uses Decimal128 instead of Decimal, you'll want to write a custom field (Decimal128Field) and use that on your embedded model.

Nonetheless, you can implement your own custom embedded model field if you believe your proposed solution is suitable for your project.

ddrondo · 2025-09-24T13:48:43Z

I'm not talking about converters, I'm talking about:

if field.attname in value:
      value[field.attname] = converter(value[field.attname], field_expr, connection)

Not everyone expects this behavior, and many follow a strict schema of the model.

timgraham · 2025-09-24T23:40:32Z

Please check if #409 solves your issue.

If not, you'll need to explain a bit more. I'm not sure what a "strict schema of the model" means.

The lines you quoted are where database converters are run on each embedded model field. These cannot be disabled if a field has any data, otherwise the data won't be converted as required (e.g. an EmbeddedModel's DecimalField value would be loaded as Decimal128 instead of Decimal).

ddrondo · 2025-09-25T07:35:19Z

This solution is suitable for me, BUT:
Strict schema as in relational databases. For example:
{ a: [], b: [], }
Let's say it's important for a person that the document contains both 'a' and 'b' fields. However, if we take your proposed solution and a document from the database that only contains the 'a' field and lacks the 'b' field, there may be an error in the backend when working with the document, as the logic is not entirely clear.
Either a different approach is needed, or the documentation should include information about this feature. While this logic is clear to me, as I have encountered this issue, it may not be as obvious to others.
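For anyone who does need that kind of strictness, one option is to check it explicitly in application code rather than relying on the backend to raise. This is only a sketch: `REQUIRED_KEYS` and `missing_keys` are hypothetical names, and the documents are plain dicts.

```python
# Keys the application expects every document to contain (illustrative).
REQUIRED_KEYS = frozenset({"a", "b"})


def missing_keys(document, required=REQUIRED_KEYS):
    """Return the required keys that are absent from the document, sorted."""
    return sorted(required - set(document))


complete = {"a": [], "b": []}
sparse = {"a": []}

print(missing_keys(complete))  # []
print(missing_keys(sparse))    # ['b']
```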

timgraham · 2025-09-25T19:07:43Z

The idea of the fix is to avoid running converters on fields that aren't in the data.

I'm not sure why this is unclear or why you think this approach may result in some other error.

ddrondo · 2025-09-25T19:11:07Z

> The idea of the fix is to avoid running converters on fields that aren't in the data.
>
> I'm not sure why this is unclear or why you think this approach may result in some other error.

I'm not talking about converters.

Jibola · 2025-09-30T14:20:31Z

Hey, @ddrondo thanks for taking the time to point out this issue. I want to specifically address your concern here:

> Let's say it's important for a person that the document contains both 'a' and 'b' fields. However, if we take your proposed solution and a document from the database that only contains the 'a' field and lacks the 'b' field, there may be an error in the backend when working with the document, as the logic is not entirely clear.

I don't think this will be an error folks run into, as there are two primary ways for data to appear in the database. Let's walk through the cases:
1. Data provided by the ORM
In this case, the ORM maintains the invariant that all data is present. For nested embedded documents, if someone has allowed their ORM to leave a subfield empty, then it is not a violation of code expectations, as they are returning a NoneType and there's nothing more we can do except try not to convert blank values. Also, checking whether a field name is a key in the dict is still an O(1) lookup, so while it does incur a cost, it's minimal given that the tradeoff is correctness and fault tolerance.

If someone needs to maintain rigid schema, they should be enforcing & providing default values through the ORM.

2. Data provided externally and used by the ORM
This is the case you've encountered. Information can easily differ from the ORM as MongoDB does not require strict schema. The solution @timgraham provided fixes that issue to ensure we don't crash. Beyond that, the user should check for NoneType.

In both of these cases, the fix @timgraham provided does not introduce a new or unexpected regression. If anything, it maintains that if a field returns as a NoneType from the database it will not crash.

Below is an example of what happens when a field is not present.

>>> Diary.objects.create(title="a", isbn="b", sub_genre=Horror(name="twinkle"))
<Diary: a>
# I went into the database and deleted `sub_genre`
>>> a = Diary.objects.first()
>>> a.sub_genre
>>> a.sub_genre.name
Traceback (most recent call last):
  File "<console>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'name'
>>> a.sub_genre = Horror(name="twinkle")
>>> a.save()
>>> a
<Diary: a>
>>> a.sub_genre.name
'twinkle'
# Now I delete sub_genre.name directly from the db
>>> Diary.objects.first().sub_genre.name
''

See above that we no longer crash on missing key retrieval. The only time an error would occur is if iterating on a NoneType which can/should only be achievable by some manual database mutation or explicit NoneType allowance.
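On the calling side, guarding against a missing embedded value might look like the sketch below. It uses dataclasses as stand-ins for the Diary and Horror models from the session above, and `sub_genre_name` is a hypothetical helper, not part of the backend.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Horror:
    """Stand-in for the embedded model; not a real EmbeddedModel."""
    name: str = ""


@dataclass
class Diary:
    """Stand-in for the parent model; sub_genre may be missing in the db."""
    title: str
    sub_genre: Optional[Horror] = None


def sub_genre_name(diary, default=""):
    """Return the embedded name, or a default when sub_genre is None."""
    if diary.sub_genre is None:
        return default
    return diary.sub_genre.name


print(repr(sub_genre_name(Diary(title="a"))))
print(repr(sub_genre_name(Diary(title="a", sub_genre=Horror(name="twinkle")))))
```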

timgraham · 2025-09-30T18:56:32Z

(The conversation can continue if need be, but the problem should be addressed by #409.)

Update operations.py

7d9b48e

timgraham changed the title ~~Update operations.py~~ Fix EmbeddedModelField crash when retrieving model where field name isn't present in db data Sep 20, 2025

Merge branch 'main' into operations

67a2359

timgraham mentioned this pull request Sep 24, 2025

INTPYTHON-765 Fix crash loading embedded models with missing fields that use database converters #409

Merged

timgraham closed this Sep 30, 2025


3 participants
