Merged
Changes from 3 commits
5 changes: 4 additions & 1 deletion bson/__init__.py
@@ -1006,7 +1006,10 @@ def _dict_to_bson(
             elements.append(_name_value_to_bson(b"_id\x00", doc["_id"], check_keys, opts))
         for key, value in doc.items():
             if not top_level or key != "_id":
-                elements.append(_element_to_bson(key, value, check_keys, opts))
+                try:
+                    elements.append(_element_to_bson(key, value, check_keys, opts))
+                except InvalidDocument as err:
+                    raise InvalidDocument(f"Invalid document {doc} | {err}") from err
Member

I'm not sure this is the pattern we want. In the C version of this code we recursively call write_dict() for subdocuments, which means that if an error occurs in a nested field, the message will be something like:
Invalid document {"a": {"b": {"c": ...}}} | Invalid document {"b": {"c": ...}} | Invalid document {"c": ...} | ...

Is that the intended behavior?

Contributor Author
@navjots18 Sep 16, 2024

@ShaneHarvey You're right. We can check the top_level param to prevent this from happening.
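
The repeated prefix and the top_level guard discussed above can be sketched in plain Python. This is an illustration under assumptions, not the PR's code: a toy recursive validator stands in for `_dict_to_bson`, and `InvalidDocument`, `check_value`, `walk`, and `error_message` are hypothetical names defined only for this example. Wrapping the error only when `top_level` is true should leave a single "Invalid document" prefix, however deep the failing field is.

```python
class InvalidDocument(Exception):
    """Hypothetical stand-in for bson.errors.InvalidDocument."""


def check_value(value):
    # Toy rule for this sketch: only ints, strs, and nested dicts are accepted.
    if isinstance(value, dict):
        walk(value, top_level=False)
    elif not isinstance(value, (int, str)):
        raise InvalidDocument(f"cannot encode object: {value!r}")


def walk(doc, top_level=True):
    for value in doc.values():
        try:
            check_value(value)
        except InvalidDocument as err:
            if top_level:
                # Wrap only once, at the outermost document.
                raise InvalidDocument(f"Invalid document {doc} | {err}") from err
            # Nested levels re-raise unchanged, so the prefix is not repeated.
            raise


def error_message(doc):
    """Return the error text for doc, or None if it passes the toy check."""
    try:
        walk(doc)
    except InvalidDocument as err:
        return str(err)
    return None


print(error_message({"a": {"b": {"c": 1.5}}}))
```

Without the `if top_level:` guard, each level of recursion would add its own "Invalid document ..." prefix, which is the chained message the reviewer describes.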

     except AttributeError:
         raise TypeError(f"encoder expected a mapping type but got: {doc!r}") from None

36 changes: 36 additions & 0 deletions bson/_cbsonmodule.c
@@ -1743,6 +1743,42 @@ int write_dict(PyObject* self, buffer_t buffer,
     while (PyDict_Next(dict, &pos, &key, &value)) {
         if (!decode_and_write_pair(self, buffer, key, value,
                                 check_keys, options, top_level)) {
+            if (PyErr_Occurred()) {
+                PyObject *etype, *evalue, *etrace;
+                PyErr_Fetch(&etype, &evalue, &etrace);
+                PyObject *InvalidDocument = _error("InvalidDocument");
Member

There are a lot of missing error/NULL checks in this code. I'm not comfortable merging this for 4.9, so it will need to wait for the next release.

Contributor Author

@blink1073 @ShaneHarvey I based this on code already written in the same file. Could you point me to some existing code that shows which error/NULL checks I should add here?

Member

Hi @navjots18, apologies for the delay. I think all that's missing in this block is initializing etype, evalue, and etrace to NULL, as we do here. Otherwise, I agree this matches the rest of the existing code.

Contributor Author
@navjots18 Nov 7, 2024

Hey @blink1073, I've fixed these issues. Can you review again?


+                if (PyErr_GivenExceptionMatches(etype, InvalidDocument)) {
Member

Can you please reduce the nesting by using if (InvalidDocument && PyErr_GivenExceptionMatches(etype, InvalidDocument))?

+                    if (InvalidDocument) {
+                        Py_DECREF(etype);
+                        etype = InvalidDocument;
+
+                        if (evalue) {
+                            PyObject *msg = PyObject_Str(evalue);
+                            Py_DECREF(evalue);
+
+                            if (msg) {
+                                // Prepend doc to the existing message
+                                PyObject *dict_str = PyObject_Str(dict);
+                                PyObject *new_msg = PyUnicode_FromFormat("Invalid document %s | %s", PyUnicode_AsUTF8(dict_str), PyUnicode_AsUTF8(msg));
+                                Py_DECREF(dict_str);
+
+                                if (new_msg) {
+                                    evalue = new_msg;
+                                }
+                                else {
+                                    evalue = msg;
+                                }
+                            }
+                        }
+                        PyErr_NormalizeException(&etype, &evalue, &etrace);
+                    }
+                }
+                else {
+                    Py_DECREF(InvalidDocument);
+                }
+                PyErr_Restore(etype, evalue, etrace);
+            }
             return 0;
         }
     }
1 change: 1 addition & 0 deletions doc/contributors.rst
@@ -102,3 +102,4 @@ The following is a list of people who have contributed to
 - Ivan Lukyanchikov (ilukyanchikov)
 - Terry Patterson
 - Romain Morotti
+- Navjot Singh (navjots18)
13 changes: 13 additions & 0 deletions test/test_bson.py
@@ -1000,6 +1000,19 @@ def __repr__(self):
         ):
             encode({"t": Wrapper(1)})

+    def test_doc_in_invalid_document_error_message(self):
+        class Wrapper:
+            def __init__(self, val):
+                self.val = val
+
+            def __repr__(self):
+                return repr(self.val)
+
+        self.assertEqual("1", repr(Wrapper(1)))
+        doc = {"t": Wrapper(1)}
+        with self.assertRaisesRegex(InvalidDocument, f"Invalid document {doc}"):
+            encode(doc)
+

 class TestCodecOptions(unittest.TestCase):
     def test_document_class(self):
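
A note on the new test: `assertRaisesRegex` treats its second argument as a regular expression and searches the exception text with it. The document used in the test, whose string form is `{'t': 1}`, appears to match itself as a pattern, but a document whose string form contains characters such as `[` or `(` would not. The sketch below uses only the standard library to show the difference `re.escape` makes. The error text is a hypothetical message in the PR's format, not output captured from the library.

```python
import re

doc = {"t": [1, 2]}
# Hypothetical error text in the "Invalid document <doc> | <reason>" format.
message = f"Invalid document {doc} | cannot encode object"

raw_pattern = f"Invalid document {doc}"
escaped_pattern = re.escape(raw_pattern)

# In the raw pattern, "[1, 2]" is read as a character class, so the pattern
# no longer matches the literal text "[1, 2]" in the message.
raw_matches = re.search(raw_pattern, message) is not None

# After re.escape, every character in the pattern is matched literally.
escaped_matches = re.search(escaped_pattern, message) is not None

print(raw_matches, escaped_matches)
```

In a test, the same idea would be to pass `re.escape(f"Invalid document {doc}")` as the pattern whenever the document may contain regex metacharacters.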