From 3e039fddb5229c79033956124bdefe6d6aa5564f Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Fri, 7 Feb 2025 13:35:51 -0500 Subject: [PATCH 1/7] DOCSP-46701-serialization --- source/index.txt | 17 +++++++ source/serialization.txt | 99 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 116 insertions(+) create mode 100644 source/serialization.txt diff --git a/source/index.txt b/source/index.txt index e375ec73..e9f192d1 100644 --- a/source/index.txt +++ b/source/index.txt @@ -24,6 +24,7 @@ MongoDB {+driver-short+} Documentation Data Formats Logging Monitoring + Serialization Third-Party Tools FAQ Troubleshooting @@ -100,6 +101,22 @@ Specialized Data Formats Learn how to work with specialized data formats and custom types in the :ref:`pymongo-data-formats` section. +Logging +------- + +Learn how to configure logging in the :ref:`pymongo-logging` section. + +Monitoring +---------- + +Learn how to monitor changes to your application in the :ref:`pymongo-monitoring` section. + +Serialization +------------- + +Learn how {+driver-short+} serializes and deserializes data in the +:ref:`pymongo-serialization` section. + Third-Party Tools ----------------- diff --git a/source/serialization.txt b/source/serialization.txt new file mode 100644 index 00000000..f2034096 --- /dev/null +++ b/source/serialization.txt @@ -0,0 +1,99 @@ +.. _pymongo-serialization: + +============= +Serialization +============= + +.. facet:: + :name: genre + :values: reference + +.. meta:: + :keywords: class, map, deserialize + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +Overview +-------- + +In this guide, you can learn how to use {+driver-long+} to perform +serialization. + +Serialization is the process of mapping a {+language+} object to a BSON +document for storage in MongoDB. {+driver-short+} automatically converts basic {+language+} +types into BSON when you insert them into a collection. 
Similarly, when you retrieve a +document from a collection, {+driver-short+} automatically converts the returned BSON +back into the corresponding {+language+} types. + +The following list shows some {+language+} types that {+driver-short+} can serialize +and deserialize: + +- Strings (``str``) +- Integers (``int``) +- Floats (``float``) +- Booleans (``bool``) +- Datetimes (``datetime.datetime``) +- Lists (``list``) +- Dictionaries (``dict``) +- None (``None``) + +For a complete list of {+language+}-to-BSON mappings, see the `bson {+api-root+}bson/index.html`__ +API documentation. + +.. note: + + Because the key-value pairs in {+language+} dictionaries are unordered, the order of + fields in serialized BSON documents can differ from the order of fields in the original + dictionary. To preserve the order of keys when serializing and deserializing BSON, + use the `SON <{+api-root+}bson/son.html>`__ class. + +Custom Classes +-------------- + +To serialize and deserialize custom {+language+} classes, you must implement custom logic +to handle the conversion. The following sections show how to serialize and deserialize +custom classes. + +Serializing Custom Classes +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To serialize a custom class, you must convert the class to a dictionary. The following +example serializes a custom class by using the ``vars()`` method, then inserts the +serialized object into a collection: + +.. code-block:: python + + class Restaurant: + def __init__(self, name, cuisine): + self.name = name + self.cuisine = cuisine + + restaurant = Restaurant("Example Cafe", "Coffee") + restaurant_dict = vars(restaurant) + + collection.insert_one(restaurant_dict) + +To learn more about inserting documents into a collection, see the :ref:`pymongo-write-insert` +guide. + +Deserializing Custom Classes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To deserialize a custom class, you must convert the dictionary back into an instance of +the class. 
The following example retrieves a document from a collection, then converts +it back into a ``Restaurant`` object from the preceding example: + +.. code-block:: python + + def deserialize_restaurant(doc): + return Restaurant(name=doc["name"], cuisine=doc["cuisine"]) + + restaurant_doc = collection.find_one({"name": "Example Cafe"}) + restaurant = deserialize_restaurant(restaurant_doc) + +To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve` +guide. From ffe09c14d62f2eefd231d0e094159ade8d87369c Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Fri, 7 Feb 2025 15:04:47 -0500 Subject: [PATCH 2/7] Fixes --- source/serialization.txt | 58 ++++++++++++++++++++++++++++------------ 1 file changed, 41 insertions(+), 17 deletions(-) diff --git a/source/serialization.txt b/source/serialization.txt index f2034096..22c7bf92 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -32,25 +32,18 @@ back into the corresponding {+language+} types. The following list shows some {+language+} types that {+driver-short+} can serialize and deserialize: -- Strings (``str``) -- Integers (``int``) -- Floats (``float``) -- Booleans (``bool``) -- Datetimes (``datetime.datetime``) -- Lists (``list``) -- Dictionaries (``dict``) -- None (``None``) - -For a complete list of {+language+}-to-BSON mappings, see the `bson {+api-root+}bson/index.html`__ +- ``str`` +- ``int`` +- ``float`` +- ``bool`` +- ``datetime.datetime`` +- ``list`` +- ``dict`` +- ``None`` + +For a complete list of {+language+}-to-BSON mappings, see the `bson <{+api-root+}bson/index.html>`__ API documentation. -.. note: - - Because the key-value pairs in {+language+} dictionaries are unordered, the order of - fields in serialized BSON documents can differ from the order of fields in the original - dictionary. To preserve the order of keys when serializing and deserializing BSON, - use the `SON <{+api-root+}bson/son.html>`__ class. 
- Custom Classes -------------- @@ -97,3 +90,34 @@ it back into a ``Restaurant`` object from the preceding example: To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve` guide. + +Serializing Ordered Documents +----------------------------- + +Because the key-value pairs in {+language+} dictionaries are unordered, the order of +fields in serialized BSON documents can differ from the order of fields in the original +dictionary. This behavior can cause issues when {+driver-short+} compares subdocuments +to each other, since {+driver-short+} only considers subdocuments to be equal if their fields +are in identical order. + +To preserve the order of keys when serializing and deserializing BSON, +use the `SON <{+api-root+}bson/son.html>`__ class. You must also configure your collection +to use SON for serialization and deserialization by specifying ``document_class=SON`` +to the ``with_options`` method of a collection. + +The following example retrieves a document +that has a ``location`` field value of ``{"street": "Cafe St", "zipcode": "10003"}`` from +the ``restaurants`` collection: + +.. code-block:: python + + from bson import CodecOptions, SON + + opts = CodecOptions(document_class=SON) + collection = db.get_collection("restaurants") + son_collection = collection.with_options(codec_options=opts) + doc = son_collection.find_one({"location": SON([("street", "Cafe St"), ("zipcode", "10003")])}) + +For more information about subdocument matching, see the +:manual:`Query on Embedded/Nested Documents ` +guide in the {+mdb-server+} documentation. 
\ No newline at end of file From 657c8d4b50678ddab515b8b66bde2f11bcc3f994 Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Fri, 7 Feb 2025 15:13:24 -0500 Subject: [PATCH 3/7] Fix --- source/serialization.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/serialization.txt b/source/serialization.txt index 22c7bf92..bdea5fb5 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -103,7 +103,7 @@ are in identical order. To preserve the order of keys when serializing and deserializing BSON, use the `SON <{+api-root+}bson/son.html>`__ class. You must also configure your collection to use SON for serialization and deserialization by specifying ``document_class=SON`` -to the ``with_options`` method of a collection. +to the ``with_options()`` method of a collection. The following example retrieves a document that has a ``location`` field value of ``{"street": "Cafe St", "zipcode": "10003"}`` from From 10781e09fbb25d71668801495dbb90022b1117d0 Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Fri, 7 Feb 2025 15:23:41 -0500 Subject: [PATCH 4/7] Fix --- source/serialization.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/source/serialization.txt b/source/serialization.txt index bdea5fb5..ec8cbcda 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -20,12 +20,12 @@ Serialization Overview -------- -In this guide, you can learn how to use {+driver-long+} to perform +In this guide, you can learn how to use {+driver-short+} to perform serialization. Serialization is the process of mapping a {+language+} object to a BSON document for storage in MongoDB. {+driver-short+} automatically converts basic {+language+} -types into BSON when you insert them into a collection. Similarly, when you retrieve a +types into BSON when you insert a document into a collection. 
Similarly, when you retrieve a document from a collection, {+driver-short+} automatically converts the returned BSON back into the corresponding {+language+} types. From 6a1efb915ffd911d1251b00990935bee83169b5b Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Mon, 10 Feb 2025 09:46:51 -0500 Subject: [PATCH 5/7] SA feedback --- source/serialization.txt | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/source/serialization.txt b/source/serialization.txt index ec8cbcda..36793097 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -29,8 +29,8 @@ types into BSON when you insert a document into a collection. Similarly, when yo document from a collection, {+driver-short+} automatically converts the returned BSON back into the corresponding {+language+} types. -The following list shows some {+language+} types that {+driver-short+} can serialize -and deserialize: +You can use {+driver-short+} to serialize and deserialize the following {+language+} +types: - ``str`` - ``int`` @@ -55,7 +55,7 @@ Serializing Custom Classes ~~~~~~~~~~~~~~~~~~~~~~~~~~ To serialize a custom class, you must convert the class to a dictionary. The following -example serializes a custom class by using the ``vars()`` method, then inserts the +example serializes a custom class by using the ``vars()`` method, and then inserts the serialized object into a collection: .. code-block:: python @@ -77,7 +77,7 @@ Deserializing Custom Classes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To deserialize a custom class, you must convert the dictionary back into an instance of -the class. The following example retrieves a document from a collection, then converts +the class. The following example retrieves a document from a collection, and then converts it back into a ``Restaurant`` object from the preceding example: .. 
code-block:: python @@ -95,9 +95,9 @@ Serializing Ordered Documents ----------------------------- Because the key-value pairs in {+language+} dictionaries are unordered, the order of -fields in serialized BSON documents can differ from the order of fields in the original +fields in serialized BSON documents can differ from the original dictionary. This behavior can cause issues when {+driver-short+} compares subdocuments -to each other, since {+driver-short+} only considers subdocuments to be equal if their fields +to each other, because {+driver-short+} only considers subdocuments to be equal if their fields are in identical order. To preserve the order of keys when serializing and deserializing BSON, From 2cec5739703d92e986e648e65eb00f94981958fa Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Mon, 10 Feb 2025 11:43:15 -0500 Subject: [PATCH 6/7] NS technical feedback --- source/serialization.txt | 34 +-------- source/troubleshooting.txt | 143 ------------------------------------- 2 files changed, 2 insertions(+), 175 deletions(-) diff --git a/source/serialization.txt b/source/serialization.txt index 36793097..d979b24c 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -48,7 +48,8 @@ Custom Classes -------------- To serialize and deserialize custom {+language+} classes, you must implement custom logic -to handle the conversion. The following sections show how to serialize and deserialize +to handle the conversion. {+driver-short+} cannot serialize custom classes otherwise. The +following sections show how to serialize and deserialize custom classes. Serializing Custom Classes @@ -90,34 +91,3 @@ it back into a ``Restaurant`` object from the preceding example: To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve` guide. 
- -Serializing Ordered Documents ------------------------------ - -Because the key-value pairs in {+language+} dictionaries are unordered, the order of -fields in serialized BSON documents can differ from the original -dictionary. This behavior can cause issues when {+driver-short+} compares subdocuments -to each other, because {+driver-short+} only considers subdocuments to be equal if their fields -are in identical order. - -To preserve the order of keys when serializing and deserializing BSON, -use the `SON <{+api-root+}bson/son.html>`__ class. You must also configure your collection -to use SON for serialization and deserialization by specifying ``document_class=SON`` -to the ``with_options()`` method of a collection. - -The following example retrieves a document -that has a ``location`` field value of ``{"street": "Cafe St", "zipcode": "10003"}`` from -the ``restaurants`` collection: - -.. code-block:: python - - from bson import CodecOptions, SON - - opts = CodecOptions(document_class=SON) - collection = db.get_collection("restaurants") - son_collection = collection.with_options(codec_options=opts) - doc = son_collection.find_one({"location": SON([("street", "Cafe St"), ("zipcode", "10003")])}) - -For more information about subdocument matching, see the -:manual:`Query on Embedded/Nested Documents ` -guide in the {+mdb-server+} documentation. \ No newline at end of file diff --git a/source/troubleshooting.txt b/source/troubleshooting.txt index 495e2708..c20b02c6 100644 --- a/source/troubleshooting.txt +++ b/source/troubleshooting.txt @@ -110,149 +110,6 @@ frameworks. if __name__ == "__main__": app.run() -Query Works in the Shell But Not in {+driver-short+} -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -After the ``_id`` field, which is always first, the key-value pairs in a BSON document can -be in any order. The ``mongo`` shell preserves key order when reading and writing -data, as shown by the fields "b" and "a" in the following code example: - -.. 
code-block:: javascript - - // mongo shell - db.collection.insertOne( { "_id" : 1, "subdocument" : { "b" : 1, "a" : 1 } } ) - // Returns: WriteResult({ "nInserted" : 1 }) - - db.collection.findOne() - // Returns: { "_id" : 1, "subdocument" : { "b" : 1, "a" : 1 } } - -{+driver-short+} represents BSON documents as Python dictionaries by default, -and the order of keys in dictionaries is not defined. In Python, a dictionary declared with -the "a" key first is the same as one with the "b" key first. In the following example, -the keys are displayed in the same order regardless of their order in the ``print`` -statement: - -.. code-block:: python - - print({'a': 1.0, 'b': 1.0}) - # Returns: {'a': 1.0, 'b': 1.0} - - print({'b': 1.0, 'a': 1.0}) - # Returns: {'a': 1.0, 'b': 1.0} - -Similarly, Python dictionaries might not show keys in the order they are -stored in BSON. The following example shows the result of printing the document -inserted in a preceding example: - -.. code-block:: python - - print(collection.find_one()) - # Returns: {'_id': 1.0, 'subdocument': {'a': 1.0, 'b': 1.0}} - -To preserve the order of keys when reading BSON, use the ``SON`` class, -which is a dictionary that remembers its key order. - -The following code example shows how to create a collection -configured to use the ``SON`` class: - -.. code-block:: python - - from bson import CodecOptions, SON - - opts = CodecOptions(document_class=SON) - - CodecOptions(document_class=...SON..., tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None), datetime_conversion=DatetimeConversion.DATETIME) - collection_son = collection.with_options(codec_options=opts) - -When you find the preceding subdocument, the driver represents query results with -``SON`` objects and preserves key order: - -.. io-code-block:: - - .. 
input:: - :language: python - - print(collection_son.find_one()) - - .. output:: - - SON([('_id', 1.0), ('subdocument', SON([('b', 1.0), ('a', 1.0)]))]) - -The subdocument's actual storage layout is now visible: "b" is before "a". - -Because a Python dictionary's key order is not defined, you cannot predict how it will be -serialized to BSON. However, MongoDB considers subdocuments equal only if their -keys have the same order. If you use a Python dictionary to query on a subdocument, it may -not match: - -.. io-code-block:: - - .. input:: - :language: python - - collection.find_one({'subdocument': {'b': 1.0, 'a': 1.0}}) is None - - .. output:: - - True - -Because Python considers the two dictionaries the same, swapping the key order in your query -makes no difference: - -.. io-code-block:: - - .. input:: - :language: python - - collection.find_one({'subdocument': {'b': 1.0, 'a': 1.0}}) is None - - .. output:: - - True - -You can solve this in two ways. First, you can match the subdocument field-by-field: - -.. io-code-block:: - - .. input:: - :language: python - - collection.find_one({'subdocument.a': 1.0, - 'subdocument.b': 1.0}) - - .. output:: - - {'_id': 1.0, 'subdocument': {'a': 1.0, 'b': 1.0}} - -The query matches any subdocument with an "a" of 1.0 and a "b" of 1.0, -regardless of the order in which you specify them in Python, or the order in which they're -stored in BSON. This query also now matches subdocuments with additional -keys besides "a" and "b", whereas the previous query required an exact match. - -The second solution is to use a ``~bson.son.SON`` object to specify the key order: - -.. io-code-block:: - - .. input:: - :language: python - - query = {'subdocument': SON([('b', 1.0), ('a', 1.0)])} - collection.find_one(query) - - .. output:: - - {'_id': 1.0, 'subdocument': {'a': 1.0, 'b': 1.0}} - -The driver preserves the key order you use when you create a ``~bson.son.SON`` -when serializing it to BSON and using it as a query. 
Thus, you can create a -subdocument that exactly matches the subdocument in the collection. - -.. note:: - - For more information about subdocument matching, see the - `Query on Embedded/Nested Documents `__ - guide in the {+mdb-server+} documentation. - Cursors ------- From 654a84bb8c30fe66f262a468029c3707fcc71b19 Mon Sep 17 00:00:00 2001 From: Michael Morisi Date: Mon, 10 Feb 2025 11:48:31 -0500 Subject: [PATCH 7/7] Fix --- source/serialization.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/source/serialization.txt b/source/serialization.txt index d979b24c..99c5c432 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -48,8 +48,7 @@ Custom Classes -------------- To serialize and deserialize custom {+language+} classes, you must implement custom logic -to handle the conversion. {+driver-short+} cannot serialize custom classes otherwise. The -following sections show how to serialize and deserialize +to handle the conversion. The following sections show how to serialize and deserialize custom classes. Serializing Custom Classes
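
The custom-class round trip that ``source/serialization.txt`` describes can be tried without a database connection. The following is a minimal sketch under stated assumptions, not the driver's API: ``serialize_restaurant`` and the stand-in ``stored_doc`` are hypothetical names introduced here for illustration, and the ``_id`` value is made up. It copies the result of ``vars()`` because ``vars()`` returns the instance's own ``__dict__``, and ``insert_one()`` adds an ``_id`` key to the dictionary it receives when that key is missing.

```python
class Restaurant:
    def __init__(self, name, cuisine):
        self.name = name
        self.cuisine = cuisine


def serialize_restaurant(restaurant):
    # Hypothetical helper. vars() returns the instance's own __dict__,
    # so copy it; otherwise a key added to the dictionary later (such as
    # "_id") would also appear as an attribute on the object.
    return dict(vars(restaurant))


def deserialize_restaurant(doc):
    # Read only the fields the class knows about; extra keys such as
    # "_id" are ignored.
    return Restaurant(name=doc["name"], cuisine=doc["cuisine"])


restaurant = Restaurant("Example Cafe", "Coffee")
doc = serialize_restaurant(restaurant)

# Stand-in for a document returned by a query: stored documents carry
# an "_id" field. The value 1 is an assumption for illustration.
stored_doc = {"_id": 1, **doc}

restored = deserialize_restaurant(stored_doc)
print(restored.name, restored.cuisine)
```

With a real collection, ``doc`` is what you would pass to ``collection.insert_one()``, and ``stored_doc`` corresponds to the result of ``collection.find_one()``.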