@@ -7,11 +7,11 @@ codec, which is used to populate a :class:`~bson.codec_options.TypeRegistry`.
7
7
The type registry can then be used to create a custom-type-aware
8
8
:class:`~pymongo.collection.Collection`. Read and write operations
9
9
issued against the resulting collection object transparently manipulate
10
- documents as they are saved or retrieved from MongoDB.
10
+ documents as they are saved to or retrieved from MongoDB.
11
11
12
12
13
- Setup
14
- -----
13
+ Setting Up
14
+ ----------
15
15
16
16
We'll start by getting a clean database to use for the example:
17
17
@@ -26,10 +26,10 @@ We'll start by getting a clean database to use for the example:
26
26
Since the purpose of the example is to demonstrate working with custom types,
27
27
we'll need a custom data type to use. For this example, we will be working with
28
28
the :py:class:`~decimal.Decimal` type from Python's standard library. Since the
29
- BSON library has a :class: `~bson.decimal128.Decimal128 ` type (that implements
30
- the IEEE 754 decimal128 decimal-based floating-point numbering format) which
31
- is distinct from Python's built-in :py:class: `~decimal.Decimal ` type, when we
32
- try to save an instance of ``Decimal `` with PyMongo, we get an
29
+ BSON library's :class:`~bson.decimal128.Decimal128` type (which implements
30
+ the IEEE 754 decimal128 decimal-based floating-point numbering format) is
31
+ distinct from Python's built-in :py:class:`~decimal.Decimal` type, attempting
32
+ to save an instance of ``Decimal`` with PyMongo results in an
33
33
:exc:`~bson.errors.InvalidDocument` exception.
34
34
35
35
.. doctest::
@@ -44,13 +44,13 @@ try to save an instance of ``Decimal`` with PyMongo, we get an
44
44
45
45
.. _custom-type-type-codec:
46
46
47
- The Type Codec
48
- --------------
47
+ The :class:`~bson.codec_options.TypeCodec` Class
48
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
49
49
50
50
.. versionadded:: 3.8
51
51
52
- In order to encode custom types , we must first define a **type codec ** for our
53
- type. A type codec describes how an instance of a custom type can be
52
+ In order to encode a custom type, we must first define a **type codec** for
53
+ that type. A type codec describes how an instance of a custom type can be
54
54
*transformed* to and/or from one of the types :mod:`~bson` already understands.
55
55
Depending on the desired functionality, users must choose from the following
56
56
base classes when defining type codecs:
@@ -62,7 +62,7 @@ base classes when defining type codecs:
62
62
decodes a specified BSON type into a custom Python type. Users must implement
63
63
the ``bson_type`` property/attribute and the ``transform_bson`` method.
64
64
* :class:`~bson.codec_options.TypeCodec`: subclass this to define a codec that
65
- can both encode from and decode to a custom type. Users must implement the
65
+ can both encode and decode a custom type. Users must implement the
66
66
``python_type`` and ``bson_type`` properties/attributes, as well as the
67
67
``transform_python`` and ``transform_bson`` methods.
68
68
@@ -93,14 +93,14 @@ interested in both encoding and decoding our custom type, we use the
93
93
94
94
.. _custom-type-type-registry:
95
95
96
- The Type Registry
97
- -----------------
96
+ The :class:`~bson.codec_options.TypeRegistry` Class
97
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
98
98
99
99
.. versionadded:: 3.8
100
100
101
101
Before we can begin encoding and decoding our custom type objects, we must
102
- first inform PyMongo about our type codec. This is done by creating a
103
- :class: `~bson.codec_options.TypeRegistry ` instance:
102
+ first inform PyMongo about the corresponding codec. This is done by creating
103
+ a :class:`~bson.codec_options.TypeRegistry` instance:
104
104
105
105
.. doctest::
106
106
@@ -113,7 +113,7 @@ Once instantiated, registries are immutable and the only way to add codecs
113
113
to a registry is to create a new one.
114
114
115
115
116
- Putting it together
116
+ Putting It Together
117
117
-------------------
118
118
119
119
Finally, we can define a :class:`~bson.codec_options.CodecOptions` instance
@@ -201,35 +201,79 @@ This is trivial to do since the same transformation as the one used for
201
201
information, it is impossible to discern which incoming
202
202
:class:`~bson.decimal128.Decimal128` value needs to be decoded as ``Decimal``
203
203
and which needs to be decoded as ``DecimalInt``. This example only considers
204
- the situation where a user wants to *encode* documents containing one or both
204
+ the situation where a user wants to *encode* documents containing either
205
205
of these types.
206
206
207
- Now, we can create a new codec options object and use it to get a collection
208
- object:
207
+ After creating a new codec options object and using it to get a collection
208
+ object, we can seamlessly encode instances of ``DecimalInt``:
209
209
210
210
.. doctest::
211
211
212
212
>>> type_registry = TypeRegistry([decimal_codec, decimalint_codec])
213
213
>>> codec_options = CodecOptions(type_registry=type_registry)
214
214
>>> collection = db.get_collection('test', codec_options=codec_options)
215
215
>>> collection.drop()
216
-
217
-
218
- We can now seamlessly encode instances of ``DecimalInt ``. Note that the
219
- ``transform_bson `` method of the base codec class results in these values
220
- being decoded as ``Decimal `` (and not ``DecimalInt ``):
221
-
222
- .. doctest ::
223
-
224
216
>>> collection.insert_one({'num': DecimalInt("45.321")})
225
217
<pymongo.results.InsertOneResult object at ...>
226
218
>>> mydoc = collection.find_one()
227
219
>>> pprint.pprint(mydoc)
228
220
{u'_id': ObjectId('...'), u'num': Decimal('45.321')}
229
221
222
+ Note that the ``transform_bson`` method of the base codec class results in
223
+ these values being decoded as ``Decimal`` (and not ``DecimalInt``).
224
+
225
+
226
+ .. _decoding-binary-types:
227
+
228
+ Decoding :class:`~bson.binary.Binary` Types
229
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
230
+
231
+ The decoding treatment of :class:`~bson.binary.Binary` types having
232
+ ``subtype = 0`` by the :mod:`bson` module varies slightly depending on the
233
+ version of the Python runtime in use. This must be taken into account while
234
+ writing a ``TypeDecoder`` that modifies how this datatype is decoded.
235
+
236
+ On Python 3.x, :class:`~bson.binary.Binary` data (``subtype = 0``) is decoded
237
+ as a ``bytes`` instance:
238
+
239
+ .. code-block:: python
240
+
241
+   >>> # On Python 3.x.
242
+   >>> from bson.binary import Binary
243
+   >>> newcoll = db.get_collection('new')
244
+   >>> newcoll.insert_one({'_id': 1, 'data': Binary(b"123", subtype=0)})
245
+   >>> doc = newcoll.find_one()
246
+   >>> type(doc['data'])
247
+   <class 'bytes'>
248
+
249
+
250
+ On Python 2.7.x, the same data is decoded as a :class:`~bson.binary.Binary`
251
+ instance:
252
+
253
+ .. code-block:: python
254
+
255
+   >>> # On Python 2.7.x
256
+   >>> newcoll = db.get_collection('new')
257
+   >>> doc = newcoll.find_one()
258
+   >>> type(doc['data'])
259
+   <class 'bson.binary.Binary'>
230
260
231
- The Fallback Encoder
232
- --------------------
261
+
262
+ As a consequence of this disparity, users must set the ``bson_type`` attribute
263
+ on their :class:`~bson.codec_options.TypeDecoder` classes differently,
264
+ depending on the Python version in use.
265
+
266
+
267
+ .. note::
268
+
269
+    For codebases requiring compatibility with both Python 2 and 3, type
270
+    decoders will have to be registered for both possible ``bson_type`` values.
271
+
272
+
273
+ .. _fallback-encoder-callable:
274
+
275
+ The ``fallback_encoder`` Callable
276
+ ---------------------------------
233
277
234
278
.. versionadded:: 3.8
235
279
@@ -268,27 +312,110 @@ We can now seamlessly encode instances of :py:class:`~decimal.Decimal`:
268
312
>>> pprint.pprint(mydoc)
269
313
{u'_id': ObjectId('...'), u'num': Decimal128('45.321')}
270
314
271
- As you can tell, fallback encoders are a compelling alternative to type codecs
272
- when we only want to encode custom types due to their much simpler API.
273
- Users should note however, that fallback encoders cannot be used to modify the
274
- encoding of types that PyMongo already understands, as illustrated by the
275
- following example:
276
315
277
- >>> def fallback_encoder (value ):
278
- ... """ Encoder that converts floats to int."""
279
- ... if isinstance (value, float ):
280
- ... return int (value)
281
- ... return value
282
- >>> type_registry = TypeRegistry(fallback_encoder = fallback_encoder)
283
- >>> codec_options = CodecOptions(type_registry = type_registry)
284
- >>> collection = db.get_collection(' test' , codec_options = codec_options)
285
- >>> collection.drop()
286
- >>> collection.insert_one({' num' : 45.321 })
287
- <pymongo.results.InsertOneResult object at ...>
288
- >>> mydoc = collection.find_one()
289
- >>> pprint.pprint(mydoc)
290
- {u'_id': ObjectId('...'), u'num': 45.321}
316
+ .. note::
317
+
318
+    Fallback encoders are invoked *after* attempts to encode the given value
319
+    with standard BSON encoders and any configured type encoders have failed.
320
+    Therefore, in a type registry configured with a type encoder and fallback
321
+    encoder that both target the same custom type, the behavior specified in
322
+    the type encoder will prevail.
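
The ordering described in this note can be pictured with a small pure-Python
model. This is only a sketch of the lookup order, not PyMongo's
implementation: ``NATIVE_TYPES``, ``encode_value`` and the plain dictionary of
type encoders are hypothetical stand-ins, and ``TypeError`` stands in for
whatever error a real encoder would raise.

```python
from decimal import Decimal

# Hypothetical stand-in for the types an encoder handles natively.
NATIVE_TYPES = (type(None), bool, int, float, str, bytes)


def encode_value(value, type_encoders, fallback_encoder=None):
    """Model the order in which encoders are consulted for one value."""
    # 1. Values of a natively supported type need no help.
    if isinstance(value, NATIVE_TYPES):
        return value
    # 2. A type encoder registered for this exact type is tried next.
    transform = type_encoders.get(type(value))
    if transform is not None:
        return transform(value)
    # 3. The fallback encoder is consulted only when everything else failed.
    if fallback_encoder is not None:
        return fallback_encoder(value)
    raise TypeError("cannot encode object: %r" % (value,))


to_string = {Decimal: str}  # hypothetical type encoder: Decimal -> str
to_float = float            # hypothetical fallback encoder: value -> float

# Both target Decimal, so the type encoder prevails.
with_both = encode_value(Decimal("45.321"), to_string, to_float)
# With no type encoder registered, the fallback encoder is used.
fallback_only = encode_value(Decimal("45.321"), {}, to_float)
```

In this model ``with_both`` is a string produced by the type encoder, while
``fallback_only`` is a float produced by the fallback encoder.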
323
+
324
+
325
+ Because fallback encoders don't need to declare the types that they encode
326
+ beforehand, they can be used to support interesting use-cases that cannot be
327
+ serviced by ``TypeEncoder``. One such use-case is described in the next
328
+ section.
329
+
330
+
331
+ Encoding Unknown Types
332
+ ^^^^^^^^^^^^^^^^^^^^^^
333
+
334
+ In this example, we demonstrate how a fallback encoder can be used to save
335
+ arbitrary objects to the database. We will use the standard library's
336
+ :py:mod:`pickle` module to serialize the unknown types and so naturally, this
337
+ approach only works for types that are picklable.
338
+
339
+ We start by defining some arbitrary custom types:
340
+
341
+ .. code-block:: python
342
+
343
+   class MyStringType(object):
344
+       def __init__(self, value):
345
+           self.__value = value
346
+       def __repr__(self):
347
+           return "MyStringType('%s')" % (self.__value,)
348
+
349
+   class MyNumberType(object):
350
+       def __init__(self, value):
351
+           self.__value = value
352
+       def __repr__(self):
353
+           return "MyNumberType(%s)" % (self.__value,)
354
+
355
+ We also define a fallback encoder that pickles whatever objects it receives
356
+ and returns them as :class:`~bson.binary.Binary` instances with a custom
357
+ subtype. The custom subtype, in turn, allows us to write a TypeDecoder that
358
+ identifies pickled artifacts upon retrieval and transparently decodes them
359
+ back into Python objects:
360
+
361
+ .. code-block:: python
362
+
363
+   import pickle
364
+   from bson.binary import Binary, USER_DEFINED_SUBTYPE
365
+   def fallback_pickle_encoder(value):
366
+       return Binary(pickle.dumps(value), USER_DEFINED_SUBTYPE)
367
+
368
+   class PickledBinaryDecoder(TypeDecoder):
369
+       bson_type = Binary
370
+       def transform_bson(self, value):
371
+           if value.subtype == USER_DEFINED_SUBTYPE:
372
+               return pickle.loads(value)
373
+           return value
374
+
375
+
376
+ .. note::
377
+
378
+    The above example is written assuming the use of Python 3. If you are using
379
+    Python 2, ``bson_type`` must be set to ``Binary``. See the
380
+    :ref:`decoding-binary-types` section for a detailed explanation.
381
+
382
+
383
+ Finally, we create a ``CodecOptions`` instance:
384
+
385
+ .. code-block:: python
386
+
387
+   codec_options = CodecOptions(type_registry=TypeRegistry(
388
+       [PickledBinaryDecoder()], fallback_encoder=fallback_pickle_encoder))
389
+
390
+ We can now round trip our custom objects to MongoDB:
391
+
392
+ .. code-block:: python
393
+
394
+   collection = db.get_collection('test_fe', codec_options=codec_options)
395
+   collection.insert_one({'_id': 1, 'str': MyStringType("hello world"),
396
+                          'num': MyNumberType(2)})
397
+   mydoc = collection.find_one()
398
+   assert isinstance(mydoc['str'], MyStringType)
399
+   assert isinstance(mydoc['num'], MyNumberType)
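
The pickling step that this round trip relies on can be tried on its own,
without a MongoDB server. The following is a minimal sketch using only the
standard library: ``to_payload`` and ``from_payload`` are hypothetical helper
names, and built-in ``set`` and ``complex`` values stand in for the custom
classes defined earlier. The bytes it produces correspond to what the fallback
encoder wraps in a ``Binary`` instance.

```python
import pickle


def to_payload(value):
    """Serialize any picklable object to bytes."""
    return pickle.dumps(value)


def from_payload(payload):
    """Rebuild an equal object from pickled bytes."""
    return pickle.loads(payload)


# set and complex are picklable values used here purely for illustration.
original = {"tags": {"a", "b"}, "z": complex(1, 2)}
payload = to_payload(original)
restored = from_payload(payload)
```

Here ``restored`` compares equal to ``original`` but is a new object.
Unpickling can execute arbitrary code, so this approach is only appropriate
for data written by trusted applications.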
400
+
401
+
402
+ Limitations
403
+ -----------
404
+
405
+ PyMongo's type codec and fallback encoder features have the following
406
+ limitations:
291
407
292
- This is due to the fact that fallback encoders are invoked only after
293
- an attempt to encode the value with type codecs and standard BSON encoding
294
- routines has been unsuccessful.
408
+ #. Users cannot customize the encoding behavior of Python types that PyMongo
409
+    already understands, like ``int`` and ``str`` (the 'built-in types').
410
+    Attempting to instantiate a type registry with one or more codecs that act
411
+    upon a built-in type results in a ``TypeError``. This limitation extends
412
+    to all subtypes of the standard types.
413
+ #. Chaining type encoders is not supported. A custom type value, once
414
+    transformed by a codec's ``transform_python`` method, *must* result in a
415
+    type that is either BSON-encodable by default, or can be
416
+    transformed by the fallback encoder into something BSON-encodable. It
417
+    *cannot* be transformed a second time by a different type codec.
418
+ #. The :meth:`~pymongo.database.Database.command` method does not apply the
419
+    user's TypeDecoders while decoding the command response document.
420
+ #. :mod:`gridfs` does not apply custom type encoding or decoding to any
421
+    documents received from or returned to the user.
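
The restriction on chaining type encoders can be illustrated with a toy model.
The sketch below is not PyMongo code: ``Celsius``, ``Kelvin``, ``encode_once``
and the dictionaries of transforms are hypothetical, ``ValueError`` stands in
for the error a real encoder would raise, and the fallback encoder is left out
for brevity. Each value passes through at most one type encoder, so a
transform that returns another custom type fails even when an encoder for that
second type is registered.

```python
class Celsius(object):
    def __init__(self, degrees):
        self.degrees = degrees


class Kelvin(object):
    def __init__(self, kelvins):
        self.kelvins = kelvins


# Hypothetical stand-in for the types that are encodable by default.
NATIVE_TYPES = (int, float, str)


def encode_once(value, type_encoders):
    """Apply at most one type encoder; never chain into a second one."""
    if isinstance(value, NATIVE_TYPES):
        return value
    transform = type_encoders.get(type(value))
    if transform is None:
        raise ValueError("no encoder for %s" % type(value).__name__)
    result = transform(value)
    if not isinstance(result, NATIVE_TYPES):
        # A second type encoder is deliberately not consulted here.
        raise ValueError("transform returned %s" % type(result).__name__)
    return result


# Celsius -> Kelvin -> float would require chaining two encoders.
chained = {Celsius: lambda c: Kelvin(c.degrees + 273.15),
           Kelvin: lambda k: k.kelvins}
# Celsius -> float in a single step is acceptable.
direct = {Celsius: lambda c: c.degrees + 273.15}

ok = encode_once(Celsius(20.0), direct)
try:
    encode_once(Celsius(20.0), chained)
    chain_failed = False
except ValueError:
    chain_failed = True
```

In this model the single-step transform succeeds, while the two-step chain is
rejected after the first transform.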