diff --git a/config/redirects b/config/redirects index c3e8b7e9..c504eff7 100644 --- a/config/redirects +++ b/config/redirects @@ -12,4 +12,5 @@ raw: ${prefix}/master -> ${base}/upcoming/ raw: ${prefix}/get-started/download-and-install/ -> ${base}/current/get-started/download-and-install/ [*-master]: ${prefix}/${version}/security/enterprise-authentication/ -> ${base}/${version}/security/authentication/ -[*-master]: ${prefix}/${version}/connect/connection-pools/ -> ${base}/${version}/connect/connection-options/#connection-pools \ No newline at end of file +[*-master]: ${prefix}/${version}/faq/ -> ${base}/${version}/ +[*-master]: ${prefix}/${version}/connect/connection-pools/ -> ${base}/${version}/connect/connection-options/#connection-pools diff --git a/source/compatibility.txt b/source/compatibility.txt index 72efaab7..91d64e0c 100644 --- a/source/compatibility.txt +++ b/source/compatibility.txt @@ -48,7 +48,4 @@ The following compatibility table specifies the recommended version of {+driver-short+} for use with a specific version of Python. The first column lists the driver version. -.. include:: /includes/language-compatibility-table-pymongo.rst - -For more information on how to read the compatibility tables, see our guide on -:ref:`MongoDB Compatibility Tables. ` +.. include:: /includes/language-compatibility-table-pymongo.rst \ No newline at end of file diff --git a/source/connect/mongoclient.txt b/source/connect/mongoclient.txt index b303329c..683afc1d 100644 --- a/source/connect/mongoclient.txt +++ b/source/connect/mongoclient.txt @@ -168,13 +168,60 @@ constructor accepts. All parameters are optional. **Data type:** `TypeRegistry <{+api-root+}bson/codec_options.html#bson.codec_options.TypeRegistry>`__ -.. 
tip:: Reusing Your Client +Concurrent Execution +-------------------- - Because each ``MongoClient`` object represents a pool of connections to the - database, most applications require only a single instance of - ``MongoClient``, even across multiple requests. However, if you fork - a process, the child process *does* need its own ``MongoClient`` object. - To learn more, see the :ref:`FAQ ` page. +The following sections describe {+driver-short+}'s support for concurrent execution +mechanisms. + +Multithreading +~~~~~~~~~~~~~~ + +{+driver-short+} is thread-safe and provides built-in connection pooling +for threaded applications. +Because each ``MongoClient`` object represents a pool of connections to the +database, most applications require only a single instance of +``MongoClient``, even across multiple requests. + +.. _pymongo-forks: + +Multiple Forks +~~~~~~~~~~~~~~~ + +{+driver-short+} supports calling the ``fork()`` method to create a new process. +However, if you fork a process, you must create a new ``MongoClient`` instance in the +child process. + +.. important:: Don't Pass a MongoClient to a Child Process + + If you use the ``fork()`` method to create a new process, don't pass an instance + of the ``MongoClient`` class from the parent process to the child process. This creates + a high probability of deadlock among ``MongoClient`` instances in the child process. + {+driver-short+} tries to issue a warning if this deadlock might occur. + +Multiprocessing +~~~~~~~~~~~~~~~ + +{+driver-short+} supports the Python ``multiprocessing`` module. +However, on Unix systems, the multiprocessing module spawns processes by using +the ``fork()`` method. This carries the same risks described in :ref:`` + +To use multiprocessing with {+driver-short+}, write code similar to the following example: + +.. code-block:: python + + # Each process creates its own instance of MongoClient. + def func(): + db = pymongo.MongoClient().mydb + # Do something with db. 
+ + proc = multiprocessing.Process(target=func) + proc.start() + +.. important:: + + Do not copy an instance of the ``MongoClient`` class from the parent process to a child + process. Type Hints ---------- diff --git a/source/data-formats/extended-json.txt b/source/data-formats/extended-json.txt index 7272b6f1..5a3ad056 100644 --- a/source/data-formats/extended-json.txt +++ b/source/data-formats/extended-json.txt @@ -178,6 +178,57 @@ list of dictionaries by using the ``loads()`` method: {'bin': Binary(b'\x01\x02\x03\x04', 128)} ] +.. _pymongo-extended-json-binary-values: + +Reading Binary Values in Python 2 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In Python 3, the driver decodes JSON binary values with subtype 0 to instances of the +``bytes`` class. In Python 2, the driver decodes these values to instances of the ``Binary`` +class with subtype 0. + +The following code examples show how {+driver-short+} decodes JSON binary instances with +subtype 0. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the +corresponding code. + +.. tabs:: + + .. tab:: Python 2 + :tabid: python2 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from bson.json_util import loads + + doc = loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') + print(doc) + + .. output:: + + {u'b': Binary('this is a byte string', 0)} + + .. tab:: Python 3 + :tabid: python3 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from bson.json_util import loads + + doc = loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') + print(doc) + + .. output:: + + {'b': b'this is a byte string'} + Write Extended JSON ------------------- @@ -273,10 +324,30 @@ The following example shows how to output Extended JSON in the Canonical format: Additional Information ---------------------- +The resources in the following sections provide more information about working +with Extended JSON. 
+ +API Documentation +~~~~~~~~~~~~~~~~~ + For more information about the methods and types in ``bson.json_util``, see the following API documentation: - `loads() <{+api-root+}bson/json_util.html#bson.json_util.loads>`__ - `dumps() <{+api-root+}bson/json_util.html#bson.json_util.dumps>`__ - `CANONICAL_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.CANONICAL_JSON_OPTIONS>`__ -- `LEGACY_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.LEGACY_JSON_OPTIONS>`__ \ No newline at end of file +- `LEGACY_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.LEGACY_JSON_OPTIONS>`__ + +Other Packages +~~~~~~~~~~~~~~ + +`python-bsonjs `__ is another package, +built on top of `libbson `__, +that can convert BSON to Extended JSON. The ``python-bsonjs`` package doesn't +depend on {+driver-short+} and might offer a performance improvement over +``json_util`` in certain cases. + +.. tip:: Use the RawBSONDocument Type + + ``python-bsonjs`` works best with {+driver-short+} when converting from the + ``RawBSONDocument`` type. \ No newline at end of file diff --git a/source/faq.txt b/source/faq.txt index 49edb8ae..e69de29b 100644 --- a/source/faq.txt +++ b/source/faq.txt @@ -1,396 +0,0 @@ -.. _pymongo-faq: - -Frequently Asked Questions -========================== - -.. contents:: On this page - :local: - :backlinks: none - :depth: 1 - :class: singlecol - -.. facet:: - :name: genre - :values: reference - -.. meta:: - :keywords: errors, problems, help, troubleshoot - -Is {+driver-short+} Thread-Safe? ------------------------ - -Yes. {+driver-short+} is thread-safe and provides built-in connection pooling -for threaded applications. - -.. _pymongo-fork-safe: - -Is {+driver-short+} Fork-Safe? ---------------------- - -No. If you use the ``fork()`` method to create a new process, don't pass an instance -of the ``MongoClient`` class from the parent process to the child process. 
This creates -a high probability of deadlock among ``MongoClient`` instances in the child process. -Instead, create a new ``MongoClient`` instance in the child process. - -.. note:: - - {+driver-short+} tries to issue a warning if this deadlock might occur. - -Can I Use {+driver-short+} with Multiprocessing? ---------------------------------------- - -Yes. However, on Unix systems, the multiprocessing module spawns processes by using -the ``fork()`` method. This carries the same risks described in :ref:`` - -To use multiprocessing with {+driver-short+}, write code similar to the following example: - -.. code-block:: python - - # Each process creates its own instance of MongoClient. - def func(): - db = pymongo.MongoClient().mydb - # Do something with db. - - proc = multiprocessing.Process(target=func) - proc.start() - -.. important:: - - Do not copy an instance of the ``MongoClient`` class from the parent process to a child - process. - -Can {+driver-short+} Load the Results of a Query as a Pandas DataFrame? ------------------------------------------------------------------------ - -You can use the `PyMongoArrow `__ -library to work with numerical or columnar data. PyMongoArrow lets you -load MongoDB query result-sets as -`Pandas DataFrames `__, -`NumPy ndarrays `__, or -`Apache Arrow Tables `__. - -How Does Connection Pooling Work in {+driver-short+}? --------------------------------------------- - -Every ``MongoClient`` instance has a built-in connection pool for each server -in your MongoDB topology. Connection pools open sockets on demand to -support concurrent requests to MongoDB in your application. - -The maximum size of each connection pool is set by the ``maxPoolSize`` option, which -defaults to ``100``. If the number of in-use connections to a server reaches -the value of ``maxPoolSize``, the next request to that server will wait -until a connection becomes available. 
- -In addition to the sockets needed to support your application's requests, -each ``MongoClient`` instance opens two more sockets per server -in your MongoDB topology for monitoring the server's state. -For example, a client connected to a three-node replica set opens six -monitoring sockets. If the application uses the default setting for -``maxPoolSize`` and only queries the primary (default) node, then -there can be at most ``106`` total connections in the connection pool. If the -application uses a :ref:`read preference ` to query the -secondary nodes, those connection pools grow and there can be -``306`` total connections. - -To support high numbers of concurrent MongoDB requests -within one process, you can increase ``maxPoolSize``. - -Connection pools are rate-limited. The ``maxConnecting`` option -determines the number of connections that the pool can create in -parallel at any time. For example, if the value of ``maxConnecting`` is -``2``, the third request that attempts to concurrently check out a -connection succeeds only when one the following cases occurs: - -- The connection pool finishes creating a connection and there are fewer - than ``maxPoolSize`` connections in the pool. -- An existing connection is checked back into the pool. -- The driver's ability to reuse existing connections improves due to - rate-limits on connection creation. - -You can set the minimum number of concurrent connections to -each server with the ``minPoolSize`` option, which defaults to ``0``. -The driver initializes the connection pool with this number of sockets. If -sockets are closed, causing the total number -of sockets (both in use and idle) to drop below the minimum, more -sockets are opened until the minimum is reached. - -You can set the maximum number of milliseconds that a connection can -remain idle in the pool by setting the ``maxIdleTimeMS`` option. -Once a connection has been idle for ``maxIdleTimeMS``, the connection -pool removes and replaces it. 
This option defaults to ``0`` (no limit). - -The following default configuration for a ``MongoClient`` works for most -applications: - -.. code-block:: python - - client = MongoClient(host, port) - -``MongoClient`` supports multiple concurrent requests. For each process, -create a client and reuse it for all operations in a process. This -practice is more efficient than creating a client for each request. - -The driver does not limit the number of requests that -can wait for sockets to become available, and it is the application's -responsibility to limit the size of its pool to bound queuing -during a load spike. Requests wait for the amount of time specified in -the ``waitQueueTimeoutMS`` option, which defaults to ``0`` (no limit). - -A request that waits more than the length of time defined by -``waitQueueTimeoutMS`` for a socket raises a ``ConnectionFailure`` error. Use this -option if it is more important to bound the duration of operations -during a load spike than it is to complete every operation. - -When ``MongoClient.close()`` is called by any request, the driver -closes all idle sockets and closes all sockets that are in -use as they are returned to the pool. Calling ``MongoClient.close()`` -closes only inactive sockets, so you cannot interrupt or terminate -any ongoing operations by using this method. The driver closes these -sockets only when the process completes. - -For more information, see the :manual:`Connection Pool Overview ` -in the {+mdb-server+} documentation. - -Why Does {+driver-short+} Add an _id Field to All My Documents? ------------------------------------------------------- - -When you use the ``Collection.insert_one()`` method, -``Collection.insert_many()`` method, or -``Collection.bulk_write()`` method to insert a document into MongoDB, -and that document does not -include an ``_id`` field, {+driver-short+} automatically adds this field for you. -It also sets the value of the field to an instance of ``ObjectId``. 
- -The following code example inserts a document without an ``_id`` field into MongoDB, then -prints the document. After it's inserted, the document contains an ``_id`` field whose -value is an instance of ``ObjectId``. - -.. code-block:: python - - >>> my_doc = {'x': 1} - >>> collection.insert_one(my_doc) - InsertOneResult(ObjectId('560db337fba522189f171720'), acknowledged=True) - >>> my_doc - {'x': 1, '_id': ObjectId('560db337fba522189f171720')} - -{+driver-short+} adds an ``_id`` field in this manner for a few reasons: - -- All MongoDB documents must have an ``_id`` field. -- If {+driver-short+} inserts a document without an ``_id`` field, MongoDB adds one - itself, but doesn't report the value back to {+driver-short+} for your application - to use. -- Copying the document before adding the ``_id`` field is - prohibitively expensive for most high-write-volume applications. - -.. tip:: - - If you don't want {+driver-short+} to add an ``_id`` to your documents, insert only - documents that your application has already added an ``_id`` field to. - -How Do I Change the Timeout Value for Cursors? ----------------------------------------------- - -MongoDB doesn't support custom timeouts for cursors, but you can turn off cursor -timeouts. To do so, pass the ``no_cursor_timeout=True`` option to -the ``find()`` method. - -How Can I Store ``Decimal`` Instances? --------------------------------------- - -MongoDB v3.4 introduced the ``Decimal128`` BSON type, a 128-bit decimal-based -floating-point value capable of emulating decimal rounding with exact precision. -{+driver-short+} versions 3.4 and later also support this type. -Earlier MongoDB versions, however, support only IEEE 754 floating points, equivalent to the -Python ``float`` type. {+driver-short+} can store ``Decimal`` instances to -these versions of MongoDB only by converting them to the ``float`` type. -You must perform this conversion explicitly. 
- -For more information, see the {+driver-short+} API documentation for -`decimal128. `__ - -Why Does {+driver-short+} Convert ``9.99`` to ``9.9900000000000002``? ---------------------------------------------------------------------- - -MongoDB represents ``9.99`` as an IEEE floating-point value, which can't -represent the value precisely. This is also true in some versions of -Python. In this regard, {+driver-short+} behaves the same way as -the JavaScript shell, all other MongoDB drivers, and the Python language itself. - -Does {+driver-short+} Support Attribute-style Access for Documents? ----------------------------------------------------------- - -No. {+driver-short+} doesn't implement this feature, for the following reasons: - -1. Adding attributes pollutes the attribute namespace for documents and could - lead to subtle bugs or confusing errors when using a key with the - same name as a dictionary method. - -#. {+driver-short+} uses SON objects instead of regular - dictionaries only to maintain key ordering, because the server - requires this for certain operations. Adding this feature would - complicate the ``SON`` class and could break backwards compatibility - if {+driver-short+} ever reverts to using dictionaries. - -#. Documents behave just like dictionaries, which makes them relatively simple - for new {+driver-short+} users to understand. Changing the behavior of documents - adds a barrier to entry for these users. - -For more information, see the relevant -`Jira case. `__ - -Does {+driver-short+} Support Asynchronous Frameworks? ---------------------------------------------- - -Yes. For more information, see the :ref:`` guide. - -Does {+driver-short+} Work with mod_wsgi? --------------------------------- - -Yes. See :ref:`pymongo-mod_wsgi` in the Tools guide. - -Does {+driver-short+} Work with PythonAnywhere? --------------------------------------- - -No. {+driver-short+} creates Python threads, which -`PythonAnywhere `__ does not support. 
- -For more information, see -the relevant `Jira ticket. `__ - -How Can I Encode My Documents to JSON? --------------------------------------- - -{+driver-short+} supports some special types, like ``ObjectId`` -and ``DBRef``, that aren't supported in JSON. Therefore, Python's ``json`` module won't -work with all documents in {+driver-short+}. Instead, {+driver-short+} includes the -`json_util `__ -module, a tool for using Python's ``json`` module with BSON documents and -`MongoDB Extended JSON `__. - -`python-bsonjs `__ is another -BSON-to-MongoDB-Extended-JSON converter, built on top of -`libbson `__. python-bsonjs doesn't -depend on {+driver-short+} and might offer a performance improvement over -``json_util`` in certain cases. - -.. tip:: - - python-bsonjs works best with {+driver-short+} when using the ``RawBSONDocument`` - type. - - -Does {+driver-short+} Behave Differently in Python 3? ------------------------------------------------------ - -{+driver-short+} encodes instances of the ``bytes`` class -as BSON type 5 (binary data) with subtype 0. -In Python 2, these instances are decoded to ``Binary`` -with subtype 0. In Python 3, they are decoded back to ``bytes``. - -The following code examples use {+driver-short+} to insert a ``bytes`` instance -into MongoDB, and then find the instance. -In Python 2, the byte string is decoded to ``Binary``. -In Python 3, the byte string is decoded back to ``bytes``. - -.. tabs:: - - .. tab:: Python 2.7 - :tabid: python-2 - - .. code-block:: python - - >>> import pymongo - >>> c = pymongo.MongoClient() - >>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id - ObjectId('4f9086b1fba5222021000000') - >>> c.test.bintest.find_one() - {u'binary': Binary('this is a byte string', 0), u'_id': ObjectId('4f9086b1fba5222021000000')} - - .. tab:: Python 3.7 - :tabid: python-3 - - .. 
code-block:: python - - >>> import pymongo - >>> c = pymongo.MongoClient() - >>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id - ObjectId('4f9086b1fba5222021000000') - >>> c.test.bintest.find_one() - {'binary': b'this is a byte string', '_id': ObjectId('4f9086b1fba5222021000000')} - -Similarly, Python 2 and 3 behave differently when {+driver-short+} parses JSON binary -values with subtype 0. In Python 2, these values are decoded to instances of ``Binary`` -with subtype 0. In Python 3, they're decoded into instances of ``bytes``. - -The following code examples use the ``json_util`` module to decode a JSON binary value -with subtype 0. In Python 2, the byte string is decoded to ``Binary``. -In Python 3, the byte string is decoded back to ``bytes``. - -.. tabs:: - - .. tab:: Python 2.7 - :tabid: python-2 - - .. code-block:: python - - >>> from bson.json_util import loads - >>> loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') - {u'b': Binary('this is a byte string', 0)} - - .. tab:: Python 3.7 - :tabid: python-3 - - .. code-block:: python - - >>> from bson.json_util import loads - >>> loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}') - {'b': b'this is a byte string'} - -Can I Share Pickled ObjectIds Between Python 2 and Python 3? ------------------------------------------------------------- - -If you use Python 2 to pickle an instance of ``ObjectId``, -you can always unpickle it with Python 3. To do so, you must pass -the ``encoding='latin-1'`` option to the ``pickle.loads()`` method. -The following code example shows how to pickle an ``ObjectId`` in Python 2.7, and then -unpickle it in Python 3.7: - -.. code-block:: python - :emphasize-lines: 12 - - # Python 2.7 - >>> import pickle - >>> from bson.objectid import ObjectId - >>> oid = ObjectId() - >>> oid - ObjectId('4f919ba2fba5225b84000000') - >>> pickle.dumps(oid) - 'ccopy_reg\n_reconstructor\np0\n(cbson.objectid\...' 
- - # Python 3.7 - >>> import pickle - >>> pickle.loads(b'ccopy_reg\n_reconstructor\np0\n(cbson.objectid\...', encoding='latin-1') - ObjectId('4f919ba2fba5225b84000000') - -If you pickled an ``ObjectID`` in Python 2, and want to unpickle it in Python 3, -you must pass the ``protocol`` argument with a value of ``2`` or less to the -``pickle.dumps()`` method. -The following code example shows how to pickle an ``ObjectId`` in Python 3.7, and then -unpickle it in Python 2.7: - -.. code-block:: python - :emphasize-lines: 7 - - # Python 3.7 - >>> import pickle - >>> from bson.objectid import ObjectId - >>> oid = ObjectId() - >>> oid - ObjectId('4f96f20c430ee6bd06000000') - >>> pickle.dumps(oid, protocol=2) - b'\x80\x02cbson.objectid\nObjectId\nq\x00)\x81q\x01c_codecs\nencode\...' - - # Python 2.7 - >>> import pickle - >>> pickle.loads('\x80\x02cbson.objectid\nObjectId\nq\x00)\x81q\x01c_codecs\nencode\...') - ObjectId('4f96f20c430ee6bd06000000') diff --git a/source/includes/language-compatibility-table-pymongo.rst b/source/includes/language-compatibility-table-pymongo.rst index 6ae014a9..55347b1b 100644 --- a/source/includes/language-compatibility-table-pymongo.rst +++ b/source/includes/language-compatibility-table-pymongo.rst @@ -193,8 +193,98 @@ Python 3 :ref:`TLS ` section of the Troubleshooting guide. .. [#three-six-compat] Pymongo 4.1 requires Python 3.6.2 or later. +For more information about how to read the compatibility tables, see +:ref:`MongoDB Compatibility Tables. ` + Python 2 ~~~~~~~~ -{+driver-short+} versions 3.7 through 3.12 are compatible with Python 2.7 and PyPy, a Python 2.7- -compatible alternative interpreter. +{+driver-short+} versions 3.7 through 3.12 are compatible with Python 2.7 and PyPy, a +Python 2.7-compatible alternative interpreter. However, in some cases, {+driver-short+} +applications behave differently when running in a Python 2 environment. 
+ +The following sections describe the differences in behavior between Python 2 and Python 3 +when using {+driver-short+}. + +Binary Data +``````````` + +In all versions of Python, {+driver-short+} encodes instances of the +`bytes `__ class +as binary data with subtype 0, the default subtype for binary data. In Python 3, +{+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2, +the driver decodes them to instances of the +`Binary `__ +class with subtype 0. For code examples that show the differences, see the +:ref:`Extended JSON ` page. + +The driver behaves the same way when decoding JSON binary values with subtype 0. In +Python 3, it decodes these values to instances of the ``bytes`` class. In Python 2, +the driver decodes them to instances of the ``Binary`` class with subtype 0. For code +examples that show the differences, see the +:ref:`Extended JSON ` page. + +Pickled ObjectIds +````````````````` + +If you pickled an ``ObjectId`` in Python 2 and want to unpickle it in Python 3, you must +pass ``encoding='latin-1'`` as an argument to the ``pickle.loads()`` method. + +The following example shows how to use Python 3 to unpickle an ``ObjectId`` that was +pickled in Python 2: + +.. code-block:: python + :emphasize-lines: 2 + + import pickle + pickle.loads(b'', encoding='latin-1') + +If a Python 3 application uses a compatible serialization protocol to pickle an ``ObjectId``, +you can use Python 2 to unpickle it. To specify a compatible protocol in Python 3, pass +a value of 0, 1, or 2 for the ``protocol`` parameter of the ``pickle.dumps()`` method. + +The following example pickles an ``ObjectId`` in Python 3, then prints the ``ObjectId`` +and resulting ``bytes`` instance: + +.. io-code-block:: + :copyable: true + + .. 
input:: + :language: python + + import pickle + from bson.objectid import ObjectId + + oid = ObjectId() + oid_bytes = pickle.dumps(oid, protocol=2) + print("ObjectId: {}".format(oid)) + print("ObjectId bytes: {}".format(oid_bytes)) + + .. output:: + :language: shell + + ObjectId: 67af9b1fae9260c0e97eb9eb + ObjectId bytes: b'\x80\x02cbson.objectid\nObjectId\nq\x00... + +The following example unpickles the ``ObjectId`` from the previous example, and then +prints the ``bytes`` and ``ObjectId`` instances: + +.. io-code-block:: + :copyable: true + + .. input:: + :language: python + + import pickle + from bson.objectid import ObjectId + + oid_bytes = b'\x80\x02cbson.objectid\nObjectId\nq\x00...' + oid = pickle.loads(oid_bytes) + print("ObjectId bytes: {}".format(oid_bytes)) + print("ObjectId: {}".format(oid)) + + .. output:: + :language: shell + + ObjectId bytes: b'\x80\x02cbson.objectid\nObjectId\nq\x00)... + ObjectId: 67af9b1fae9260c0e97eb9eb \ No newline at end of file diff --git a/source/includes/write/unique-id-note.rst b/source/includes/write/unique-id-note.rst new file mode 100644 index 00000000..d9238070 --- /dev/null +++ b/source/includes/write/unique-id-note.rst @@ -0,0 +1,12 @@ +.. note:: _id Field Must Be Unique + + In a MongoDB collection, each document must contain an ``_id`` field + with a unique value. + + If you specify a value for the ``_id`` field, you must ensure that the + value is unique across the collection. If you don't specify a value, + the driver automatically generates a unique ``ObjectId`` value for the field. + + We recommend letting the driver automatically generate ``_id`` values to + ensure uniqueness. Duplicate ``_id`` values violate unique index constraints, which + causes the driver to return an error. 
\ No newline at end of file diff --git a/source/index.txt b/source/index.txt index e9f192d1..86faa57e 100644 --- a/source/index.txt +++ b/source/index.txt @@ -123,12 +123,6 @@ Third-Party Tools For a list of popular third-party Python libraries for working with MongoDB, see the :ref:`pymongo-tools` section. -Frequently Asked questions --------------------------- - -For answers to commonly asked questions about {+driver-short+}, see the -:ref:`pymongo-faq` section. - Troubleshooting --------------- diff --git a/source/read/retrieve.txt b/source/read/retrieve.txt index ea3434b5..9aa955a1 100644 --- a/source/read/retrieve.txt +++ b/source/read/retrieve.txt @@ -81,7 +81,7 @@ the ``"cuisine"`` field has the value ``"Bakery"``: :manual:`natural order ` on disk if no sort criteria is specified. -To learn more about sorting, see the :ref:`sort guide `. + To learn more about sorting, see the :ref:`sort guide `. .. _pymongo-retrieve-find-multiple: @@ -204,6 +204,13 @@ to the ``find()`` method: Additional Information ---------------------- +The PyMongoArrow library lets you load MongoDB query result-sets as +`Pandas DataFrames `__, +`NumPy ndarrays `__, or +`Apache Arrow Tables `__. +To learn more about PyMongoArrow, see the +`PyMongoArrow documentation `__. + To learn more about query filters, see :ref:`pymongo-specify-query`. For runnable code examples of retrieving documents with {+driver-short+}, see diff --git a/source/serialization.txt b/source/serialization.txt index 99c5c432..c71c0035 100644 --- a/source/serialization.txt +++ b/source/serialization.txt @@ -90,3 +90,62 @@ it back into a ``Restaurant`` object from the preceding example: To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve` guide. + +.. _pymongo-serialization-binary-data: + +Binary Data +----------- + +In all versions of Python, {+driver-short+} encodes instances of the +`bytes `__ class +as binary data with subtype 0, the default subtype for binary data. 
In Python 3, +{+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2, +the driver decodes them to instances of the +`Binary `__ +class with subtype 0. + +The following code examples show how {+driver-short+} decodes instances of the ``bytes`` +class. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the corresponding +code. + +.. tabs:: + + .. tab:: Python 2 + :tabid: python2 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from pymongo import MongoClient + + client = MongoClient() + client.test.test.insert_one({'binary': b'this is a byte string'}) + doc = client.test.test.find_one() + print(doc) + + .. output:: + + {u'_id': ObjectId('67afb78298f604a28f0247b4'), u'binary': Binary('this is a byte string', 0)} + + .. tab:: Python 3 + :tabid: python3 + + .. io-code-block:: + :copyable: true + + .. input:: + :language: python + + from pymongo import MongoClient + + client = MongoClient() + client.test.test.insert_one({'binary': b'this is a byte string'}) + doc = client.test.test.find_one() + print(doc) + + .. output:: + + {'_id': ObjectId('67afb78298f604a28f0247b4'), 'binary': b'this is a byte string'} \ No newline at end of file diff --git a/source/tools.txt b/source/tools.txt index 7eee7638..bf70301a 100644 --- a/source/tools.txt +++ b/source/tools.txt @@ -274,3 +274,11 @@ This section lists alternatives to {+driver-short+}. - `MongoMock `__ is a small library to help test Python code. It uses {+driver-short+} to interact with MongoDB. +.. note:: {+driver-short+} is Incompatible with PythonAnywhere + + {+driver-short+} creates Python threads, which + `PythonAnywhere `__ does not support. + + For more information, see + the relevant `Jira ticket. 
`__ + diff --git a/source/write/bulk-write.txt b/source/write/bulk-write.txt index 3a8d5b81..7978b4e0 100644 --- a/source/write/bulk-write.txt +++ b/source/write/bulk-write.txt @@ -67,11 +67,7 @@ The following example creates an instance of ``InsertOne``: To insert multiple documents, create an instance of ``InsertOne`` for each document. -.. note:: - - Duplicate ``_id`` values violate unique index constraints, which causes the - driver to return a ``DuplicateKeyError``. To avoid this error, ensure that - each document you insert has a unique ``_id`` value. +.. include:: /includes/write/unique-id-note.rst Update Operations ~~~~~~~~~~~~~~~~~ diff --git a/source/write/insert.txt b/source/write/insert.txt index 738604d5..19ed2b42 100644 --- a/source/write/insert.txt +++ b/source/write/insert.txt @@ -27,6 +27,8 @@ An insert operation inserts one or more documents into a MongoDB collection. You can perform an insert operation by using the ``insert_one()`` or ``insert_many()`` method. +.. include:: /includes/write/unique-id-note.rst + .. .. tip:: Interactive Lab .. This page includes a short interactive lab that demonstrates how to @@ -45,36 +47,6 @@ from the :atlas:`Atlas sample datasets `. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the :ref:`` tutorial. -The ``_id`` Field ------------------ - -In a MongoDB collection, each document *must* contain an ``_id`` field -with a unique field value. - -MongoDB allows you to manage this field in two ways: - -- You can set this field for each document yourself, ensuring each - ``_id`` field value is unique. -- You can let the driver automatically generate unique ``ObjectId`` - values for each document ``_id``. If you do not manually set an - ``_id`` value for a document, the driver populates the field - with an ``ObjectId``. - -Unless you can guarantee uniqueness, we recommend -letting the driver automatically generate ``_id`` values. - -.. 
note:: - - Duplicate ``_id`` values violate unique index constraints, which - causes the driver to return a ``WriteError`` from - ``insert_one()`` or a ``BulkWriteError`` from ``insert_many()``. - -To learn more about the ``_id`` field, see the -:manual:`Unique Indexes ` guide in the {+mdb-server+} manual. - -To learn more about document structure and rules, see the -:manual:`Documents ` guide in the {+mdb-server+} manual. - Insert One Document -------------------