diff --git a/snooty.toml b/snooty.toml index df700929..5b4584c2 100644 --- a/snooty.toml +++ b/snooty.toml @@ -1,9 +1,10 @@ name = "mongoid" title = "Mongoid" -intersphinx = [ "https://www.mongodb.com/docs/manual/objects.inv", - "https://www.mongodb.com/docs/atlas/objects.inv" - ] +intersphinx = [ + "https://www.mongodb.com/docs/manual/objects.inv", + "https://www.mongodb.com/docs/atlas/objects.inv", +] toc_landing_pages = [ "/quick-start-rails", @@ -24,4 +25,5 @@ quickstart-sinatra-app-name = "my-sinatra-app" quickstart-rails-app-name = "my-rails-app" feedback-widget-title = "Feedback" server-manual = "Server manual" -api = "https://www.mongodb.com/docs/mongoid/master/api" \ No newline at end of file +api-root = "https://www.mongodb.com/docs/mongoid/master/api/Mongoid" +api = "https://www.mongodb.com/docs/mongoid/master/api" diff --git a/source/aggregation.txt b/source/aggregation.txt new file mode 100644 index 00000000..55fc3a16 --- /dev/null +++ b/source/aggregation.txt @@ -0,0 +1,232 @@ +.. _mongoid-aggregation: + +==================================== +Transform Your Data with Aggregation +==================================== + +.. facet:: + :name: genre + :values: reference + +.. meta:: + :keywords: code example, transform, pipeline + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +Overview +-------- + +In this guide, you can learn how to use {+odm+} to perform **aggregation +operations**. + +Aggregation operations process data in your MongoDB collections and return +computed results. The MongoDB Aggregation framework, which is part of the Query +API, is modeled on the concept of data processing pipelines. Documents enter a +pipeline that contains one or more stages, and this pipeline transforms the +documents into an aggregated result. + +Aggregation operations function similarly to car factories with assembly +lines. The assembly lines have stations with specialized tools to +perform specific tasks. 
For example, when building a car, the assembly +line begins with the frame. Then, as the car frame moves through the +assembly line, each station assembles a separate part. The result is a +transformed final product, the finished car. + +The assembly line represents the *aggregation pipeline*, the individual +stations represent the *aggregation stages*, the specialized tools +represent the *expression operators*, and the finished product +represents the *aggregated result*. + +Compare Aggregation and Find Operations +--------------------------------------- + +The following table lists the different tasks you can perform with find +operations, compared to what you can achieve with aggregation +operations. The aggregation framework provides expanded functionality +that allows you to transform and manipulate your data. + +.. list-table:: + :header-rows: 1 + :widths: 50 50 + + * - Find Operations + - Aggregation Operations + + * - | Select *certain* documents to return + | Select *which* fields to return + | Sort the results + | Limit the results + | Count the results + - | Select *certain* documents to return + | Select *which* fields to return + | Sort the results + | Limit the results + | Count the results + | Rename fields + | Compute new fields + | Summarize data + | Connect and merge data sets + +{+odm+} Builders +---------------- + +You can construct an aggregation pipeline by using {+odm+}'s high-level +domain-specific language (DSL). The DSL supports the following aggregation +pipeline operators: + +.. list-table:: + :header-rows: 1 + :widths: 50 50 + + * - Operator + - Method Name + + * - :manual:`$group ` + - ``group()`` + + * - :manual:`$project ` + - ``project()`` + + * - :manual:`$unwind ` + - ``unwind()`` + +To create an aggregation pipeline by using one of the preceding operators, call +the corresponding method on an instance of ``Criteria``. Calling the method adds +the aggregation operation to the ``pipeline`` attribute of the ``Criteria`` +instance. 
To run the aggregation pipeline, pass the ``pipeline`` attribute value +to the ``Collection#aggregate()`` method. + +Example +~~~~~~~ + +Consider a database that contains a collection with documents that are modeled by +the following classes: + +.. code-block:: ruby + + class Tour + include Mongoid::Document + + embeds_many :participants + + field :name, type: String + field :states, type: Array + end + + class Participant + include Mongoid::Document + + embedded_in :tour + + field :name, type: String + end + +In this example, the ``Tour`` model represents the name of a tour and the states +it travels through, and the ``Participant`` model represents the name of a +person participating in the tour. + +The following example creates an aggregation pipeline that outputs the states a +participant has visited by using the following +aggregation operations: + +- ``where()``, which adds a ``$match`` stage that finds documents in which the + ``participants.name`` field value is ``"Serenity"`` +- ``unwind()``, which deconstructs the ``states`` array field and outputs a + document for each element in the array +- ``group()``, which collects the unique ``states`` values into one document +- ``project()``, which prompts the pipeline to return only the ``states`` field + and omit the ``_id`` field + +.. io-code-block:: + + .. input:: /includes/aggregation/builder-dsl.rb + :language: ruby + + .. output:: + + [{"states":["OR","WA","CA"]}] + +Aggregation without Builders +---------------------------- + +You can use the ``Collection#aggregate()`` method to run aggregation operations that do not have +corresponding builder methods by passing in an array of aggregation +operations. Using this method to perform the aggregation returns +raw ``BSON::Document`` objects rather than ``Mongoid::Document`` model +instances. + +Example +~~~~~~~ + +Consider a database that contains a collection with documents that are modeled +by the following classes: + +.. 
code-block:: ruby + + class Band + include Mongoid::Document + has_many :tours + has_many :awards + field :name, type: String + end + + class Tour + include Mongoid::Document + belongs_to :band + field :year, type: Integer + end + + class Award + include Mongoid::Document + belongs_to :band + field :name, type: String + end + +The following example creates an aggregation pipeline to retrieve all bands that +have toured since ``2000`` and have at least ``1`` award: + +.. io-code-block:: + + .. input:: /includes/aggregation/ruby-aggregation.rb + :language: ruby + + .. output:: + + [ + {"_id": "...", "name": "Deftones" }, + {"_id": "...", "name": "Tool"}, + ... + ] + +.. tip:: + + The preceding example projects only the ``_id`` field of the output + documents. It then uses the projected results to find the documents and return + them as ``Mongoid::Document`` model instances. This optional step is not + required to run an aggregation pipeline. + +Additional Information +---------------------- + +To view a full list of aggregation operators, see :manual:`Aggregation +Operators. ` + +To learn about assembling an aggregation pipeline and view examples, see +:manual:`Aggregation Pipeline. ` + +To learn more about creating pipeline stages, see :manual:`Aggregation +Stages. 
` + +API Documentation +~~~~~~~~~~~~~~~~~ + +To learn more about any of the methods discussed in this +guide, see the following API documentation: + +- `group() <{+api-root+}/Criteria/Queryable/Aggregable.html#group-instance_method>`__ +- `project() <{+api-root+}/Criteria/Queryable/Aggregable.html#project-instance_method>`__ +- `unwind() <{+api-root+}/Criteria/Queryable/Aggregable.html#unwind-instance_method>`__ \ No newline at end of file diff --git a/source/includes/aggregation/builder-dsl.rb b/source/includes/aggregation/builder-dsl.rb new file mode 100644 index 00000000..9e884ba7 --- /dev/null +++ b/source/includes/aggregation/builder-dsl.rb @@ -0,0 +1,6 @@ +criteria = Tour.where('participants.name' => 'Serenity'). + unwind(:states). + group(_id: 'states', :states.add_to_set => '$states'). + project(_id: 0, states: 1) + +@states = Tour.collection.aggregate(criteria.pipeline).to_json \ No newline at end of file diff --git a/source/includes/aggregation/ruby-aggregation.rb b/source/includes/aggregation/ruby-aggregation.rb new file mode 100644 index 00000000..91181f90 --- /dev/null +++ b/source/includes/aggregation/ruby-aggregation.rb @@ -0,0 +1,21 @@ +band_ids = Band.collection.aggregate([ + { '$lookup' => { + from: 'tours', + localField: '_id', + foreignField: 'band_id', + as: 'tours', + } }, + { '$lookup' => { + from: 'awards', + localField: '_id', + foreignField: 'band_id', + as: 'awards', + } }, + { '$match' => { + 'tours.year' => {'$gte' => 2000}, + 'awards._id' => {'$exists' => true}, + } }, + {'$project' => {_id: 1}}, +]) + +bands = Band.find(band_ids.to_a) \ No newline at end of file diff --git a/source/reference/aggregation.txt b/source/reference/aggregation.txt deleted file mode 100644 index 5a5c83d9..00000000 --- a/source/reference/aggregation.txt +++ /dev/null @@ -1,203 +0,0 @@ -.. _aggregation-pipeline: - -******************** -Aggregation Pipeline -******************** - -.. default-domain:: mongodb - -.. 
contents:: On this page - :local: - :backlinks: none - :depth: 2 - :class: singlecol - - -Mongoid exposes `MongoDB's aggregation pipeline -`_, -which is used to construct flows of operations that process and return results. -The aggregation pipeline is a superset of the deprecated -:ref:`map/reduce framework ` functionality. - - -Basic Usage -=========== - -.. _aggregation-pipeline-example-multiple-collections: - -Querying Across Multiple Collections -```````````````````````````````````` - -The aggregation pipeline may be used for queries involving multiple -referenced associations at the same time: - -.. code-block:: ruby - - class Band - include Mongoid::Document - has_many :tours - has_many :awards - field :name, type: String - end - - class Tour - include Mongoid::Document - belongs_to :band - field :year, type: Integer - end - - class Award - include Mongoid::Document - belongs_to :band - field :name, type: String - end - -To retrieve bands that toured since 2000 and have at least one award, one -could do the following: - -.. code-block:: ruby - - band_ids = Band.collection.aggregate([ - { '$lookup' => { - from: 'tours', - localField: '_id', - foreignField: 'band_id', - as: 'tours', - } }, - { '$lookup' => { - from: 'awards', - localField: '_id', - foreignField: 'band_id', - as: 'awards', - } }, - { '$match' => { - 'tours.year' => {'$gte' => 2000}, - 'awards._id' => {'$exists' => true}, - } }, - {'$project' => {_id: 1}}, - ]) - bands = Band.find(band_ids.to_a) - -Note that the aggregation pipeline, since it is implemented by the Ruby driver -for MongoDB and not Mongoid, returns raw ``BSON::Document`` objects rather than -``Mongoid::Document`` model instances. The above example projects only -the ``_id`` field which is then used to load full models. An alternative is -to not perform such a projection and work with raw fields, which would eliminate -having to send the list of document ids to Mongoid in the second query -(which could be large). - - -.. 
_aggregation-pipeline-builder-dsl: - -Builder DSL -=========== - -Mongoid provides limited support for constructing the aggregation pipeline -itself using a high-level DSL. The following aggregation pipeline operators -are supported: - -- `$group `_ -- `$project `_ -- `$unwind `_ - -To construct a pipeline, call the corresponding aggregation pipeline methods -on a ``Criteria`` instance. Aggregation pipeline operations are added to the -``pipeline`` attribute of the ``Criteria`` instance. To execute the pipeline, -pass the ``pipeline`` attribute value to ``Collection#aggragegate`` method. - -For example, given the following models: - -.. code-block:: ruby - - class Tour - include Mongoid::Document - - embeds_many :participants - - field :name, type: String - field :states, type: Array - end - - class Participant - include Mongoid::Document - - embedded_in :tour - - field :name, type: String - end - -We can find out which states a participant visited: - -.. code-block:: ruby - - criteria = Tour.where('participants.name' => 'Serenity',). - unwind(:states). - group(_id: 'states', :states.add_to_set => '$states'). - project(_id: 0, states: 1) - - pp criteria.pipeline - # => [{"$match"=>{"participants.name"=>"Serenity"}}, - # {"$unwind"=>"$states"}, - # {"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}}, - # {"$project"=>{"_id"=>0, "states"=>1}}] - - Tour.collection.aggregate(criteria.pipeline).to_a - - -group -````` - -The ``group`` method adds a `$group aggregation pipeline stage -`_. - -The field expressions support Mongoid symbol-operator syntax: - -.. code-block:: ruby - - criteria = Tour.all.group(_id: 'states', :states.add_to_set => '$states') - criteria.pipeline - # => [{"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}}] - -Alternatively, standard MongoDB aggregation pipeline syntax may be used: - -.. 
code-block:: ruby - - criteria = Tour.all.group(_id: 'states', states: {'$addToSet' => '$states'}) - - -project -``````` - -The ``project`` method adds a `$project aggregation pipeline stage -`_. - -The argument should be a Hash specifying the projection: - -.. code-block:: ruby - - criteria = Tour.all.project(_id: 0, states: 1) - criteria.pipeline - # => [{"$project"=>{"_id"=>0, "states"=>1}}] - - -.. _unwind-dsl: - -unwind -`````` - -The ``unwind`` method adds an `$unwind aggregation pipeline stage -`_. - -The argument can be a field name, specifiable as a symbol or a string, or -a Hash or a ``BSON::Document`` instance: - -.. code-block:: ruby - - criteria = Tour.all.unwind(:states) - criteria = Tour.all.unwind('states') - criteria.pipeline - # => [{"$unwind"=>"$states"}] - - criteria = Tour.all.unwind(path: '$states') - criteria.pipeline - # => [{"$unwind"=>{:path=>"$states"}}] diff --git a/source/reference/associations.txt b/source/reference/associations.txt index e8f9c7b3..46d4c880 100644 --- a/source/reference/associations.txt +++ b/source/reference/associations.txt @@ -338,7 +338,7 @@ Querying Referenced Associations In most cases, efficient queries across referenced associations (and in general involving data or conditions or multiple collections) are performed using the aggregation pipeline. Mongoid helpers for constructing aggregation pipeline -queries are described in the :ref:`aggregation pipeline ` +queries are described in the :ref:`aggregation pipeline ` section. For simple queries, the use of aggregation pipeline may be avoided and diff --git a/source/reference/map-reduce.txt b/source/reference/map-reduce.txt index 2251ffa9..48aa0ae5 100644 --- a/source/reference/map-reduce.txt +++ b/source/reference/map-reduce.txt @@ -19,7 +19,7 @@ custom map/reduce jobs or simple aggregations. .. note:: The map-reduce operation is deprecated. 
- The :ref:`aggregation framework ` provides better + The :ref:`aggregation framework ` provides better performance and usability than map-reduce operations, and should be preferred for new development. diff --git a/source/working-with-data.txt b/source/working-with-data.txt index 2b8b1e91..96ab61dc 100644 --- a/source/working-with-data.txt +++ b/source/working-with-data.txt @@ -12,7 +12,7 @@ Working With Data reference/crud reference/queries reference/text-search - reference/aggregation + /aggregation reference/map-reduce reference/persistence-configuration reference/nested-attributes @@ -28,7 +28,7 @@ See the following sections to learn more about working with data in Mongoid: - :ref:`CRUD Operations ` - :ref:`Queries ` - :ref:`Text Search ` -- :ref:`Aggregation Pipeline ` +- :ref:`mongoid-aggregation` - :ref:`Map/Reduce ` - :ref:`Persistence Configuration ` - :ref:`Nested Attributes `