@@ -121,12 +121,12 @@ pipeline aggregates all of these documents, producing a document for each month:
121121Without going into more details on this, even if we were to comment on parts of
122122the aggregation pipeline to explain what it does, there will still be a high
123123cognitive load when going through the aggregation pipeline. One reason for this
124- is that any PHP editor will not know that this is an aggregation pipeline, and
125- thus can't provide much help beyond syntax highlighting (e.g. "this is a string
126- in an array"). Couple that with a few levels of nesting, and you've got yourself
127- this magical kind of code that you can write, but not read. We can of course
128- refactor this code, but before we get into that, we want to move away from these
129- array structures.
124+ is that the only way to express the aggregation framework domain-specific
125+ language (DSL) is through untyped arrays, and any PHP editor can't provide much
126+ help beyond syntax highlighting. Pair that with a few levels of nesting, and
127+ you've got yourself the kind of code that you can write, but not read. We could
128+ start off by refactoring the code, but instead let's try to move away from array
129+ structures and use a better solution .
130130
131131## Introducing the Aggregation Pipeline Builder
132132
@@ -137,6 +137,12 @@ and operators. Here is that same pipeline as we had before, this time written
137137with the aggregation pipeline builder:
138138
139139``` php
140+ use MongoDB\Builder\Accumulator;
141+ use MongoDB\Builder\Expression;
142+ use MongoDB\Builder\Pipeline;
143+ use MongoDB\Builder\Stage;
144+ use function MongoDB\object;
145+
140146$pipeline = new Pipeline(
141147 Stage::group(
142148 _id: object(
@@ -209,7 +215,9 @@ that can receive an aggregation pipeline, such as `Collection::aggregate` or
209215` Collection::watch ` . In addition, methods like ` Collection::updateMany ` and
210216` Collection::findOneAndUpdate ` can receive a ` Pipeline ` instance to run an
211217update. Keep in mind that you won't be able to use all available aggregation
212- pipeline stages in update operations.
218+ pipeline stages in update operations. Since a pipeline instance could be used
219+ everywhere, the builder will allow you to add stages to a pipeline that aren't
220+ supported in updates.
213221
214222### Builder Design
215223
@@ -223,7 +231,8 @@ expression that resolves to a date, timestamp, or ObjectId. While we could use
223231` $reportDate ` to reference the ` reportDate ` field of the document being evaluated,
224232` dateFieldPath ` is more expressive and shows intent of receiving a date field.
225233This also allows IDEs like PhpStorm to make better suggestions when offering
226- code completion.
234+ code completion, as well as a certain amount of type checking to ensure you're
235+ not creating pipelines that would result in server errors.
227236
228237For all expressions, there are factory classes with methods to create the
229238expression objects. The use of static methods makes the code a little more
@@ -314,17 +323,17 @@ to since they are only used once, but you may want to do so in favor of consiste
314323In complex pipelines, you'll often find comments explaining what a certain
315324pipeline stage or segment does. You should definitely include comments like
316325that, but you can also consider extracting parts of a pipeline to your own
317- builder method. If you choose a descriptive method name, this can already
318- explain what the stage or segment does, without the reader having to see the
319- internal workings.
326+ builder method. Choosing a descriptive method name that explains what the stage
327+ or segment does can also remove the need for a comment and spare the reader from
328+ having to dive into the internal workings.
320329
321330``` php
322331public static function groupAndComputeStatistics(
323- stdClass $_id ,
332+ stdClass $groupBy ,
324333 Expression\ResolvesToDouble $price,
325334): GroupStage {
326335 return Stage::group(
327- _id: $_id ,
336+ _id: $groupBy ,
328337 lowest: Accumulator::min($price),
329338 highest: Accumulator::max($price),
330339 average: Accumulator::avg($price),
@@ -342,7 +351,7 @@ $reportDate = Expression::dateFieldPath('reportDate');
342351$price = Expression::doubleFieldPath('price');
343352
344353self::groupAndComputeStatistics(
345- _id : object(
354+ groupBy : object(
346355 year: Expression::year($reportDate),
347356 month: Expression::month($reportDate),
348357 fuelType: Expression::fieldPath('fuelType'),
@@ -366,7 +375,7 @@ return a `Pipeline` instance:
366375
367376``` php
368377public static function groupAndAssembleFuelTypePriceObject(
369- stdClass $_id ,
378+ stdClass $groupBy ,
370379 Expression\ResolvesToString $fuelType,
371380 Expression\ResolvesToInt $count,
372381 Expression\ResolvesToDouble $lowest,
@@ -375,7 +384,7 @@ public static function groupAndAssembleFuelTypePriceObject(
375384): Pipeline {
376385 return new Pipeline(
377386 Stage::group(
378- _id: $_id ,
387+ _id: $groupBy ,
379388 count: Accumulator::sum($count),
380389 prices: Accumulator::push(
381390 object(
@@ -409,7 +418,7 @@ the pipeline:
409418
410419``` php
411420self::groupAndAssembleFuelTypePriceObject(
412- _id : object(
421+ groupBy : object(
413422 year: Expression::fieldPath('_id.year'),
414423 month: Expression::fieldPath('_id.month'),
415424 brand: Expression::fieldPath('_id.brand'),
@@ -432,7 +441,7 @@ $price = Expression::doubleFieldPath('price');
432441
433442$pipeline = new Pipeline(
434443 self::groupAndComputeStatistics(
435- _id : object(
444+ groupBy : object(
436445 year: Expression::year($reportDate),
437446 month: Expression::month($reportDate),
438447 fuelType: Expression::fieldPath('fuelType'),
@@ -441,7 +450,7 @@ $pipeline = new Pipeline(
441450 price: $price,
442451 ),
443452 self::groupAndAssembleFuelTypePriceObject(
444- _id : object(
453+ groupBy : object(
445454 year: Expression::fieldPath('_id.year'),
446455 month: Expression::fieldPath('_id.month'),
447456 brand: Expression::fieldPath('_id.brand'),
@@ -453,7 +462,7 @@ $pipeline = new Pipeline(
453462 average: Expression::fieldPath('average'),
454463 ),
455464 self::groupBrandsAndSort(
456- _id : object(
465+ groupBy : object(
457466 year: Expression::fieldPath('_id.year'),
458467 month: Expression::fieldPath('_id.month'),
459468 ),
@@ -588,16 +597,16 @@ public static function computeElapsedSecondsOnDay(
588597 * argument.
589598 */
590599public static function computeDurationBetweenDates(
591- Expression\ResolvesToDate $previousReportDate ,
592- Expression\ResolvesToDate $reportDate ,
600+ Expression\ResolvesToDate $startDate ,
601+ Expression\ResolvesToDate $endDate ,
593602): Expression\ResolvesToInt {
594603 return Expression::min(
595604 Expression::dateDiff(
596- startDate: $previousReportDate ,
597- endDate: $reportDate ,
605+ startDate: $startDate ,
606+ endDate: $endDate ,
598607 unit: 'second',
599608 ),
600- self::computeElapsedSecondsOnDay($reportDate ),
609+ self::computeElapsedSecondsOnDay($endDate ),
601610 );
602611}
603612```
@@ -638,7 +647,7 @@ This would include all the logic included in the pipeline. By extracting complex
638647parts into separate methods, you can test those in isolation under controlled
639648conditions, knowing that they behave as you intended. Looking at the last
640649example where we've extracted the expression that computes the duration between
641- two dates, you can now easily verify that all the individual parts of that
650+ two dates, you can now better verify that all the individual parts of that
642651complex expression work as expected.
643652
644653Just beware of premature abstractions: extract complex logic with the goal of
@@ -656,12 +665,12 @@ but they allow us to know what types an expression will resolve to.
656665For an example, let's take the ` $hour ` operator. When you use
657666` Expression::hour ` , you will receive an instance of an ` HourOperator ` class.
658667This class implements an ` OperatorInterface ` , telling the builder that this is
659- an operator that can be used in aggregation pipeline. It also implements a
660- ` ResolvesToInt ` interface, as we always know that evaluating the expression
661- results in an integer value. The required ` date ` parameter of the operator is a
662- date, which in the builder is one of many things. It could be a BSON
663- ` UTCDateTime ` instance, but it could also be the result of any operator that
664- returns a date, e.g. ` $dateFromString ` .
668+ an operator that can be used in an aggregation pipeline stage . It also
669+ implements a ` ResolvesToInt ` interface, as we always know that evaluating the
670+ expression results in an integer value. The required ` date ` parameter of the
671+ operator is a date, which in the builder is one of many things. It could be a
672+ ` MongoDB\BSON\ UTCDateTime` instance, but it could also be the result of any
673+ operator that returns a date, e.g. ` $dateFromString ` .
665674
666675Now that we know about these value holder objects, we still need to make sure
667676the server knows what we're talking about. When you call ` Collection::aggregate `
@@ -674,8 +683,8 @@ representations.
674683This allows us to keep the logic customizable. For example, the Doctrine MongoDB
675684ODM allows users to specify different names for fields in the database. In turn,
676685it needs to convert the name of a property in the mapped class to the name of
677- the field in the database. Such a feature could easily be built by creating a
678- custom encoder for all ` fieldPath ` expressions, and changing the field path
686+ the field in the database. Such a feature could be built by creating a custom
687+ encoder for all ` fieldPath ` expressions, and changing the field path
679688accordingly.
680689
681690When creating a ` MongoDB\Client ` instance, you can now pass an additional
@@ -690,7 +699,7 @@ With factories, value holders, and encoders, we wanted to ensure that creating
690699the builder does not turn into a repetitive chore. As you can imagine, many
691700operators will mostly consist of the same logic, resulting in tons of code
692701duplication. To make matters worse, every new server version may introduce new
693- operators and stages, so we wanted to make sure that we can easily expand
702+ operators and stages, so we wanted to make sure that we can quickly expand
694703the builder.
695704
696705We could try to rely on generative AI to help us with this, but this only goes
@@ -706,7 +715,6 @@ before and adds them to the tests. In these tests, we manually write the builder
706715code that would generate the pipeline or expression from the example. This
707716ensures that the generated logic behaves as we expect, is protected from
708717regression should we make any changes to the generator, and it allows us to use
709- the builder and feel what it's like. To top it all off, we could add this code
710- to the server documentation, similar to how we add other language-specific code
711- snippets. We're not quite there yet, but we'd love to include this in the
712- documentation.
718+ the builder and feel what it's like. Ultimately, we love to add this builder
719+ code to the server documentation, similar to how we add other language-specific
720+ code snippets, but we're not quite there yet.
0 commit comments