Commit d128f4f

jason-price-mongodb authored
DOCS-14901 aggregation refactor (#339) (#425)
* DOCS-14901 aggregation refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901-aggregation-refactor
* DOCS-14901 aggregation refactor
* DOCS-14901 aggregation refactor
* DOCS-14901 aggregation refactor
* DOCS-14901 aggregation refactor

Co-authored-by: jason-price-mongodb <[email protected]>
Co-authored-by: jason-price-mongodb <[email protected]>
1 parent e15b56f commit d128f4f

8 files changed (+318 -223 lines changed)

source/aggregation.txt

Lines changed: 42 additions & 48 deletions
@@ -21,12 +21,16 @@ results. You can use aggregation operations to:

 To perform aggregation operations, you can use:

-- :ref:`Aggregation pipelines <aggregation-framework>`
+- :ref:`Aggregation pipelines <aggregation-framework>`, which are the
+  preferred method for performing aggregations.

 - :ref:`Single purpose aggregation methods
-  <single-purpose-agg-operations>`
+  <single-purpose-agg-methods>`, which are simple but lack the
+  capabilities of an aggregation pipeline.

-- :ref:`Map-reduce functions <aggregation-map-reduce>`
+- :ref:`Map-reduce operations <aggregation-map-reduce>`, which are
+  deprecated starting in MongoDB 5.0. Instead, use an aggregation
+  pipeline.

 .. _aggregation-framework:

@@ -40,50 +44,39 @@ Aggregation Pipeline Example

 .. include:: /includes/aggregation-pipeline-example.rst

-For a runnable example, see :ref:`Complete Aggregation Pipeline
-Example <aggregation-pipeline-example>`.
+For runnable examples containing sample input documents, see
+:ref:`Complete Aggregation Pipeline Examples
+<aggregation-pipeline-examples>`.

-Aggregation Pipeline Stages and Operations
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. _single-purpose-agg-methods:

-The most basic pipeline stages provide *filters* that operate like
-queries and *document transformations* that modify the form
-of the output document.
+Single Purpose Aggregation Methods
+----------------------------------

-Other pipeline operations provide tools for grouping and sorting
-documents by specific field or fields as well as tools for aggregating
-the contents of arrays, including arrays of documents. In addition,
-pipeline stages can use :ref:`operators
-<aggregation-expression-operators>` for tasks such as calculating the
-average or concatenating a string.
+You can use the following single purpose aggregation methods to
+aggregate documents from a single collection:

-The pipeline provides efficient data aggregation using native
-operations within MongoDB, and is the preferred method for data
-aggregation in MongoDB.
+.. list-table::
+   :header-rows: 1
+   :widths: 50 50
+
+   * - Method
+     - Description

-The aggregation pipeline can operate on a
-:doc:`sharded collection </sharding>`.
+   * - :method:`db.collection.estimatedDocumentCount()`
+     - Returns an approximate count of the documents in a collection or
+       a view.

-The aggregation pipeline can use indexes to improve its performance
-during some of its stages. In addition, the aggregation pipeline has an
-internal optimization phase. See
-:ref:`aggregation-pipeline-operators-and-performance` and
-:doc:`/core/aggregation-pipeline-optimization` for details.
+   * - :method:`db.collection.count()`
+     - Returns a count of the number of documents in a collection or a
+       view.

-.. _single-purpose-agg-operations:
+   * - :method:`db.collection.distinct()`
+     - Returns an array of documents that have distinct values for the
+       specified field.

-Single Purpose Aggregation Operations
--------------------------------------
-
-MongoDB also provides :method:`db.collection.estimatedDocumentCount()`,
-:method:`db.collection.count()` and :method:`db.collection.distinct()`.
-
-All of these operations aggregate documents from a single collection.
-While these operations provide simple access to common aggregation
-processes, they lack the flexibility and capabilities of an aggregation
-pipeline.
-
-.. include:: /images/distinct.rst
+The single purpose aggregation methods are simple but lack the
+capabilities of an :ref:`aggregation pipeline <aggregation-framework>`.

 .. _aggregation-map-reduce:

@@ -92,21 +85,22 @@ Map-Reduce

 .. include:: /includes/fact-use-aggregation-not-map-reduce.rst

-Additional Features and Behaviors
----------------------------------
-
-For a feature comparison of the aggregation pipeline,
-map-reduce, and the special group functionality, see
+For a feature comparison of aggregation pipelines and map-reduce, see
 :doc:`/reference/aggregation-commands-comparison`.

 Learn More
 ----------

-Practical MongoDB Aggregations E-Book
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To learn more about aggregations, see:
+
+- :ref:`aggregation-pipeline`
+
+- :ref:`aggregation-expression-operators`
+
+- :ref:`aggregation-pipeline-operator-reference`

-For more information on aggregations, read the `Practical MongoDB
-Aggregations <https://www.practical-mongodb-aggregations.com>`__ e-book.
+- `Practical MongoDB Aggregations
+  <https://www.practical-mongodb-aggregations.com>`_


 .. toctree::
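
For illustration, the single purpose methods in the table added to this file can be set beside pipeline stages that give comparable results. This is a minimal sketch, not part of the commit: the ``orders`` collection, the ``status`` field, and the ``compareMethods`` helper are hypothetical names, and the function is meant to be called in ``mongosh``, where ``db`` is the current database.

```javascript
// Illustrative only: `orders` and `status` are hypothetical names.

// Pipeline that counts the documents that reach the $count stage.
const countPipeline = [{ $count: "total" }];

// Pipeline that emits one document per distinct `status` value.
const distinctPipeline = [{ $group: { _id: "$status" } }];

// Meant to be called in mongosh, where `db` is the current database.
function compareMethods(db) {
  return {
    // Single purpose methods: simple, one call each.
    estimated: db.orders.estimatedDocumentCount(),
    distinctValues: db.orders.distinct("status"),
    // Pipeline counterparts: more verbose, but further stages can be added.
    countFromPipeline: db.orders.aggregate(countPipeline).toArray(),
    distinctFromPipeline: db.orders.aggregate(distinctPipeline).toArray()
  };
}
```

The pipeline forms are longer, but stages such as ``$match`` or ``$sort`` can be added to them, which the single purpose methods do not allow.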

source/core/aggregation-pipeline-optimization.txt

Lines changed: 77 additions & 6 deletions
@@ -19,6 +19,12 @@ include the :method:`explain <db.collection.aggregate()>` option in the

 .. include:: /includes/fact-optimizations-subject-to-change.rst

+In addition to learning about the aggregation pipeline optimizations
+performed during the optimization phase, you will also see how to
+improve aggregation pipeline performance using indexes and document
+filters. See
+:ref:`aggregation-pipeline-optimization-indexes-and-filters`.
+
 .. _aggregation-pipeline-projection-optimization:

 Projection Optimization
@@ -113,11 +119,11 @@ use any values computed in either the :pipeline:`$project` or

 .. note::

-   After optimization, the filter ``{ name: "Joe Schmoe" }`` is in
-   a :pipeline:`$match` stage at the beginning of the pipeline. This has
+   After optimization, the filter ``{ name: "Joe Schmoe" }`` is in a
+   :pipeline:`$match` stage at the beginning of the pipeline. This has
    the added benefit of allowing the aggregation to use an index on the
-   ``name`` field when initially querying the collection.
-   See :ref:`aggregation-pipeline-operators-and-performance` for more
+   ``name`` field when initially querying the collection. See
+   :ref:`aggregation-pipeline-optimization-indexes-and-filters` for more
    information.

 .. _agg-sort-match-optimization:
@@ -152,8 +158,9 @@ can sometimes add a portion of the :pipeline:`$match` stage before the
 :pipeline:`$redact` stage. If the added :pipeline:`$match` stage is at
 the start of a pipeline, the aggregation can use an index as well as
 query the collection to limit the number of documents that enter the
-pipeline. See :ref:`aggregation-pipeline-operators-and-performance` for
-more information.
+pipeline. See
+:ref:`aggregation-pipeline-optimization-indexes-and-filters` for more
+information.

 For example, if the pipeline consists of the following stages:

@@ -403,6 +410,70 @@ When the |sbe| is used for :pipeline:`$group`, the :ref:`explain results
 - ``explain.explainVersion: '2'``
 - ``explain.queryPlanner.winningPlan.queryPlan.stage: "GROUP"``

+.. _aggregation-pipeline-optimization-indexes-and-filters:
+
+Improve Performance with Indexes and Document Filters
+-----------------------------------------------------
+
+The following sections show how you can improve aggregation performance
+using indexes and document filters.
+
+Indexes
+~~~~~~~
+
+The :ref:`query planner <query-plans-query-optimization>` analyzes
+an aggregation pipeline to determine if :ref:`indexes <indexes>`
+can be used to improve pipeline performance.
+
+The following list shows some pipeline stages that can use indexes:
+
+``$match`` stage
+  :pipeline:`$match` can use an index to filter documents if
+  :pipeline:`$match` is the first stage in a pipeline.
+
+``$sort`` stage
+  :pipeline:`$sort` can use an index if :pipeline:`$sort` is not
+  preceded by a :pipeline:`$project`, :pipeline:`$unwind`, or
+  :pipeline:`$group` stage.
+
+``$group`` stage
+  :pipeline:`$group` can potentially use an index to find the first
+  document in each group if:
+
+  - :pipeline:`$group` is preceded by :pipeline:`$sort` that sorts the
+    field to group by, and
+
+  - there is an index on the grouped field that matches the sort order,
+    and
+
+  - :group:`$first` is the only accumulator in :pipeline:`$group`.
+
+  See :ref:`group-pipeline-optimization` for an example.
+
+``$geoNear`` stage
+  :pipeline:`$geoNear` can use a geospatial index. :pipeline:`$geoNear`
+  must be the first stage in an aggregation pipeline.
+
+Indexes can :ref:`cover <read-operations-covered-query>` queries in an
+aggregation pipeline. A covered query uses an index to return all of the
+documents and has high performance.
+
+Document Filters
+~~~~~~~~~~~~~~~~
+
+If your aggregation operation requires only a subset of the documents in
+a collection, filter the documents first:
+
+- Use the :pipeline:`$match`, :pipeline:`$limit`, and :pipeline:`$skip`
+  stages to restrict the documents that enter the pipeline.
+
+- When possible, put :pipeline:`$match` at the beginning of the pipeline
+  to use indexes that scan the matching documents in a collection.
+
+- :pipeline:`$match` followed by :pipeline:`$sort` at the start of the
+  pipeline is equivalent to a single query with a sort, and can use an
+  index.
+
 Example
 -------
 .. _agg-sort-skip-limit-sequence:
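
The index and filter guidance added in this hunk can be sketched as two pipelines. This is a hedged illustration, not part of the commit: the ``orders`` and ``foo`` collections, their fields, the index definitions, and the ``explainBoth`` helper are assumptions chosen for the example. Whether an index is actually used depends on the query planner, so the sketch only asks for explain output and does not predict it.

```javascript
// Illustrative only: collection and field names below are hypothetical.

// $match is the first stage and $sort follows it directly. Per the text
// above, this is equivalent to a single query with a sort, so an index
// such as { status: 1, orderDate: -1 } may be used.
const filterThenSort = [
  { $match: { status: "A" } },
  { $sort: { orderDate: -1 } },
  // $limit further restricts the documents passed to later stages.
  { $limit: 10 }
];

// $sort on the grouped field, then $group with $first as its only
// accumulator. A matching index such as { x: 1, y: 1 } can potentially be
// used to find the first document in each group.
const firstPerGroup = [
  { $sort: { x: 1, y: 1 } },
  { $group: { _id: "$x", y: { $first: "$y" } } }
];

// Meant for mongosh, where `db` is the current database handle.
function explainBoth(db) {
  db.orders.createIndex({ status: 1, orderDate: -1 });
  db.foo.createIndex({ x: 1, y: 1 });
  return [
    db.orders.explain().aggregate(filterThenSort),
    db.foo.explain().aggregate(firstPerGroup)
  ];
}
```

Inspecting the explain output for each pipeline shows which plan the query planner selected for that deployment and data set.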
