Commit 537ad11

DOCSP-38417 - Aggregation (#69)
1 parent 0f8ac56 commit 537ad11

9 files changed

+267
-17
lines changed

snooty.toml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -6,6 +6,8 @@ toc_landing_pages = [
66
"/read",
77
"/connect",
88
"/indexes",
9+
"/aggregation",
10+
"/aggregation/aggregation-tutorials",
911
"/security",
1012
"/aggregation-tutorials",
1113
]

source/aggregation.txt

Lines changed: 259 additions & 11 deletions
@@ -1,17 +1,265 @@
1-
To run an explain plan for this aggregation use
2-
`PyMongoExplain <https://pypi.org/project/pymongoexplain/>`_,
3-
a companion library for PyMongo. It allows you to explain any CRUD operation
4-
by providing a few convenience classes:
1+
.. _pymongo-aggregation:
52

6-
.. code-block:: python
3+
====================================
4+
Transform Your Data with Aggregation
5+
====================================
6+
7+
.. facet::
8+
:name: genre
9+
:values: reference
10+
11+
.. meta::
12+
:keywords: code example, transform, computed, pipeline
13+
:description: Learn how to use {+driver-short+} to perform aggregation operations.
14+
15+
.. contents:: On this page
16+
:local:
17+
:backlinks: none
18+
:depth: 2
19+
:class: singlecol
20+
21+
.. toctree::
22+
:titlesonly:
23+
:maxdepth: 1
24+
25+
/aggregation/aggregation-tutorials
26+
27+
Overview
28+
--------
29+
30+
In this guide, you can learn how to use {+driver-short+} to perform
31+
**aggregation operations**.
32+
33+
Aggregation operations process data in your MongoDB collections and
34+
return computed results. The MongoDB Aggregation framework, which is
35+
part of the Query API, is modeled on the concept of data processing
36+
pipelines. Documents enter a pipeline that contains one or more stages,
37+
and this pipeline transforms the documents into an aggregated result.
38+
39+
An aggregation operation is similar to a car factory. A car factory has
40+
an assembly line, which contains assembly stations with specialized
41+
tools to do specific jobs, like drills and welders. Raw parts enter the
42+
factory, and then the assembly line transforms and assembles them into a
43+
finished product.
44+
45+
The **aggregation pipeline** is the assembly line, **aggregation stages** are the
46+
assembly stations, and **operator expressions** are the
47+
specialized tools.
48+
49+
Aggregation Versus Find Operations
50+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51+
52+
You can use find operations to perform the following actions:
53+
54+
- Select which documents to return
55+
- Select which fields to return
56+
- Sort the results
57+
58+
You can use aggregation operations to perform the following actions:
59+
60+
- Perform find operations
61+
- Rename fields
62+
- Calculate fields
63+
- Summarize data
64+
- Group values
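
As a rough illustration of the first item in the preceding list, the filter, sort, and projection of a find operation can be expressed as ``$match``, ``$sort``, and ``$project`` stages. The helper name ``find_as_pipeline`` and the ``name`` field are assumptions made for this sketch and are not part of {+driver-short+}. The code only builds the pipeline list and does not contact a server.

```python
def find_as_pipeline(query_filter, projection=None, sort=None):
    """Build an aggregation pipeline that mirrors a find operation.

    Hypothetical helper for illustration; not part of PyMongo.
    """
    # $match plays the role of the find filter
    pipeline = [{"$match": query_filter}]
    if sort:
        # sort is a list of (field, direction) pairs, as accepted by find()
        pipeline.append({"$sort": dict(sort)})
    if projection:
        # $project plays the role of the find projection
        pipeline.append({"$project": projection})
    return pipeline


pipeline = find_as_pipeline(
    {"cuisine": "Bakery"},
    projection={"name": 1, "borough": 1, "_id": 0},
    sort=[("name", 1)],
)
print(pipeline)
```

You could pass the resulting list to ``collection.aggregate()`` and then append further stages, such as ``$group``, that a find operation cannot express.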
65+
66+
Limitations
67+
~~~~~~~~~~~
768

8-
>>> from pymongoexplain import ExplainableCollection
9-
>>> ExplainableCollection(collection).aggregate(pipeline)
10-
{'ok': 1.0, 'queryPlanner': [...]}
69+
Keep the following limitations in mind when using aggregation operations:
1170

12-
Or, use the the ``~pymongo.database.Database.command`` method method:
71+
- Returned documents must not violate the
72+
:manual:`BSON document size limit </reference/limits/#mongodb-limit-BSON-Document-Size>`
73+
of 16 megabytes.
74+
- Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this
75+
limit by using the ``allowDiskUse`` keyword argument of the
76+
``aggregate()`` method.
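
The following minimal sketch shows how the ``allowDiskUse`` keyword argument might be passed to ``aggregate()``. The wrapper function ``aggregate_with_disk_use`` and the ``RecordingCollection`` stand-in class are hypothetical and exist only so that the example runs without a MongoDB deployment. With a real collection, you would pass the collection object itself.

```python
def aggregate_with_disk_use(collection, pipeline):
    """Run a pipeline and allow stages to write temporary files to disk.

    Hypothetical wrapper; `collection` is expected to behave like a
    PyMongo Collection.
    """
    return collection.aggregate(pipeline, allowDiskUse=True)


class RecordingCollection:
    """Stand-in that records calls instead of contacting MongoDB."""

    def __init__(self):
        self.calls = []

    def aggregate(self, pipeline, **kwargs):
        # Remember the pipeline and keyword arguments that were passed
        self.calls.append((pipeline, kwargs))
        return iter([])


fake = RecordingCollection()
pipeline = [{"$sort": {"borough": 1}}]
list(aggregate_with_disk_use(fake, pipeline))
print(fake.calls)
```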
77+
78+
.. important:: $graphLookup exception
79+
80+
The :manual:`$graphLookup
81+
</reference/operator/aggregation/graphLookup/>` stage has a strict
82+
memory limit of 100 megabytes and ignores the ``allowDiskUse`` parameter.
83+
84+
Aggregation Example
85+
-------------------
86+
87+
.. note::
88+
89+
This example uses the ``sample_restaurants.restaurants`` collection
90+
from the :atlas:`Atlas sample datasets </sample-data>`. To learn how to create a
91+
free MongoDB Atlas cluster and load the sample datasets, see :ref:`<pymongo-get-started>`.
92+
93+
To perform an aggregation, pass a list of aggregation stages to the
94+
``collection.aggregate()`` method.
95+
96+
The following code example produces a count of the number of bakeries in each borough
97+
of New York. To do so, it uses an aggregation pipeline with the following stages:
98+
99+
- A :manual:`$match </reference/operator/aggregation/match/>` stage to filter for documents
100+
whose ``cuisine`` field contains the value ``"Bakery"``.
101+
102+
- A :manual:`$group </reference/operator/aggregation/group/>` stage to group the matching
103+
documents by the ``borough`` field, accumulating a count of documents for each distinct
104+
value.
13105

14106
.. code-block:: python
107+
:copyable: true
108+
109+
# Define an aggregation pipeline with a match stage and a group stage
110+
pipeline = [
111+
{ "$match": { "cuisine": "Bakery" } },
112+
{ "$group": { "_id": "$borough", "count": { "$sum": 1 } } }
113+
]
114+
115+
# Execute the aggregation
116+
aggCursor = collection.aggregate(pipeline)
117+
118+
# Print the aggregated results
119+
for document in aggCursor:
120+
    print(document)
121+
122+
The preceding code example produces output similar to the following:
123+
124+
.. code-block:: javascript
125+
126+
{'_id': 'Bronx', 'count': 71}
127+
{'_id': 'Brooklyn', 'count': 173}
128+
{'_id': 'Missing', 'count': 2}
129+
{'_id': 'Manhattan', 'count': 221}
130+
{'_id': 'Queens', 'count': 204}
131+
{'_id': 'Staten Island', 'count': 20}
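
To make the two stages more concrete, the following sketch reproduces their logic in plain Python on a small in-memory list. The four sample documents are invented for illustration and are not taken from the ``sample_restaurants.restaurants`` collection. The code approximates what ``$match`` and ``$group`` compute and does not describe how MongoDB executes them.

```python
from collections import Counter

# Hypothetical sample documents; the real collection is much larger
restaurants = [
    {"name": "A", "cuisine": "Bakery", "borough": "Bronx"},
    {"name": "B", "cuisine": "Bakery", "borough": "Queens"},
    {"name": "C", "cuisine": "Pizza", "borough": "Queens"},
    {"name": "D", "cuisine": "Bakery", "borough": "Queens"},
]

# $match: keep only documents whose cuisine is "Bakery"
matched = [doc for doc in restaurants if doc["cuisine"] == "Bakery"]

# $group: count the matched documents for each distinct borough
counts = Counter(doc["borough"] for doc in matched)

# Shape the counts like the documents that the $group stage returns
results = [{"_id": borough, "count": count} for borough, count in counts.items()]
for document in results:
    print(document)
```

In this sketch the results follow the order in which each borough is first seen. MongoDB does not guarantee an output order for ``$group`` unless the pipeline includes a ``$sort`` stage.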
132+
133+
Explain an Aggregation
134+
~~~~~~~~~~~~~~~~~~~~~~
135+
136+
To view information about how MongoDB executes your operation, you can
137+
instruct MongoDB to **explain** it. When MongoDB explains an operation, it returns
138+
**execution plans** and performance statistics. An execution
139+
plan is a potential way MongoDB can complete an operation.
140+
When you instruct MongoDB to explain an operation, it returns both the
141+
plan MongoDB executed and any rejected execution plans.
142+
143+
To explain an aggregation operation, you can use either the
144+
`PyMongoExplain <https://pypi.org/project/pymongoexplain/>`__ library or a database
145+
command. Select the corresponding tab below to see an example of each method.
146+
147+
.. tabs::
148+
149+
.. tab:: PyMongoExplain
150+
:tabid: pymongoexplain
151+
152+
Use pip to install the ``pymongoexplain`` library, as shown in the
153+
following example:
154+
155+
.. code-block:: sh
156+
157+
python3 -m pip install pymongoexplain
158+
159+
The following code example runs the preceding aggregation example and prints the explanation
160+
returned by MongoDB:
161+
162+
.. io-code-block::
163+
:copyable: true
164+
165+
.. input::
166+
:language: python
167+
168+
from pymongoexplain import ExplainableCollection

# Define an aggregation pipeline with a match stage and a group stage
169+
pipeline = [
170+
{ "$match": { "cuisine": "Bakery" } },
171+
{ "$group": { "_id": "$borough", "count": { "$sum": 1 } } }
172+
]
173+
174+
# Execute the operation and print the explanation
175+
result = ExplainableCollection(collection).aggregate(pipeline)
176+
print(result)
177+
178+
.. output::
179+
:language: javascript
180+
:visible: false
181+
182+
...
183+
'winningPlan': {'queryPlan': {'stage': 'GROUP',
184+
'planNodeId': 3,
185+
'inputStage': {'stage': 'COLLSCAN',
186+
'planNodeId': 1,
187+
'filter': {'cuisine': {'$eq': 'Bakery'}},
188+
'direction': 'forward'}},
189+
...
190+
191+
.. tab:: Database Command
192+
:tabid: db-command
193+
194+
The following code example runs the preceding aggregation example and prints the explanation
195+
returned by MongoDB:
196+
197+
.. io-code-block::
198+
:copyable: true
199+
200+
.. input::
201+
:language: python
202+
203+
# Define an aggregation pipeline with a match stage and a group stage
204+
pipeline = [
205+
{ "$match": { "cuisine": "Bakery" } },
206+
{ "$group": { "_id": "$borough", "count": { "$sum": 1 } } }
207+
]
208+
209+
# Execute the operation and print the explanation
210+
result = database.command("aggregate", "collection", pipeline=pipeline, explain=True)
211+
print(result)
212+
213+
.. output::
214+
:language: javascript
215+
216+
...
217+
'command': {'aggregate': 'collection',
218+
'pipeline': [{'$match': {'cuisine': 'Bakery'}},
219+
{'$group': {'_id': '$borough',
220+
'count': {'$sum': 1}}}],
221+
'explain': True,
222+
...
223+
224+
.. tip::
225+
226+
You can use Python's ``pprint`` module to make explanation results easier to read:
227+
228+
.. code-block:: python
229+
230+
import pprint
231+
...
232+
pprint.pp(result)
233+
234+
Additional Information
235+
----------------------
236+
237+
MongoDB Server Manual
238+
~~~~~~~~~~~~~~~~~~~~~
239+
240+
To view a full list of expression operators, see :manual:`Aggregation
241+
Operators </reference/operator/aggregation/>`.
242+
243+
To learn about assembling an aggregation pipeline and view examples, see
244+
:manual:`Aggregation Pipeline </core/aggregation-pipeline/>`.
245+
246+
To learn more about creating pipeline stages, see :manual:`Aggregation
247+
Stages </reference/operator/aggregation-pipeline/>`.
248+
249+
To learn more about explaining MongoDB operations, see
250+
:manual:`Explain Output </reference/explain-results/>` and
251+
:manual:`Query Plans </core/query-plans/>`.
252+
253+
Aggregation Tutorials
254+
~~~~~~~~~~~~~~~~~~~~~
255+
256+
To view step-by-step explanations of common aggregation tasks, see
257+
:ref:`pymongo-aggregation-tutorials-landing`.
258+
259+
API Documentation
260+
~~~~~~~~~~~~~~~~~
261+
262+
For more information about executing aggregation operations with {+driver-short+},
263+
see the following API documentation:
15264

16-
>>> db.command('aggregate', 'things', pipeline=pipeline, explain=True)
17-
{'ok': 1.0, 'stages': [...]}
265+
- `aggregate() <{+api-root+}pymongo/collection.html#pymongo.collection.Collection.aggregate>`__

source/aggregation-tutorials.txt renamed to source/aggregation/aggregation-tutorials.txt

Lines changed: 5 additions & 5 deletions
@@ -19,11 +19,11 @@ Aggregation Tutorials
1919

2020
.. toctree::
2121

22-
/aggregation-tutorials/filtered-subset/
23-
/aggregation-tutorials/group-total/
24-
/aggregation-tutorials/unpack-arrays/
25-
/aggregation-tutorials/one-to-one-join/
26-
/aggregation-tutorials/multi-field-join/
22+
/aggregation/aggregation-tutorials/filtered-subset/
23+
/aggregation/aggregation-tutorials/group-total/
24+
/aggregation/aggregation-tutorials/unpack-arrays/
25+
/aggregation/aggregation-tutorials/one-to-one-join/
26+
/aggregation/aggregation-tutorials/multi-field-join/
2727

2828
Overview
2929
--------

source/index.txt

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ MongoDB PyMongo Documentation
1818
/write-operations
1919
/read
2020
/indexes
21-
/aggregation-tutorials
21+
/aggregation
2222
/security
2323
/tools
2424
/faq
