@@ -4,10 +4,10 @@ Read Operations
44
55.. default-domain:: mongodb
66
7- This document how MongoDB performs read operations .
7+ Read operations determine how MongoDB returns collection data when you issue a query.
88
9- MongodDB uses read operations when you retrieve collection data by using
10- a query .
9+ This document describes how MongoDB performs read operations and how
10+ different factors affect the efficiency of reads.
1111
1212.. TODO intro and high-level read operations info
1313
@@ -20,6 +20,10 @@ a query.
2020Query Operations
2121----------------
2222
23+ Queries retrieve data from your database collections. How a query
24+ retrieves data is dependent on MongoDB read operations and on the
25+ indexes you have created.
26+
2327.. _read-operations-query-syntax:
2428
2529Query Syntax
@@ -29,111 +33,21 @@ For a list of query operators, see :doc:`/reference/operators`.
2933
3034.. TODO see the yet-to-be created query operations doc
3135
32- .. _read-operations-query-optimization:
33-
34- Query Optimization
35- ~~~~~~~~~~~~~~~~~~
36-
37- The MongoDB query optimizer matches a query to the best index for
38- performing that query. When the optimizer finds the best index, it
39- creates a query plan so that the query will always use the specified
40- index.
41-
42- The MongoDB query optimizer deletes a query plan when a collection has
43- changed to a point that the the specified index might no longer provide
44- the fastest results.
45-
46- Query plans take advantage of MongoDB's indexing features. You should
47- always write indexes that use the same fields and sort in the same order
48- as do your queries.
49-
50- MongoDB creates a query plan as follows: When you run a query for which
51- there is no query plan, either because the query is new or the old plan
52- is obsolete, the query optimizer runs the query against several indexes
53- at once in parallel. Though the optimizer queries the indexes in
54- parallel, it records the results as though all coming from one index.
55- The optimizer records all matches in a single common buffer.
56-
57- As each index yields a match, MongoDB records the match in the buffer.
58- If an index returns a result already returned by another index, the
59- optimizer recognizes the duplication and skips recording the match
60- a second time.
61-
62- The optimizer determines a "winning" index and stops querying when either of
63- the following occur:
64-
65- - The optimizer exhausts an index, which means that index has provided
66- the full result set the fastest.
67-
68- - The optimizer reaches 101 results. At that point, the optimizer
69- chooses the plan that has provided the most results *first* and
70- continues reading only from that plan. Note that another index might
71- have provided all those results as duplicates but because the "winning"
72- index provided the results faster, it is the most efficient index.
73-
74- The "winning" index now becomes the index specified in the query plan as
75- the one to use the next time that query is run.
76-
77- To evaluate the optimizer's choice of query plan, run the query again
78- with the :method:`explain() <cursor.explain()>` method and
79- :method:`hint() <cursor.hint()>` methods appended. This returns
80- statistics about how the query runs. (It returns the statistics in place
81- of returning the query results.)
82-
83- .. code-block:: javascript
84-
85- db.people.find( { name:"John"} ).explain().hint()
86-
87- .. For details on the output of the :method:`explain()
88- <cursor.explain()>` method, see ...
89-
90- If you run :method:`explain() <cursor.explain()>` without including
91- :method:`hint() <cursor.hint()>`, the query optimizer will re-evaluate
92- the query, running multiple query plans, before it returns the query
93- statistics. Unless you want the optimizer to re-evaluate the query, do
94- not leave off :method:`hint() <cursor.hint()>`.
95-
96- Because your collections will likely change over time, the query
97- optimizer uses the query plan only to a certain point.
98-
99- .. Order of buffer results is different because coming from different
100- indexes. Not ordered on one index.
101-
102- .. Sorting >> all query plans are ordered vs none vs some.
103-
104- .. "Optimal" is determined from a past run of multiple plans. But that
105- cache gets cleared if there's been multiple writes.
106-
107- .. Speculative scan of multiple plans.
108-
109- .. Sparce indexes can change a result set.
110-
111- .. Interleaving of results sets from multiple indexes ocurrs only when
112- query plan is being determined. Once query plan is cached, then it's
113- going to use one index.
114-
115- .. What validates a cache: 1000 doc writes (not write operations but
116- actual doc writes). Also if reindex or restart mongod.
117-
118- .. Interweaving/leaving plans is done with cursor.
119-
120- .. Dupe on disk lock and not on ID.
121-
122- .. First time it runs the query (the first time it picks a query plan),
123- it runs union of all query plans deemed to be potentially useful to
124- return results set. Second time run the same query, it runs a single
125- query plan.
36+ .. _read-operations-indexing:
12637
127- .. Therefore, you can run the same query twice in a row and get the
128- same results ordered differently.
38+ Indexes
39+ ~~~~~~~
12940
130- .. And when you run explain, you also get different statistics.
41+ Indexes significantly reduce the amount of work needed for query read
42+ operations. Indexes record specified keys and key values and the disk
43+ locations of the documents containing those values.
13144
132- .. END OF MY NOTES ON THE TECH TALK, EXCEPT FOR THE NOTES ON SPECIFIC
133- OPTIMIZATION OPERATORS, such as $elemMatch
45+ Indexes are typically stored in RAM *or* located sequentially on disk,
46+ and indexes are smaller than the documents they catalog. When a query
47+ can use an index, the read operation is significantly faster than when
48+ the query must scan all documents in a collection.
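
The difference between an indexed read and a collection scan can be sketched in plain JavaScript. This is only a toy model, not MongoDB's implementation: it uses a ``Map`` where MongoDB uses a B-tree, and the names ``people``, ``buildIndex``, ``scanFind``, and ``indexFind`` are hypothetical.

```javascript
// Toy model: a collection is an array of documents, and an "index" maps a
// key value to the positions of the documents that hold that value.
const people = [
  { name: "John", age: 31 },
  { name: "Ada", age: 36 },
  { name: "John", age: 52 },
  { name: "Grace", age: 45 },
];

// Build an index on one field: key value -> list of document positions.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(position);
  });
  return index;
}

// Without an index: every document in the collection is examined.
function scanFind(docs, field, value) {
  let examined = 0;
  const results = [];
  for (const doc of docs) {
    examined += 1;
    if (doc[field] === value) results.push(doc);
  }
  return { results, examined };
}

// With an index: look up the key, then fetch only the matching documents.
function indexFind(docs, index, value) {
  const positions = index.get(value) || [];
  return { results: positions.map((p) => docs[p]), examined: positions.length };
}

const nameIndex = buildIndex(people, "name");
console.log(scanFind(people, "name", "John").examined);     // 4
console.log(indexFind(people, nameIndex, "John").examined); // 2
```

Both functions return the same documents; only the number of documents examined differs, which is the cost the index avoids.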
13449
135- Selective Indexes Return Fastest Results
136- ````````````````````````````````````````
50+ MongoDB represents indexes internally as B-trees.
13751
13852The most selective indexes return the fastest results. The most
13953selective index possible for a given query is an index for which all the
@@ -169,91 +83,116 @@ documents that match the query criteria also match the entire query.
16983 Conversely, not all the documents that match the query's ``x`` key
17084 value also match the entire query.
17185
172- .. _read-operations-projection :
86+ .. seealso::
17387
174- Projection
175- ~~~~~~~~~~
88+ - The :doc:`/core/indexes` documentation, in particular :doc:`/applications/indexes`
89+ - :doc:`/reference/operators`
90+ - :method:`find <db.collection.find()>`
91+ - :method:`findOne`
17692
177- A projection specifies which field values a query should return for
178- matching documents. If you run a query *without* a projection, the query
179- returns all fields and values for matching documents, which can
180- add unnecessary network and deserialization costs.
93+ .. _read-operations-query-optimization:
18194
182- MongoDB provides special projection operators that let you specify the
183- fields to return. For documentation on each operator, click the operator name:
95+ Query Optimization
96+ ~~~~~~~~~~~~~~~~~~
18497
185- - :projection:`$elemMatch`
98+ MongoDB provides a query optimizer that matches a query to the index
99+ that performs the fastest read operation for that query.
186100
187- - :projection:`$slice`
101+ When you issue a query for the first time, the query optimizer runs the
102+ query against several indexes to find the most efficient. The optimizer
103+ then creates a "query plan" that specifies the index for future runs of
104+ the query.
188105
189- .. _read-operations-indexing:
106+ The MongoDB query optimizer deletes a query plan when a collection has
107+ changed to a point that the specified index might no longer provide
108+ the fastest results.
190109
191- Indexing
192- ~~~~~~~~
110+ Query plans take advantage of MongoDB's indexing features. You should
111+ always write indexes that use the same fields and that sort in the same
112+ order as do your queries. For more information, see :doc:`/applications/indexes`.
193113
194- Indexes significantly reduce the amount of work needed for query read
195- operations. Indexes record specified keys and key values and the disk
196- locations of the documents containing those values.
114+ MongoDB creates a query plan as follows: When you run a query for which
115+ there is no query plan, either because the query is new or the old plan
116+ is obsolete, the query optimizer runs the query against several indexes
117+ at once in parallel but records the results in a single common buffer,
118+ as though the results all come from the same index. As each index yields
119+ a match, MongoDB records the match in the buffer. If an index returns a
120+ result already returned by another index, the optimizer recognizes the
121+ duplication and skips the duplicate match.
122+
123+ The optimizer determines a "winning" index when either of
124+ the following occurs:
125+
126+ - The optimizer exhausts an index, which means that the index has
127+ provided the full result set. At this point, the optimizer stops
128+ querying.
129+
130+ - The optimizer reaches 101 results. At this point, the optimizer
131+ chooses the plan that has provided the most results *first* and
132+ continues reading only from that plan. Note that another index might
133+ have provided all those results as duplicates, but because the
134+ "winning" index provided those results first, it is more
135+ efficient.
197136
198- Without indexes, MongoDB must scan all documents to return query
199- results .
137+ The "winning" index now becomes the index specified in the query plan as
138+ the one to use the next time the query is run.
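
The selection process described above can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the optimizer's actual code: "parallel" querying is modeled as a round-robin over the candidate indexes, each index is an array of entries (a document id for a match, ``null`` for an entry that did not match), and ``choosePlan`` is a hypothetical name.

```javascript
// Run candidate indexes side by side, record matches in one shared buffer,
// skip duplicates, and pick a winner when an index is exhausted or the
// buffer reaches the result limit (101 in the description above).
function choosePlan(candidates, limit = 101) {
  const names = Object.keys(candidates);
  if (names.length === 0) throw new Error("no candidate indexes");

  const buffer = new Set();   // common result buffer, de-duplicated
  const contributed = {};     // matches each index recorded first
  for (const name of names) contributed[name] = 0;

  for (let step = 0; ; step++) {
    for (const name of names) {
      const entries = candidates[name];
      const id = step < entries.length ? entries[step] : null;
      if (id !== null && !buffer.has(id)) {
        buffer.add(id);
        contributed[name] += 1;
      }
      // Stop condition 1: this index has no more entries to read.
      if (step >= entries.length - 1) {
        return { winner: name, reason: "exhausted", results: [...buffer] };
      }
      // Stop condition 2: enough results; the index that recorded the
      // most results first wins.
      if (buffer.size >= limit) {
        const winner = names.reduce((best, n) =>
          contributed[n] > contributed[best] ? n : best);
        return { winner, reason: "limit", results: [...buffer] };
      }
    }
  }
}

const plan = choosePlan({
  byName: ["a", "b", "c"],
  byAge: [null, "a", null, null, "b", null, "c"],
});
console.log(plan.winner, plan.reason);
```

In this run the hypothetical ``byName`` index finishes after three entries while ``byAge`` is still working through non-matching entries, so ``byName`` is the index that would be stored in the query plan.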
200139
201- The order of index keys matters.
140+ To evaluate the optimizer's choice of query plan, run the query again
141+ with the :method:`hint() <cursor.hint()>` and
142+ :method:`explain() <cursor.explain()>` methods appended. Instead of returning
143+ query results, this returns statistics about how the query runs. For example:
202144
203- In order to fulfill a multi-field query using an index, the query
204- optimizer first searches the index for the first field in the query.
205- When the first instance of that entry is found, the query then searches
206- for the next field within the index entries for the first field.
145+ .. code-block:: javascript
207146
208- If you structure your index such that the first field ...
147+ db.people.find( { name: "John" } ).hint( { name: 1 } ).explain() // assumes an index on { name: 1 }
209148
210- As a general rule, a query where one term demands an exact match and
211- another specifies a range requires a com- pound index where the range
212- key comes second.
149+ For details on the output, see :method:`explain() <cursor.explain()>`.
213150
214- When you create indexes, you must do so with your queries in mind. A
215- query can use only one index and therefore you must create indexes that
216- include all the fields in a given query.
151+ .. note::
217152
218- Because indexes take up space and because MongoDB writes to an index
219- with every write to the database, you must also be careful with index
220- creation. Do not create indexes that duplicate each other. For example,
221- an index that queries on ``a`` and then ``b`` can be used for queries of
222- ``a`` then ``b`` as well as for queries of just ``a``. Do not have two
223- indexes where one will do .
153+ If you run :method:`explain() <cursor.explain()>` without including
154+ :method:`hint() <cursor.hint()>`, the query optimizer will
155+ re-evaluate the query and run against multiple indexes before
156+ returning the query statistics. Unless you want the optimizer to
157+ re-evaluate the query, do not leave off :method:`hint()
158+ <cursor.hint()>`.
224159
225- You can also speed read operations by eliminating unnecessary indexes.
160+ Because your collections will likely change over time, the query
161+ optimizer deletes a query plan and re-evaluates the indexes when any
162+ of the following occurs:
226163
227- Whenever you add a document to a collection, each index on that
228- collection must be modified to include the new document. So if a
229- particular collection has 10 indexes, then that makes 10 separate
230- structures to modify on each insert. This holds for any write operation,
231- whether you’re removing a document or updating a given document’s
232- indexed keys.
164+ - The number of writes to the collection reaches 1,000.
233165
234- For read-intensive applications, the cost of indexes is almost always
235- justified. Just realize that indexes do impose a cost and that they
236- therefore must be chosen with care. This means ensuring that all of your
237- indexes are used and that none of them are redundant. You can do this in
238- part by profiling your application’s queries.
166+ - You run the :dbcommand:`reIndex` command on the index.
239167
240- Reading from RAM is faster than reading from disk, so you must make sure
241- your indexes and working sets together fit into RAM. To check the size
242- of an index use the :method:`db.collection.totalIndexSize()` helper.
168+ - You restart :program:`mongod`.
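
The invalidation rules listed above can be modeled with a small in-memory cache. This is a sketch under assumptions, not MongoDB's internal data structure: the class name ``PlanCache``, its methods, and the query-shape keys are all hypothetical, and a restart is modeled simply as creating a new, empty cache.

```javascript
// Per-collection cache of query plans that is dropped after a fixed number
// of writes (1,000 in the list above) or when the index is rebuilt.
class PlanCache {
  constructor(writeLimit = 1000) {
    this.writeLimit = writeLimit;
    this.writes = 0;
    this.plans = new Map(); // query shape -> name of the winning index
  }

  // Returns undefined when no plan is cached, meaning the optimizer
  // would have to re-evaluate the candidate indexes.
  get(queryShape) {
    return this.plans.get(queryShape);
  }

  set(queryShape, indexName) {
    this.plans.set(queryShape, indexName);
  }

  clear() {
    this.plans.clear();
    this.writes = 0;
  }

  // Call once per document write to the collection.
  recordWrite() {
    this.writes += 1;
    if (this.writes >= this.writeLimit) this.clear();
  }

  // Rebuilding an index invalidates every cached plan.
  reIndex() {
    this.clear();
  }
}

const cache = new PlanCache();
cache.set("people:{name}", "name_1");
for (let i = 0; i < 1000; i++) cache.recordWrite();
console.log(cache.get("people:{name}")); // undefined
```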
243169
244- MongoDB represents indexes internally as B-trees.
170+ When the optimizer re-evaluates a query, the query returns the same
171+ results (assuming no data has changed) but might return them in
172+ a different order, and the :method:`explain() <cursor.explain()>` method,
173+ with or without :method:`hint() <cursor.hint()>`, might report different
174+ statistics. This is because the optimizer retrieves the results from
175+ several indexes at once during re-evaluation and the order in which
176+ results appear depends on the order of the indexes within the parallel
177+ querying.
245178
246- Use the different index types to keep your indexes to only the size
247- needed. For example, for queries that always return a document only if a
248- value exists for the search keys, use sparse indexes. Sparse indexes
249- take up less space than default indexes.
179+ .. _read-operations-projection:
250180
251- .. seealso::
181+ Projection
182+ ~~~~~~~~~~
252183
253- - The :doc:`/core/indexes` documentation, in particular :doc:`/applications/indexes`
254- - :doc:`/reference/operators`
255- - :method:`find <db.collection.find()>`
256- - :method:`findOne`
184+ A projection specifies which field values from an array a query should
185+ return for matching documents. If you run a query *without* a
186+ projection, the query returns all fields and values for matching
187+ documents, which can add unnecessary network and deserialization costs.
188+
189+ To run the most efficient queries, use the following projection
190+ operators whenever possible when you query on array values. For documentation
191+ on each operator, click the operator name:
192+
193+ - :projection:`$elemMatch`
194+
195+ - :projection:`$slice`
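
As a rough illustration of what an array projection saves, the following sketch models the single-number form of ``$slice`` on the client side. It is not how the server applies projections; the function ``sliceProjection`` and the ``post`` document are hypothetical.

```javascript
// Return a copy of the document in which one array field is trimmed:
// a non-negative n keeps the first n elements, a negative n keeps the
// last |n| elements. Other fields are returned unchanged.
function sliceProjection(doc, field, n) {
  const values = doc[field];
  if (!Array.isArray(values)) return { ...doc };
  const sliced = n >= 0 ? values.slice(0, n) : values.slice(n);
  return { ...doc, [field]: sliced };
}

const post = { title: "Reads", comments: ["c1", "c2", "c3", "c4"] };
console.log(sliceProjection(post, "comments", 2));  // keeps c1 and c2
console.log(sliceProjection(post, "comments", -1)); // keeps c4
```

The smaller the returned array, the less data crosses the network and the less the client has to deserialize.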
257196
258197.. _read-operations-aggregation:
259198
@@ -271,6 +210,23 @@ Aggregation
271210.. index:: read operation; architecture
272211.. _read-operations-architecture:
273212
213+ Query Operators that Cannot Use Indexes
214+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
215+
216+ Some query operators cannot take advantage of indexes and require a
217+ collection scan. When using these operators you can narrow the documents
218+ scanned by combining the operator with another operator that does use an
219+ index.
220+
221+ Operators that cannot use indexes include the following:
222+
223+ - :operator:`$nin`
224+
225+ - :operator:`$ne`
226+
227+ .. TODO Regular expressions queries also do not use an index.
228+ .. TODO :method:`cursor.skip()` can cause paginating large numbers of docs
229+
274230Architecture
275231------------
276232