
Commit e83d9de

(DOCS-10564): Sharding tutorial - distribute collections using zones (#2142)
* initial page setup
* WIP
* wip
* WIP
* finish balancer section
* fix widths
* add security prereq
* (DOCS-16064): Sharding tutorial - distribute collections using zones
* edits
* fix learn more
* add prereq
* fix code block highlighting
* typo
* wording
* WIP review edits
* finish review edits
* edits
* edits
* typo fixes
* updates per Asya's feedback
* wording
* remove extra heading
* review edits
* wording
* reorder
* add clarification
* clarify balancing behavior
* ordering
* wording
* alphabetize
* edits
1 parent da650da commit e83d9de

4 files changed

+260
-5
lines changed


source/core/zone-sharding.txt

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ violate any of the zones.
4545
/tutorial/sharding-tiered-hardware-for-varying-slas
4646
/tutorial/sharding-segmenting-shards
4747
/tutorial/sharding-high-availability-writes
48+
/tutorial/sharding-distribute-collections-with-zones
4849

4950
Behavior and Operations
5051
-----------------------
Lines changed: 5 additions & 5 deletions
@@ -1,5 +1,5 @@
1-
In sharded clusters, you can create :term:`zones <zone>` of sharded data based
2-
on the :term:`shard key`. You can associate each zone with one or more shards
3-
in the cluster. A shard can associate with any number of zones. In a balanced
4-
cluster, MongoDB migrates :term:`chunks <chunk>` covered by a zone only to
5-
those shards associated with the zone.
1+
In sharded clusters, you can create :term:`zones <zone>` of sharded data
2+
based on the :term:`shard key`. You can associate each zone with one or
3+
more shards in the cluster. A shard can associate with any number of
4+
zones. In a balanced cluster, MongoDB migrates :term:`chunks <chunk>`
5+
covered by a zone only to those shards associated with the zone.

source/tutorial/manage-shard-zone.txt

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
1+
.. _sharding-manage-zones:
2+
13
==================
24
Manage Shard Zones
35
==================
Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
1+
.. _sharding-tutorial-distribute-collections:
2+
3+
==================================
4+
Distribute Collections Using Zones
5+
==================================
6+
7+
.. default-domain:: mongodb
8+
9+
.. contents:: On this page
10+
:local:
11+
:backlinks: none
12+
:depth: 2
13+
:class: singlecol
14+
15+
.. include:: /includes/intro-zone-sharding.rst
16+
17+
You can use :ref:`zone sharding <zone-sharding>` to distribute
18+
collections across a sharded cluster and designate which shards store
19+
data for each collection. You can distribute collections based on shard
20+
properties, such as physical resources and available memory, to ensure
21+
that each collection is stored on the optimal shard for that data.
22+
23+
Prerequisites
24+
-------------
25+
26+
To complete this tutorial, you must:
27+
28+
- :ref:`Deploy a sharded cluster <sharding-procedure-setup>`. This
29+
tutorial uses a sharded cluster with three shards.
30+
31+
- Connect to a :program:`mongos`. You cannot create zones or zone ranges
32+
by connecting directly to a shard.
33+
34+
- Authenticate as a user with at least the :authrole:`clusterManager`
35+
role on the ``admin`` database. To view user permissions, use the
36+
:method:`db.getUser()` method.
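
As a rough illustration of the permission check, the sketch below inspects a user document shaped like the output of ``db.getUser()``, which lists roles as ``{ role, db }`` pairs. The helper name ``canManageZones``, the sample user, and the set of roles treated as sufficient are assumptions made for this example. It only looks at directly granted built-in roles and does not resolve custom roles that inherit ``clusterManager``.

```javascript
// Assumption: clusterAdmin and root both include the privileges of
// clusterManager, so any of the three is treated as sufficient here.
const SUFFICIENT_ROLES = new Set(["clusterManager", "clusterAdmin", "root"]);

// Hypothetical helper: check a user document shaped like db.getUser() output
// for a sufficient role granted on the admin database.
function canManageZones(userDoc) {
  if (!userDoc || !Array.isArray(userDoc.roles)) {
    return false;
  }
  return userDoc.roles.some(
    (r) => r.db === "admin" && SUFFICIENT_ROLES.has(r.role)
  );
}

// Made-up user document for illustration only.
const exampleUser = {
  user: "zoneAdmin",
  db: "admin",
  roles: [{ role: "clusterManager", db: "admin" }],
};

console.log(canManageZones(exampleUser)); // true
```

In ``mongosh`` you could pass the result of ``db.getSiblingDB("admin").getUser("<username>")`` to the helper instead of the sample document.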
37+
38+
Scenario
39+
--------
40+
41+
You have a database called ``shardDistributionDB`` that contains two
42+
sharded collections:
43+
44+
- ``bigData``, which contains a large amount of data.
45+
46+
- ``manyIndexes``, which contains many large indexes.
47+
48+
You want to limit each collection to a subset of shards so that each
49+
collection can use the shards' different physical resources.
50+
51+
Architecture
52+
~~~~~~~~~~~~
53+
54+
The sharded cluster has three shards. Each shard has unique physical
55+
resources:
56+
57+
.. list-table::
58+
:header-rows: 1
59+
:widths: 8 20
60+
61+
* - Shard Name
62+
- Physical Resources
63+
64+
* - ``shard0``
65+
- High memory capacity
66+
67+
* - ``shard1``
68+
- Fast flash storage
69+
70+
* - ``shard2``
71+
- High memory capacity **and** fast flash storage
72+
73+
Zones
74+
~~~~~
75+
76+
To distribute collections based on physical resources, use shard zones.
77+
A shard zone associates collections with a specific subset of shards,
78+
which restricts the shards that store the collection's data. In this
79+
example, you need two shard zones:
80+
81+
.. list-table::
82+
:header-rows: 1
83+
:widths: 10 15 20
84+
85+
* - Zone Name
86+
- Description
87+
- Collections in this Zone
88+
89+
* - ``HI_RAM``
90+
- Servers with high memory capacity.
91+
- Collections requiring more memory, such as collections with large
92+
indexes, should be on the ``HI_RAM`` shards.
93+
94+
* - ``FLASH``
95+
- Servers with flash drives for fast storage speeds.
96+
- Large collections requiring fast data retrieval should be on the
97+
``FLASH`` shards.
98+
99+
Shard Key
100+
~~~~~~~~~
101+
102+
In this tutorial, the :ref:`shard key <shard-key>` you will use to shard
103+
each collection is ``{ _id: "hashed" }``. You will configure shard zones
104+
**before** you shard the collections. As a result, each collection's
105+
data only ever exists on the shards in the corresponding zone.
106+
107+
With :ref:`hashed sharding <index-type-hashed>`, if you shard
108+
collections before you configure zones, MongoDB assigns :term:`chunks
109+
<chunk>` evenly between all shards when sharding is enabled. This means
110+
that chunks may be temporarily assigned to a shard poorly suited to
111+
handle that chunk's data.
112+
113+
Balancer
114+
~~~~~~~~
115+
116+
The :ref:`balancer <sharding-balancing>` migrates chunks to the
117+
appropriate shard, respecting any configured zones. When balancing is
118+
complete, each shard contains only chunks whose ranges match its assigned
119+
zones.
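
To make the zone-respecting behavior concrete, here is a toy model in plain JavaScript. It is not MongoDB's balancer algorithm: the function ``planZoneMoves``, the chunk records, and the "fewest chunks" tie-break are all assumptions made for illustration. It only shows the idea that a chunk sitting on a shard outside its zone is moved to a shard inside that zone.

```javascript
// Toy model only: this is NOT MongoDB's balancer implementation.
// Zone membership matches the zones used in this tutorial.
const zoneShards = {
  HI_RAM: ["shard0", "shard2"],
  FLASH: ["shard1", "shard2"],
};

// Hypothetical chunk records: the zone covering the chunk and its current shard.
const chunks = [
  { id: "c1", zone: "FLASH", shard: "shard0" },
  { id: "c2", zone: "FLASH", shard: "shard1" },
  { id: "c3", zone: "HI_RAM", shard: "shard1" },
  { id: "c4", zone: "HI_RAM", shard: "shard2" },
];

// Plan a move for every chunk that sits on a shard outside its zone.
// The destination is the in-zone shard that currently holds the fewest chunks.
function planZoneMoves(chunkList, zones) {
  const counts = {};
  for (const shards of Object.values(zones)) {
    for (const shard of shards) {
      counts[shard] = 0;
    }
  }
  for (const chunk of chunkList) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }

  const moves = [];
  for (const chunk of chunkList) {
    const allowed = zones[chunk.zone];
    if (allowed.includes(chunk.shard)) {
      continue; // already on a shard in the correct zone
    }
    const to = allowed.reduce((best, s) => (counts[s] < counts[best] ? s : best));
    counts[chunk.shard] -= 1;
    counts[to] += 1;
    moves.push({ id: chunk.id, from: chunk.shard, to });
  }
  return moves;
}

console.log(planZoneMoves(chunks, zoneShards));
```

With the sample data, chunk ``c1`` is planned to move from ``shard0`` to ``shard2`` and chunk ``c3`` from ``shard1`` to ``shard0``. Chunks ``c2`` and ``c4`` already sit on shards in their zones and stay in place.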
120+
121+
.. important:: Performance
122+
123+
Adding, removing, or changing zones or zone ranges can result in
124+
chunk migrations. Depending on the size of your dataset and the
125+
number of chunks a zone or zone range affects, these migrations may
126+
impact cluster performance. Consider running the balancer during
127+
specific scheduled windows. To learn how to set a scheduling window,
128+
see :ref:`sharding-schedule-balancing-window`.
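
As a sketch of what scheduling a window can look like, the balancer's active window is stored in the ``settings`` collection of the ``config`` database, in the document with ``_id: "balancer"``. The helper ``balancerWindowUpdate`` below is hypothetical and only builds the arguments for that update. The ``02:00`` to ``06:00`` window is a placeholder, not a recommendation.

```javascript
// Hypothetical helper: build the arguments for updating the balancer's
// active window. Times are "HH:MM" strings in 24-hour format.
function balancerWindowUpdate(start, stop) {
  const hhmm = /^([01]\d|2[0-3]):[0-5]\d$/;
  if (!hhmm.test(start) || !hhmm.test(stop)) {
    throw new Error("start and stop must use 24-hour HH:MM format");
  }
  return {
    filter: { _id: "balancer" },
    update: { $set: { activeWindow: { start, stop } } },
    options: { upsert: true },
  };
}

// Placeholder window chosen for illustration.
const balancerWindow = balancerWindowUpdate("02:00", "06:00");
console.log(JSON.stringify(balancerWindow.update));

// In mongosh, connected to a mongos, the call would look like:
// db.getSiblingDB("config").settings.updateOne(
//   balancerWindow.filter, balancerWindow.update, balancerWindow.options
// )
```

Building the arguments separately lets you validate the time format before anything is written to the ``config`` database.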
129+
130+
Steps
131+
-----
132+
133+
Use the following procedure to configure shard zones and distribute
134+
collections based on shard physical resources.
135+
136+
.. procedure::
137+
138+
.. step:: Add each shard to the appropriate zone.
139+
140+
To configure the shards in each zone, use the
141+
:method:`sh.addShardToZone()` method.
142+
143+
Add ``shard0`` and ``shard2`` to the ``HI_RAM`` zone:
144+
145+
.. code-block:: javascript
146+
147+
sh.addShardToZone("shard0", "HI_RAM")
148+
149+
sh.addShardToZone("shard2", "HI_RAM")
150+
151+
Add ``shard1`` and ``shard2`` to the ``FLASH`` zone:
152+
153+
.. code-block:: javascript
154+
155+
sh.addShardToZone("shard1", "FLASH")
156+
157+
sh.addShardToZone("shard2", "FLASH")
158+
159+
.. step:: Add zone ranges for the relevant collections.
160+
161+
To associate a range of
162+
shard key values with a zone, use :method:`sh.updateZoneKeyRange()`.
163+
164+
In this scenario, you want to associate all documents in a
165+
collection with the appropriate zone. To associate all of a
166+
collection's documents with a zone, specify the following zone range:
167+
168+
- A lower bound of ``{ "_id" : MinKey }``
169+
- An upper bound of ``{ "_id" : MaxKey }``
170+
171+
For the ``bigData`` collection, set:
172+
173+
- The namespace to ``shardDistributionDB.bigData``,
174+
- The lower bound to :bsontype:`MinKey`,
175+
- The upper bound to :bsontype:`MaxKey`,
176+
- The zone to ``FLASH``
177+
178+
.. code-block:: javascript
179+
180+
sh.updateZoneKeyRange(
181+
"shardDistributionDB.bigData",
182+
{ "_id" : MinKey },
183+
{ "_id" : MaxKey },
184+
"FLASH"
185+
)
186+
187+
For the ``manyIndexes`` collection, set:
188+
189+
- The namespace to ``shardDistributionDB.manyIndexes``,
190+
- The lower bound to :bsontype:`MinKey`,
191+
- The upper bound to :bsontype:`MaxKey`,
192+
- The zone to ``HI_RAM``
193+
194+
.. code-block:: javascript
195+
196+
sh.updateZoneKeyRange(
197+
"shardDistributionDB.manyIndexes",
198+
{ "_id" : MinKey },
199+
{ "_id" : MaxKey },
200+
"HI_RAM"
201+
)
202+
203+
.. step:: Shard the collections.
204+
205+
To shard both collections (``bigData`` and ``manyIndexes``),
206+
specify a :ref:`shard key <shard-key>` of ``{ _id: "hashed" }``.
207+
208+
Run the following commands:
209+
210+
.. code-block:: javascript
211+
212+
sh.shardCollection(
213+
"shardDistributionDB.bigData", { _id: "hashed" }
214+
)
215+
216+
sh.shardCollection(
217+
"shardDistributionDB.manyIndexes", { _id: "hashed" }
218+
)
219+
220+
.. step:: Review the changes.
221+
222+
To view chunk distribution and shard zones, use the
223+
:method:`sh.status()` method:
224+
225+
.. code-block:: javascript
226+
227+
sh.status()
228+
229+
The next time the :ref:`balancer <sharding-balancing>` runs, it
230+
splits chunks where necessary and migrates chunks across the
231+
shards, respecting the configured zones. The amount of time the
232+
balancer takes to complete depends on several factors, including
233+
the number of shards, available memory, and
234+
:abbr:`IOPS (Input/Output Operations Per Second)`.
235+
236+
When balancing finishes:
237+
238+
- Chunks for documents in the ``manyIndexes`` collection reside on
239+
``shard0`` and ``shard2``.
240+
241+
- Chunks for documents in the ``bigData`` collection reside on
242+
``shard1`` and ``shard2``.
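
One way to check this end state is to compare each chunk's shard against the shards in the collection's zone. The following is a minimal sketch in plain JavaScript. The helper ``findMisplacedChunks`` and the ``{ ns, shard }`` record shape are assumptions for illustration, a simplified form of what you could collect by hand from the ``sh.status()`` output, and the sample records are made up.

```javascript
// Shards allowed to hold each collection after balancing, per this tutorial.
const allowedShards = {
  "shardDistributionDB.bigData": ["shard1", "shard2"], // FLASH zone
  "shardDistributionDB.manyIndexes": ["shard0", "shard2"], // HI_RAM zone
};

// Hypothetical helper: return the chunk records that sit on a shard outside
// the zone assigned to their collection. Namespaces without a zone are ignored.
function findMisplacedChunks(chunkRecords) {
  return chunkRecords.filter(({ ns, shard }) => {
    const allowed = allowedShards[ns];
    return allowed !== undefined && !allowed.includes(shard);
  });
}

// Made-up sample data; the last record is deliberately on the wrong shard.
const sampleChunks = [
  { ns: "shardDistributionDB.bigData", shard: "shard1" },
  { ns: "shardDistributionDB.bigData", shard: "shard2" },
  { ns: "shardDistributionDB.manyIndexes", shard: "shard0" },
  { ns: "shardDistributionDB.manyIndexes", shard: "shard1" },
];

console.log(findMisplacedChunks(sampleChunks));
```

An empty result means every listed chunk is on a shard in its collection's zone. With the sample data, the single ``manyIndexes`` chunk on ``shard1`` is reported.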
243+
244+
Learn More
245+
----------
246+
247+
To learn more about sharding and balancing, see the following pages:
248+
249+
- :ref:`sharding-data-partitioning`
250+
- :ref:`index-type-hashed`
251+
- :ref:`sharding-manage-zones`
252+
- :ref:`sharding-shards`
