Skip to content

Commit 6b910d4

Browse files
author
Sam Kleinman
committed
merge/edits: DOCS-306
2 parents dea325e + 519432a commit 6b910d4

File tree

1 file changed

+104
-26
lines changed

1 file changed

+104
-26
lines changed

source/administration/sharding.txt

Lines changed: 104 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -407,41 +407,51 @@ stop the processes comprising the ``mongodb0`` shard.
407407
Chunk Management
408408
----------------
409409

410-
This section describes various operations on
411-
:term:`chunks <chunk>` in :term:`shard clusters <shard cluster>`. In
412-
most cases MongoDB automates these processes; however, in some cases,
413-
particularly when you're setting up a shard cluster, you may
414-
need to create and manipulate chunks directly.
410+
This section describes various operations on :term:`chunks <chunk>` in
411+
:term:`shard clusters <shard cluster>`. MongoDB automates these
412+
processes; however, in some cases, particularly when you're setting up
413+
a shard cluster, you may need to create and manipulate chunks
414+
directly.
415415

416416
.. _sharding-procedure-create-split:
417417

418418
Splitting Chunks
419419
~~~~~~~~~~~~~~~~
420420

421-
Normally, MongoDB splits a :term:`chunk` following inserts or updates
422-
when a chunk exceeds the :ref:`chunk size <sharding-chunk-size>`.
421+
Normally, MongoDB splits a :term:`chunk` following inserts when a
422+
chunk exceeds the :ref:`chunk size <sharding-chunk-size>`. Recently
423+
split chunks may be moved immediately to a new shard if
424+
:program:`mongos` predicts future insertions will benefit from the
425+
move.
426+
427+
The MongoDB treats all chunks the same, whether split manually or
428+
automatically by the system.
429+
430+
.. warning::
431+
432+
You cannot merge or combine chunks once you have split them.
423433

424434
You may want to split chunks manually if:
425435

426-
- you have a large amount of data in your cluster that is *not* split,
427-
as is the case after creating a shard cluster with existing data.
436+
- you have a large amount of data in your cluster and very few
437+
:term:`chunks <chunk>`,
438+
as is the case after creating a shard cluster from existing data.
428439

429440
- you expect to add a large amount of data that would
430441
initially reside in a single chunk or shard.
431442

432443
.. example::
433444

434-
You plan to insert a large amount of data as the result of an
435-
import process with :term:`shard key` values between ``300`` and
436-
``400``, *but* all values of your shard key between ``250`` and
437-
``500`` are within a single chunk.
445+
You plan to insert a large amount of data with :term:`shard key`
446+
values between ``300`` and ``400``, *but* all values of your shard
447+
keys are between ``250`` and ``500`` are in a single chunk.
438448

439449
Use :method:`sh.status()` to determine the current chunks ranges across
440450
the cluster.
441451

442-
To split chunks manually, use either the :method:`sh.splitAt()` or
443-
:method:`sh.splitFind()` helpers in the :program:`mongo` shell.
444-
These helpers wrap the :dbcommand:`split` command.
452+
To split chunks manually, use the :dbcommand:`split` command with
453+
operators: ``middle`` and ``find``. The equivalent shell helpers are
454+
:method:`sh.splitAt()` or :method:`sh.splitFind()`.
445455

446456
.. example::
447457

@@ -453,28 +463,96 @@ These helpers wrap the :dbcommand:`split` command.
453463
sh.splitFind( { "zipcode": 63109 } )
454464

455465
:method:`sh.splitFind()` will split the chunk that contains the *first* document returned
456-
that matches this query into two equal components. MongoDB will split
457-
the chunk so that documents that have half of the shard keys in will
458-
be in one chunk and the documents that have other half of the shard
459-
keys will be a second chunk. The query in :method:`sh.splitFind()` need
466+
that matches this query into two equally sized chunks. The query in :method:`sh.splitFind()` need
460467
not contain the shard key, though it almost always makes sense to
461468
query for the shard key in this case, and including the shard key will
462469
expedite the operation.
463470

464-
However, the location of the document that this query finds with
465-
respect to the other documents in the chunk does not affect how the
466-
chunk splits.
467-
468471
Use :method:`sh.splitAt()` to split a chunk in two using the queried
469472
document as the partition point:
470473

471474
.. code-block:: javascript
472475

473476
sh.splitAt( { "zipcode": 63109 } )
474477

475-
.. warning::
478+
However, the location of the document that this query finds with
479+
respect to the other documents in the chunk does not affect how the
480+
chunk splits.
476481

477-
You cannot merge or combine chunks once you have split them.
482+
.. _sharding-administration-pre-splitting:
483+
.. _sharding-administration-create-chunks:
484+
485+
Create Chunks (Pre-Splitting)
486+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
487+
488+
In most situations a :term:`shard cluster` will create and distribute
489+
chunks automatically without user intervention. However, in a limited
490+
number of use profiles, MongoDB cannot create enough chunks or
491+
distribute data fast enough to support required throughput. Consider
492+
the following scenarios:
493+
494+
- you must partition an existing data collection that resides on a
495+
single shard.
496+
497+
- you must ingest a large volume of data into a shard cluster that
498+
isn't balanced, or where the ingestion of data will lead to an
499+
imbalance of data.
500+
501+
This can arise in an initial data loading, or in a case where you
502+
must insert a large volume of data into a single chunk, as is the
503+
case when you must insert at the beginning or end of the chunk
504+
range, as is the case for monotonically increasing or decreasing
505+
shard keys.
506+
507+
Preemptively splitting chunks increases cluster throughput for these
508+
operations, by reducing the overhead of migrating chunks that hold
509+
data during the write operation. MongoDB only creates splits after an
510+
insert operation, and can only migrate a single chunk at a time. Chunk
511+
migrations are resource intensive and further complicated by large
512+
write volume to the migrating chunk.
513+
514+
To create and migrate chunks manually, use the following procedure:
515+
516+
#. Split empty chunks in your collection by manually performing
517+
:dbcommand:`split` command on chunks.
518+
519+
.. example::
520+
521+
To create chunks for documents in the ``myapp.users``
522+
collection, using the ``email`` field as the :term:`shard key`,
523+
use the following operation in the :program:`mongo` shell:
524+
525+
.. code-block:: javascript
526+
527+
for ( var x=97; x<97+26; x++ ){
528+
for( var y=97; y<97+26; y+=6 ) {
529+
var prefix = String.fromCharCode(x) + String.fromCharCode(y);
530+
db.runCommand( { split : "myapp.users" , middle : { email : prefix } } );
531+
}
532+
}
533+
534+
This assumes a collection size of 100 million documents.
535+
536+
#. Migrate chunks manually using the :dbcommand:`moveChunk` command:
537+
538+
.. example::
539+
540+
To migrate all of the manually created user profiles evenly,
541+
putting each prefix chunk on the next shard from the other, run
542+
the following commands in the mongo shell:
543+
544+
.. code-block:: javascript
545+
546+
var shServer = [ "sh0.example.net", "sh1.example.net", "sh2.example.net", "sh3.example.net", "sh4.example.net" ];
547+
for ( var x=97; x<97+26; x++ ){
548+
for( var y=97; y<97+26; y+=6 ) {
549+
var prefix = String.fromCharCode(x) + String.fromCharCode(y);
550+
db.adminCommand({moveChunk : "myapp.users", find : {email : prefix}, to : shServer[(y-97)/6]})
551+
}
552+
}
553+
554+
You can also let the balancer automatically distribute the new
555+
chunks.
478556

479557
.. _sharding-balancing-modify-chunk-size:
480558

0 commit comments

Comments
 (0)