@@ -407,41 +407,51 @@ stop the processes comprising the ``mongodb0`` shard.
407
407
Chunk Management
408
408
----------------
409
409
410
- This section describes various operations on
411
- :term:`chunks <chunk>` in :term:` shard clusters <shard cluster>`. In
412
- most cases MongoDB automates these processes; however, in some cases,
413
- particularly when you're setting up a shard cluster, you may
414
- need to create and manipulate chunks directly.
410
+ This section describes various operations on :term:`chunks <chunk>` in
411
+ :term:`shard clusters <shard cluster>`. MongoDB automates these
412
+ processes; however, in some cases, particularly when you're setting up
413
+ a shard cluster, you may need to create and manipulate chunks
414
+ directly.
415
415
416
416
.. _sharding-procedure-create-split:
417
417
418
418
Splitting Chunks
419
419
~~~~~~~~~~~~~~~~
420
420
421
- Normally, MongoDB splits a :term:`chunk` following inserts or updates
422
- when a chunk exceeds the :ref:`chunk size <sharding-chunk-size>`.
421
+ Normally, MongoDB splits a :term:`chunk` following inserts when a
422
+ chunk exceeds the :ref:`chunk size <sharding-chunk-size>`. Recently
423
+ split chunks may be moved immediately to a new shard if
424
+ :program:`mongos` predicts future insertions will benefit from the
425
+ move.
426
+
427
+ The MongoDB treats all chunks the same, whether split manually or
428
+ automatically by the system.
429
+
430
+ .. warning::
431
+
432
+ You cannot merge or combine chunks once you have split them.
423
433
424
434
You may want to split chunks manually if:
425
435
426
- - you have a large amount of data in your cluster that is *not* split,
427
- as is the case after creating a shard cluster with existing data.
436
+ - you have a large amount of data in your cluster and very few
437
+ :term:`chunks <chunk>`,
438
+ as is the case after creating a shard cluster from existing data.
428
439
429
440
- you expect to add a large amount of data that would
430
441
initially reside in a single chunk or shard.
431
442
432
443
.. example::
433
444
434
- You plan to insert a large amount of data as the result of an
435
- import process with :term:`shard key` values between ``300`` and
436
- ``400``, *but* all values of your shard key between ``250`` and
437
- ``500`` are within a single chunk.
445
+ You plan to insert a large amount of data with :term:`shard key`
446
+ values between ``300`` and ``400``, *but* all values of your shard
447
+ keys are between ``250`` and ``500`` are in a single chunk.
438
448
439
449
Use :method:`sh.status()` to determine the current chunks ranges across
440
450
the cluster.
441
451
442
- To split chunks manually, use either the :method:`sh.splitAt()` or
443
- :method:`sh.splitFind()` helpers in the :program:`mongo` shell.
444
- These helpers wrap the :dbcommand:`split` command .
452
+ To split chunks manually, use the :dbcommand:`split` command with
453
+ operators: ``middle`` and ``find``. The equivalent shell helpers are
454
+ :method:`sh.splitAt()` or :method:`sh.splitFind()` .
445
455
446
456
.. example::
447
457
@@ -453,28 +463,96 @@ These helpers wrap the :dbcommand:`split` command.
453
463
sh.splitFind( { "zipcode": 63109 } )
454
464
455
465
:method:`sh.splitFind()` will split the chunk that contains the *first* document returned
456
- that matches this query into two equal components. MongoDB will split
457
- the chunk so that documents that have half of the shard keys in will
458
- be in one chunk and the documents that have other half of the shard
459
- keys will be a second chunk. The query in :method:`sh.splitFind()` need
466
+ that matches this query into two equally sized chunks. The query in :method:`sh.splitFind()` need
460
467
not contain the shard key, though it almost always makes sense to
461
468
query for the shard key in this case, and including the shard key will
462
469
expedite the operation.
463
470
464
- However, the location of the document that this query finds with
465
- respect to the other documents in the chunk does not affect how the
466
- chunk splits.
467
-
468
471
Use :method:`sh.splitAt()` to split a chunk in two using the queried
469
472
document as the partition point:
470
473
471
474
.. code-block:: javascript
472
475
473
476
sh.splitAt( { "zipcode": 63109 } )
474
477
475
- .. warning::
478
+ However, the location of the document that this query finds with
479
+ respect to the other documents in the chunk does not affect how the
480
+ chunk splits.
476
481
477
- You cannot merge or combine chunks once you have split them.
482
+ .. _sharding-administration-pre-splitting:
483
+ .. _sharding-administration-create-chunks:
484
+
485
+ Create Chunks (Pre-Splitting)
486
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
487
+
488
+ In most situations a :term:`shard cluster` will create and distribute
489
+ chunks automatically without user intervention. However, in a limited
490
+ number of use profiles, MongoDB cannot create enough chunks or
491
+ distribute data fast enough to support required throughput. Consider
492
+ the following scenarios:
493
+
494
+ - you must partition an existing data collection that resides on a
495
+ single shard.
496
+
497
+ - you must ingest a large volume of data into a shard cluster that
498
+ isn't balanced, or where the ingestion of data will lead to an
499
+ imbalance of data.
500
+
501
+ This can arise in an initial data loading, or in a case where you
502
+ must insert a large volume of data into a single chunk, as is the
503
+ case when you must insert at the beginning or end of the chunk
504
+ range, as is the case for monotonically increasing or decreasing
505
+ shard keys.
506
+
507
+ Preemptively splitting chunks increases cluster throughput for these
508
+ operations, by reducing the overhead of migrating chunks that hold
509
+ data during the write operation. MongoDB only creates splits after an
510
+ insert operation, and can only migrate a single chunk at a time. Chunk
511
+ migrations are resource intensive and further complicated by large
512
+ write volume to the migrating chunk.
513
+
514
+ To create and migrate chunks manually, use the following procedure:
515
+
516
+ #. Split empty chunks in your collection by manually performing
517
+ :dbcommand:`split` command on chunks.
518
+
519
+ .. example::
520
+
521
+ To create chunks for documents in the ``myapp.users``
522
+ collection, using the ``email`` field as the :term:`shard key`,
523
+ use the following operation in the :program:`mongo` shell:
524
+
525
+ .. code-block:: javascript
526
+
527
+ for ( var x=97; x<97+26; x++ ){
528
+ for( var y=97; y<97+26; y+=6 ) {
529
+ var prefix = String.fromCharCode(x) + String.fromCharCode(y);
530
+ db.runCommand( { split : "myapp.users" , middle : { email : prefix } } );
531
+ }
532
+ }
533
+
534
+ This assumes a collection size of 100 million documents.
535
+
536
+ #. Migrate chunks manually using the :dbcommand:`moveChunk` command:
537
+
538
+ .. example::
539
+
540
+ To migrate all of the manually created user profiles evenly,
541
+ putting each prefix chunk on the next shard from the other, run
542
+ the following commands in the mongo shell:
543
+
544
+ .. code-block:: javascript
545
+
546
+ var shServer = [ "sh0.example.net", "sh1.example.net", "sh2.example.net", "sh3.example.net", "sh4.example.net" ];
547
+ for ( var x=97; x<97+26; x++ ){
548
+ for( var y=97; y<97+26; y+=6 ) {
549
+ var prefix = String.fromCharCode(x) + String.fromCharCode(y);
550
+ db.adminCommand({moveChunk : "myapp.users", find : {email : prefix}, to : shServer[(y-97)/6]})
551
+ }
552
+ }
553
+
554
+ You can also let the balancer automatically distribute the new
555
+ chunks.
478
556
479
557
.. _sharding-balancing-modify-chunk-size:
480
558
0 commit comments