-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Migrate splitting chunks - DOCS-304 #140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
8eb8c5d
861d112
0e967e2
e673848
236163e
f11feef
06dcd61
b5942b7
519432a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -125,7 +125,7 @@ more detail or use the following procedure as a quick starting point: | |
replicaSetName/<seed1>,<seed2>,<seed3> | ||
|
||
For example, if the name of the replica set is "``repl0``", then | ||
your :func:`sh.addShard` command would be: | ||
your :method:`sh.addShard` command would be: | ||
|
||
.. code-block:: javascript | ||
|
||
|
@@ -136,7 +136,7 @@ more detail or use the following procedure as a quick starting point: | |
MongoDB enables sharding on a per-database basis. This is only a | ||
meta-data change and will not redistribute your data. To enable | ||
sharding for a given database, use the :dbcommand:`enableSharding` | ||
command or the :func:`sh.enableSharding()` shell function. | ||
command or the :method:`sh.enableSharding()` shell function. | ||
|
||
.. code-block:: javascript | ||
|
||
|
@@ -165,7 +165,7 @@ more detail or use the following procedure as a quick starting point: | |
collections must belong to a database for which you have enabled | ||
sharding. When you shard a collection, you also choose the shard | ||
key. To shard a collection, run the :dbcommand:`shardCollection` | ||
command or the :func:`sh.shardCollection()` shell helper. | ||
command or the :method:`sh.shardCollection()` shell helper. | ||
|
||
.. code-block:: javascript | ||
|
||
|
@@ -223,7 +223,7 @@ procedure: | |
|
||
#. First, you need to tell the cluster where to find the individual | ||
shards. You can do this using the :dbcommand:`addShard` command or | ||
the :func:`sh.addShard()` helper: | ||
the :method:`sh.addShard()` helper: | ||
|
||
.. code-block:: javascript | ||
|
||
|
@@ -266,7 +266,7 @@ procedure: | |
following form: the replica set name, followed by a forward | ||
slash, followed by a comma-separated list of seeds for the | ||
replica set. For example, if the name of the replica set is | ||
"myapp1", then your :func:`sh.addShard` command might resemble: | ||
"myapp1", then your :method:`sh.addShard` command might resemble: | ||
|
||
.. code-block:: javascript | ||
|
||
|
@@ -312,7 +312,7 @@ The procedure to remove a shard is as follows: | |
shard name when you first ran the :dbcommand:`addShard` command. If not, | ||
you can find out the name of the shard by running the | ||
:dbcommand:`listshards` or :dbcommand:`printShardingStatus` | ||
commands or the :func:`sh.status()` shell helper. | ||
commands or the :method:`sh.status()` shell helper. | ||
|
||
The following examples will remove a shard named ``mongodb0`` from the cluster. | ||
|
||
|
@@ -404,41 +404,51 @@ stop the processes comprising the ``mongodb0`` shard. | |
Chunk Management | ||
---------------- | ||
|
||
This section describes various operations on | ||
:term:`chunks <chunk>` in :term:`shard clusters <shard cluster>`. In | ||
most cases MongoDB automates these processes; however, in some cases, | ||
particularly when you're setting up a shard cluster, you may | ||
need to create and manipulate chunks directly. | ||
This section describes various operations on :term:`chunks <chunk>` in | ||
:term:`shard clusters <shard cluster>`. MongoDB automates these | ||
processes; however, in some cases, particularly when you're setting up | ||
a shard cluster, you may need to create and manipulate chunks | ||
directly. | ||
|
||
.. _sharding-procedure-create-split: | ||
|
||
Splitting Chunks | ||
~~~~~~~~~~~~~~~~ | ||
|
||
Normally, MongoDB splits a :term:`chunk` following inserts or updates | ||
when a chunk exceeds the :ref:`chunk size <sharding-chunk-size>`. | ||
Normally, MongoDB splits a :term:`chunk` following inserts when a | ||
chunk exceeds the :ref:`chunk size <sharding-chunk-size>`. Recently | ||
split chunks may be moved immediately to a new shard if | ||
:program:`mongos` predicts future insertions will benefit from the | ||
move. | ||
|
||
The MongoDB treats all chunks the same, whether split manually or | ||
automatically by the system. | ||
|
||
.. warning:: | ||
|
||
You cannot merge or combine chunks once you have split them. | ||
|
||
You may want to split chunks manually if: | ||
|
||
- you have a large amount of data in your cluster that is *not* split, | ||
as is the case after creating a shard cluster with existing data. | ||
- you have a large amount of data in your cluster and very few | ||
:term:`chunks <chunk>`, | ||
as is the case after creating a shard cluster from existing data. | ||
|
||
- you expect to add a large amount of data that would | ||
initially reside in a single chunk or shard. | ||
|
||
.. example:: | ||
|
||
You plan to insert a large amount of data as the result of an | ||
import process with :term:`shard key` values between ``300`` and | ||
``400``, *but* all values of your shard key between ``250`` and | ||
``500`` are within a single chunk. | ||
You plan to insert a large amount of data with :term:`shard key` | ||
values between ``300`` and ``400``, *but* all values of your shard | ||
keys are between ``250`` and ``500`` are in a single chunk. | ||
|
||
Use :func:`sh.status()` to determine the current chunks ranges across | ||
the cluster. | ||
To determine the current chunk ranges across the cluster, use | ||
:method:`sh.status()` or :method:`db.printShardingStatus()`. | ||
|
||
To split chunks manually, use either the :func:`sh.splitAt()` or | ||
:func:`sh.splitFind()` helpers in the :program:`mongo` shell. | ||
These helpers wrap the :dbcommand:`split` command. | ||
To split chunks manually, use the :dbcommand:`split` command with | ||
operators: ``middle`` and ``find``. The equivalent shell helpers are | ||
:method:`sh.splitAt()` or :method:`sh.splitFind()`. | ||
|
||
.. example:: | ||
|
||
|
@@ -449,29 +459,97 @@ These helpers wrap the :dbcommand:`split` command. | |
|
||
sh.splitFind( { "zipcode": 63109 } ) | ||
|
||
:func:`sh.splitFind()` will split the chunk that contains the *first* document returned | ||
that matches this query into two equal components. MongoDB will split | ||
the chunk so that documents that have half of the shard keys in will | ||
be in one chunk and the documents that have other half of the shard | ||
keys will be a second chunk. The query in :func:`sh.splitFind()` need | ||
not contain the shard key, though it almost always makes sense to | ||
query for the shard key in this case, and including the shard key will | ||
expedite the operation. | ||
:method:`sh.splitFind()` will split the chunk containing the queried | ||
document into two equal sized chunks, dividing the chunk using the | ||
balancer algorithm. The :method:`sh.splitFind()` query may not be based | ||
on the shard key, but it makes sense to use the shard key, and | ||
including the shard key will expedite the operation. | ||
|
||
Use :method:`sh.splitAt()` to split a chunk in two using the queried | ||
document as the partition point: | ||
|
||
.. code-block:: javascript | ||
|
||
sh.splitAt( { "zipcode": 63109 } ) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we can change the form of this admonition, but I think it's a crucial point, and it needs to remain in the document. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it's still in the document. this line has been moved up in the document to line 427. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. right. sorry. |
||
However, the location of the document that this query finds with | ||
respect to the other documents in the chunk does not affect how the | ||
chunk splits. | ||
|
||
Use :func:`sh.splitAt()` to split a chunk in two using the queried | ||
document as the partition point: | ||
Pre-Split | ||
~~~~~~~~~ | ||
|
||
.. code-block:: javascript | ||
Splitting chunks will improve shard cluster performance when importing | ||
data such as: | ||
|
||
sh.splitAt( { "zipcode": 63109 } ) | ||
- migrating data from another system to a shard cluster | ||
|
||
.. warning:: | ||
- performing a full shard cluster restore from back up | ||
|
||
You cannot merge or combine chunks once you have split them. | ||
In such cases, data import to an empty MongoDB shard cluster can | ||
become slower than to a replica set. The reason for this is how | ||
MongoDB writes data and manage chunks in a shard cluster. | ||
|
||
MongoDB inserts data into a chunk until it becomes large and splits on | ||
the same shard. If the balancer notices a chunk imbalance between | ||
shards, a migration process will begin to evenly distribute chunks. | ||
|
||
Migrating chunks between shards is extremely resource intensive as | ||
shard members must notify, copy, update, and delete chunks between | ||
each other and the configuration database. With a high volume import, | ||
this migration process can reduce system performance so that imports | ||
cannot occur because of migrations. | ||
|
||
To improve import performance, manually split and migrate chunks in an | ||
empty shard cluster beforehand. This allows the shard cluster to only | ||
write import data instead of trying to manage migrations and write | ||
data. | ||
|
||
To prepare your shard cluster for data import, split and migrate | ||
empty chunks. | ||
|
||
#. Split empty chunks in your collection by manually performing | ||
:dbcommand:`split` command on chunks. | ||
|
||
.. example:: | ||
|
||
To pre-split chunks for 100 million user profiles sharded by | ||
email address for 5 shards, run the following commands in the | ||
mongo shell: | ||
|
||
.. code-block:: javascript | ||
|
||
for ( var x=97; x<97+26; x++ ){ | ||
for( var y=97; y<97+26; y+=6 ) { | ||
var prefix = String.fromCharCode(x) + String.fromCharCode(y); | ||
db.runCommand( { split : "myapp.users" , middle : { email : prefix } } ); | ||
} | ||
} | ||
|
||
#. Move chunks to different shard manually using the | ||
:dbcommand:`moveChunk` command. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. syntax error. |
||
.. example:: | ||
|
||
To migrate all the chunks created for 100 million user profiles | ||
evenly, putting each prefix chunk on the next shard from the | ||
other , run the following commands in the mongo shell: | ||
|
||
.. code-block:: javascript | ||
|
||
var shServer = [ "sh0.example.net", "sh1.example.net", "sh2.example.net", "sh3.example.net", "sh4.example.net" ]; | ||
for ( var x=97; x<97+26; x++ ){ | ||
for( var y=97; y<97+26; y+=6 ) { | ||
var prefix = String.fromCharCode(x) + String.fromCharCode(y); | ||
db.adminCommand({moveChunk : "myapp.users", find : {email : prefix}, to : shServer[(y-97)/6]}) | ||
} | ||
} | ||
|
||
Optionally, you can also let the balancer automatically | ||
redistribute chunks in your shard cluster. | ||
|
||
When empty chunks are distributed, data import can occur with multiple | ||
:program:`mongos` instances, improving overall import performance. | ||
|
||
.. _sharding-balancing-modify-chunk-size: | ||
|
||
|
@@ -496,7 +574,7 @@ To modify the chunk size, use the following procedure: | |
|
||
use config | ||
|
||
#. Issue the following :func:`save() <db.collection.save()>` | ||
#. Issue the following :method:`save() <db.collection.save()>` | ||
operation: | ||
|
||
.. code-block:: javascript | ||
|
@@ -640,7 +718,7 @@ Here's what this tells you: | |
|
||
.. note:: | ||
|
||
Use the :func:`sh.isBalancerRunning()` helper in the | ||
Use the :method:`sh.isBalancerRunning()` helper in the | ||
:program:`mongo` shell to determine if the balancer is | ||
running, as follows: | ||
|
||
|
@@ -668,7 +746,7 @@ be able to migrate chunks: | |
|
||
use config | ||
|
||
#. Use an operation modeled on the following example :func:`update() | ||
#. Use an operation modeled on the following example :method:`update() | ||
<db.collection.update()>` operation to modify the balancer's | ||
window: | ||
|
||
|
@@ -722,10 +800,10 @@ all migration, use the following procedure: | |
|
||
If a balancing round is in progress, the system will complete the | ||
current round before the balancer is officially disabled. After | ||
disabling, you can use the :func:`sh.getBalancerState()` shell | ||
disabling, you can use the :method:`sh.getBalancerState()` shell | ||
function to determine whether the balancer is in fact disabled. | ||
|
||
The above process and the :func:`sh.setBalancerState()` helper provide a | ||
The above process and the :method:`sh.setBalancerState()` helper provide a | ||
wrapper on the following process, which may be useful if you need to | ||
run this operation from a driver that does not have helper functions: | ||
|
||
|
@@ -1090,6 +1168,6 @@ operation: | |
for the duration of the backup procedure. | ||
|
||
Confirm that the balancer is not active using the | ||
:func:`sh.getBalancerState()` method before starting a backup | ||
:method:`sh.getBalancerState()` method before starting a backup | ||
operation. When the backup procedure is complete you can reactivate | ||
the balancer process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-flowing these paragraphs makes reviewing more difficult.
avoid doing this unless you making a significant substantive change, otherwise it wastes time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 2nd line edited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I see now.
I think asserting:
Is rhetorically confusing. The original isn't perfect by any means, but let's revert unless there's another good solution given that this is orthogonal to the project of getting the page redirected.