From 2f65486f31af6918ada96fc82a6ce520ee1502b6 Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Fri, 1 Feb 2013 16:45:03 -0500
Subject: [PATCH 1/2] DOCS-983 splitting chunks

---
 source/reference/command/split.txt            | 20 +++++++++++++------
 source/reference/command/splitChunk.txt       | 10 ++++++++++
 .../manage-chunks-in-sharded-cluster.txt      | 18 ++++++++++++++---
 3 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/source/reference/command/split.txt b/source/reference/command/split.txt
index 635d8900321..24390d3ed35 100644
--- a/source/reference/command/split.txt
+++ b/source/reference/command/split.txt
@@ -12,12 +12,20 @@ split
    this command makes it possible for administrators to manually create
    splits.
 
-   .. admonition:: In normal operation there is no need to manually split chunks
-
-      The :term:`balancer` and other sharding infrastructure will
-      automatically create chunks in the course of normal
-      operations. See :doc:`/core/sharded-cluster-internals` for more
-      information.
+   In normal operation there is no need to manually split chunks. The
+   :term:`balancer` and other sharding infrastructure will automatically
+   create chunks in the course of normal operations. See
+   :doc:`/core/sharded-cluster-internals` for more information.
+
+.. warning::
+
+   Be careful when splitting chunks. When you shard a collection that
+   has existing data, MongoDB automatically creates chunks to evenly
+   spread the collection. Performing additional splits requires
+   knowledge of the resulting chunk sizes by numbers of documents and by
+   size. You do not want splits that cause some chunks to be much larger
+   than others. This leads to balancing based on count of chunks, not on
+   their size, which may cause extreme load/data-distribution problems.
 
    Consider the following example:
 
diff --git a/source/reference/command/splitChunk.txt b/source/reference/command/splitChunk.txt
index 181bae6a39c..27428e0c881 100644
--- a/source/reference/command/splitChunk.txt
+++ b/source/reference/command/splitChunk.txt
@@ -10,4 +10,14 @@ splitChunk
    :method:`sh.splitFind()` and :method:`sh.splitAt()` functions in the
    :program:`mongo` shell to access this functionality.
 
+.. warning::
+
+   Be careful when splitting chunks. When you shard a collection that
+   has existing data, MongoDB automatically creates chunks to evenly
+   spread the collection. Performing additional splits requires
+   knowledge of the resulting chunk sizes by numbers of documents and by
+   size. You do not want splits that cause some chunks to be much larger
+   than others. This leads to balancing based on count of chunks, not on
+   their size, which may cause extreme load/data-distribution problems.
+
    .. admin-only.
diff --git a/source/tutorial/manage-chunks-in-sharded-cluster.txt b/source/tutorial/manage-chunks-in-sharded-cluster.txt
index ad4d775384c..8befe7d6d3c 100644
--- a/source/tutorial/manage-chunks-in-sharded-cluster.txt
+++ b/source/tutorial/manage-chunks-in-sharded-cluster.txt
@@ -58,6 +58,16 @@ You may want to split chunks manually if:
   values between ``300`` and ``400``, *but* all values of your shard
   keys are between ``250`` and ``500`` are in a single chunk.
 
+.. warning::
+
+   Be careful when splitting chunks. When you shard a collection that
+   has existing data, MongoDB automatically creates chunks to evenly
+   spread the collection. Performing additional splits requires
+   knowledge of the resulting chunk sizes by numbers of documents and by
+   size. You do not want splits that cause some chunks to be much larger
+   than others. This leads to balancing based on count of chunks, not on
+   their size, which may cause extreme load/data-distribution problems.
+
 Use :method:`sh.status()` to determine the current chunks ranges across
 the cluster.
 
@@ -101,11 +111,13 @@ chunk splits.
 Create Chunks (Pre-Splitting)
 -----------------------------
 
+Pre-splitting lets you preemptively split chunks in an empty collection
+and is used *only* in certain situations.
 In most situations a :term:`sharded cluster` will create and distribute
 chunks automatically without user intervention. However, in a limited
 number of use profiles, MongoDB cannot create enough chunks or
-distribute data fast enough to support required throughput. Consider
-the following scenarios:
+distribute data fast enough to support required throughput.
+For example, if:
 
 - you must partition an existing data collection that resides on a
   single shard.
@@ -123,7 +135,7 @@ the following scenarios:
 Preemptively splitting chunks increases cluster throughput for these
 operations, by reducing the overhead of migrating chunks that hold
 data during the write operation. MongoDB only creates splits after an
-insert operation, and can only migrate a single chunk at a time. Chunk
+insert operation and can migrate only a single chunk at a time. Chunk
 migrations are resource intensive and further complicated by large
 write volume to the migrating chunk.
 
From 1fd9ce858f8843a9e3ec9dabf0d66191e3fac8b7 Mon Sep 17 00:00:00 2001
From: Bob Grabar
Date: Fri, 8 Feb 2013 11:32:19 -0500
Subject: [PATCH 2/2] DOCS-983 splitting chunks review edits

---
 source/includes/warning-splitting-chunks.rst  |  9 +++++++++
 source/reference/command/split.txt            | 18 ++++--------------
 source/reference/command/splitChunk.txt       | 10 +---------
 .../manage-chunks-in-sharded-cluster.txt      | 10 +---------
 4 files changed, 15 insertions(+), 32 deletions(-)
 create mode 100644 source/includes/warning-splitting-chunks.rst

diff --git a/source/includes/warning-splitting-chunks.rst b/source/includes/warning-splitting-chunks.rst
new file mode 100644
index 00000000000..61c613bb2e3
--- /dev/null
+++ b/source/includes/warning-splitting-chunks.rst
@@ -0,0 +1,9 @@
+.. warning::
+
+   Be careful when splitting chunks. When you shard a collection that
+   has existing data, MongoDB automatically creates chunks to evenly
+   spread the collection. Performing additional splits requires
+   knowledge of the resulting chunk sizes by numbers of documents and by
+   size. You do not want splits that cause some chunks to be much larger
+   than others. This leads to balancing based on count of chunks, not on
+   their size, which may cause extreme load/data-distribution problems.
diff --git a/source/reference/command/split.txt b/source/reference/command/split.txt
index 24390d3ed35..e6a264670ea 100644
--- a/source/reference/command/split.txt
+++ b/source/reference/command/split.txt
@@ -12,20 +12,10 @@ split
    this command makes it possible for administrators to manually create
    splits.
 
-   In normal operation there is no need to manually split chunks. The
-   :term:`balancer` and other sharding infrastructure will automatically
-   create chunks in the course of normal operations. See
-   :doc:`/core/sharded-cluster-internals` for more information.
-
-.. warning::
-
-   Be careful when splitting chunks. When you shard a collection that
-   has existing data, MongoDB automatically creates chunks to evenly
-   spread the collection. Performing additional splits requires
-   knowledge of the resulting chunk sizes by numbers of documents and by
-   size. You do not want splits that cause some chunks to be much larger
-   than others. This leads to balancing based on count of chunks, not on
-   their size, which may cause extreme load/data-distribution problems.
+   In most clusters, MongoDB will manage all chunk creation and
+   distribution operations without manual intervention.
+
+.. include:: /includes/warning-splitting-chunks.rst
 
    Consider the following example:
 
diff --git a/source/reference/command/splitChunk.txt b/source/reference/command/splitChunk.txt
index 27428e0c881..e3f4866210e 100644
--- a/source/reference/command/splitChunk.txt
+++ b/source/reference/command/splitChunk.txt
@@ -10,14 +10,6 @@ splitChunk
    :method:`sh.splitFind()` and :method:`sh.splitAt()` functions in the
    :program:`mongo` shell to access this functionality.
 
-.. warning::
-
-   Be careful when splitting chunks. When you shard a collection that
-   has existing data, MongoDB automatically creates chunks to evenly
-   spread the collection. Performing additional splits requires
-   knowledge of the resulting chunk sizes by numbers of documents and by
-   size. You do not want splits that cause some chunks to be much larger
-   than others. This leads to balancing based on count of chunks, not on
-   their size, which may cause extreme load/data-distribution problems.
+.. include:: /includes/warning-splitting-chunks.rst
 
    .. admin-only.
diff --git a/source/tutorial/manage-chunks-in-sharded-cluster.txt b/source/tutorial/manage-chunks-in-sharded-cluster.txt
index 8befe7d6d3c..a99a75caf6e 100644
--- a/source/tutorial/manage-chunks-in-sharded-cluster.txt
+++ b/source/tutorial/manage-chunks-in-sharded-cluster.txt
@@ -58,15 +58,7 @@ You may want to split chunks manually if:
   values between ``300`` and ``400``, *but* all values of your shard
   keys are between ``250`` and ``500`` are in a single chunk.
 
-.. warning::
-
-   Be careful when splitting chunks. When you shard a collection that
-   has existing data, MongoDB automatically creates chunks to evenly
-   spread the collection. Performing additional splits requires
-   knowledge of the resulting chunk sizes by numbers of documents and by
-   size. You do not want splits that cause some chunks to be much larger
-   than others. This leads to balancing based on count of chunks, not on
-   their size, which may cause extreme load/data-distribution problems.
+.. include:: /includes/warning-splitting-chunks.rst
 
 Use :method:`sh.status()` to determine the current chunks ranges across
 the cluster.