@@ -770,8 +770,11 @@ to pre-splitting.
 .. index:: bulk insert
 .. _sharding-bulk-inserts:
 
-Bulk Inserts and Sharding
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Bulk Insert Strategies
+~~~~~~~~~~~~~~~~~~~~~~
+
+.. todo Consider moving to the administrative guide as it's of an applied nature,
+   or create an applications document for sharding
 
 .. todo link the words "bulk insert" to the bulk insert topic when it's
    published
@@ -788,53 +791,43 @@ the following:
   :program:`mongos` instances. If the collection is empty, pre-split
   first, as described in :ref:`sharding-administration-pre-splitting`.
 
-Monotonically Increasing Shard Key Values
-`````````````````````````````````````````
-
-If your shard key monotonically increases during an insert then all the
-inserts will go to the last chunk in the collection. The system will
-adjust the metadata to keep balance, but at a given time ``t`` all
-writes will be going to a single shard, which is undesirable if insert
-rate is extremely large. A large insert is one in which the insert
-volume is beyond the range that a single shard can process at a given
-point in time. Increasing values are fine if the insert volume is within
-the range the shard can process.
-
-To avoid sending more writes than a shard can process, use a shard key
-that is not increasing in value. For example in some cases you could
-reverse all the bits of your shard key, which preserves information
-while avoiding the increasing sequence of values.
-
-:term:`BSON` :term:`ObjectIds <ObjectId>` are one case of a value that
-monotonically increases during an insert. If you use :term:`ObjectId` as
-a shard key, then you can do either of the following at generation time
-to more evenly distribute inserts based on this property:
-
-- Reverse the bits of the ObjectIds, or
-- Swap the first and last 16-bit words, to "shuffle" the inserts.
-- Use UUIDs instead, but check that your UUID generator does not
-  generate consistent increasing UUIDs, which would cause the same
-  behavior.
-
-.. example:: The following example, in C++, swaps the leading and
-   trailing 16-bit word of object IDs generated so that they are no
-   longer monotonically increasing.
-
-   .. code-block:: none
-
-      using namespace mongo;
-      OID make_an_id() {
-        OID x = OID::gen();
-        const unsigned char *p = x.getData();
-        swap( (unsigned short&) p[0], (unsigned short&) p[10] );
-        return x;
-      }
-
-      void foo() {
-        // create an object
-        BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
-        // now we might insert o into a sharded collection...
-      }
+- If your shard key monotonically increases during an insert then all
+  the inserts will go to the last chunk in the collection, which is
+  undesirable if the insert volume is beyond the range that a single
+  shard can process at a given point in time.
+
+  If the insert volume exceeds that range, and if you can't avoid
+  picking a monotonically increasing shard key, then you can do either
+  of the following at generation time to more evenly distribute inserts:
+
+  - Reverse all the bits of your shard key, which preserves information
+    while avoiding the increasing sequence of values.
+  - Swap the first and last 16-bit words, to "shuffle" the inserts.
+
+  .. example:: The following example, in C++, swaps the leading and
+     trailing 16-bit word of :term:`BSON` :term:`ObjectIds <ObjectId>`
+     generated so that they are no longer monotonically increasing.
+
+     .. code-block:: cpp
+
+        using namespace mongo;
+        OID make_an_id() {
+          OID x = OID::gen();
+          const unsigned char *p = x.getData();
+          swap( (unsigned short&) p[0], (unsigned short&) p[10] );
+          return x;
+        }
+
+        void foo() {
+          // create an object
+          BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
+          // now we might insert o into a sharded collection...
+        }
+
+For information on choosing a shard key, see :ref:`sharding-shard-key`
+and see :ref:`Shard Key Internals <sharding-internals-shard-keys>` (in
+particular, :ref:`sharding-internals-operations-and-reliability` and
+:ref:`sharding-internals-choose-shard-key`).
 
 .. index:: balancing; operations
 .. _sharding-balancing-operations:
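
The added text names two ways to break up an increasing key sequence: reversing all the bits of the key, and swapping the first and last 16-bit words. The sketch below illustrates both in standalone C++, without the MongoDB driver. The names ``reverse_bits64``, ``Id12``, and ``swap_outer_words`` are hypothetical and are not part of any MongoDB API; the 64-bit integer key and the plain 12-byte array are assumptions made to keep the example self-contained.

```cpp
#include <array>
#include <cstdint>
#include <utility>

// Hypothetical helper (not a MongoDB API): reverse the bit order of a
// 64-bit key. Consecutive inputs differ in their lowest bits, so their
// reversed forms differ in the highest bits and land far apart.
std::uint64_t reverse_bits64(std::uint64_t v) {
    std::uint64_t r = 0;
    for (int i = 0; i < 64; ++i) {
        r = (r << 1) | (v & 1u);  // move the lowest bit of v onto r
        v >>= 1;
    }
    return r;
}

// Assumption: a 12-byte identifier held as a plain byte array.
using Id12 = std::array<unsigned char, 12>;

// Hypothetical helper: swap the first 16-bit word (bytes 0-1) with the
// last 16-bit word (bytes 10-11). Works on a copy, byte by byte, so no
// pointer casts are involved.
Id12 swap_outer_words(Id12 id) {
    std::swap(id[0], id[10]);
    std::swap(id[1], id[11]);
    return id;
}
```

Both transformations are their own inverse, so applying the same function again recovers the original value. This is one way to read the statement that reversing the bits "preserves information".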