diff --git a/draft/core/write-operations.txt b/draft/core/write-operations.txt index 6058d6ca7f1..7fb9951bf7f 100644 --- a/draft/core/write-operations.txt +++ b/draft/core/write-operations.txt @@ -10,17 +10,15 @@ Synopsis Operations ---------- -The :doc:`/crud` section of this manual contains specific -documentation for the major classes of write operations for MongoDB -databases. Read the following pages for additional examples and -documentation: +The :doc:`/crud` section of this manual describes the major classes of +write operations for MongoDB databases: -:doc:`/applications/create` -:doc:`/applications/delete` -:doc:`/applications/update` +- :doc:`/applications/create` +- :doc:`/applications/update` +- :doc:`/applications/delete` -Also consider the following methods in the :program:`mongo` JavaScript -shell that allow you to write or change data in a MongoDB database. +The following methods in the :program:`mongo` JavaScript shell allow you +to write or change data in a MongoDB database. - :method:`db.collection.insert()` - :method:`db.collection.update()` @@ -29,37 +27,179 @@ shell that allow you to write or change data in a MongoDB database. - :method:`db.collection.remove()` - :method:`db.collection.delete()` -Consider the documentation for your client library or :doc:`driver +See the documentation for your client library or :doc:`driver ` for more information on how to access this functionality from within your application. -Write Concern and Write Safety ------------------------------- +.. index:: write concern +.. _write-operations-write-concern: -.. todo:: import and tweak section from the replica-set page. When we - publish this document we'll have to do a quick deletion/reduction - of the replica-set section, but during the editorial process the - content can be duplicated. +Write Concern +------------- + +.. todo add note about all drivers after `date` will have w:1 write concern for all operations by default. 
+ +The :term:`write concern` option allows you to configure return +confirmation for some or all write operations. + +By default, when a :term:`client` sends a write operation to a database +server, the operation returns without waiting for the write to +complete, and therefore without confirming the success of the +operation. + +To enable write concern and get return status on a write operation, use +the :dbcommand:`getLastError` command. + +By default, the command confirms that the :program:`mongod` instance +received the write operation and has committed the write +operation to the in-memory representation of the database. This provides +a simple and low-latency level of write concern and allows your +application to detect situations where the :program:`mongod` instance +becomes inaccessible, as well as insertion errors such as :ref:`duplicate key +errors `. + +You can modify the level of write concern returned by +:dbcommand:`getLastError` by issuing the command with one or both of +the following options: + +- ``j`` or "journal" option. + + In addition to the default confirmation provided by + :dbcommand:`getLastError`, this option confirms that the + :program:`mongod` instance has written the data to the on-disk + journal. This ensures that the data is durable if :program:`mongod` or + the server itself crashes or shuts down unexpectedly. + +- ``w`` option. This applies only to :term:`replica sets `. + + This option confirms that the write operation has replicated to a + specified number of replica set members. You can specify a number + of members or specify ``majority`` to ensure that the write propagates + to a majority of set members. The following ensures the operation has + replicated to two members: + + .. code-block:: javascript + + db.runCommand( { getLastError: 1, w: 2 } ) + + The default value of ``w`` is ``1``. 
+ + If you specify a ``w`` value greater than the number of available + non-:term:`arbiter` replica set members, the operation will block + until those members become available. This could cause the operation + to block forever. To specify a timeout threshold for the + :dbcommand:`getLastError` operation, use the ``wtimeout`` argument. + +Many drivers have a write concern that automatically issues +:dbcommand:`getLastError` after write operations to ensure the +operations complete. + +Write concern provides confirmation of write operations but can take +longer and is not required in all applications. Consider the following +operations: + +.. code-block:: javascript + + db.runCommand( { getLastError: 1, w: "majority" } ) + db.getLastErrorObj("majority") + +These equivalent :dbcommand:`getLastError` operations ensure that write +operations return only after a write operation has replicated to a +majority of the members of a replica set. + +You can configure default :dbcommand:`getLastError` behavior for a +replica set. Use the :data:`settings.getLastErrorDefaults` setting in +the :doc:`replica set configuration `. +For instance: + +.. code-block:: javascript + + cfg = rs.conf() + cfg.settings = {} + cfg.settings.getLastErrorDefaults = {w: "majority", j: true} + rs.reconfig(cfg) + +When the new configuration is active, the :dbcommand:`getLastError` +operation waits for the write operation to complete on a majority of the +set members before returning. Specifying ``j: true`` makes +:dbcommand:`getLastError` wait for a complete commit of the operations +to the journal before returning. + +The :data:`getLastErrorDefaults` setting only affects :dbcommand:`getLastError` +commands with *no* other arguments. + +.. note:: + + Use of inappropriate write concern can lead to :ref:`rollbacks + ` in the case of :ref:`replica set failover + `. Always ensure that your operations have + specified the required write concern for your application. 
+ +For more information, see :ref:`replica-set-write-concern`. + +.. index:: read preference +.. index:: slaveOk +.. _write-operations-bulk-insert: Bulk Inserts ------------ -:issue:`SERVER-2395` +A bulk insert allows MongoDB to distribute the write performance penalty +when inserting a large number of documents at once. Bulk +inserts let you pass multiple documents to the :method:`insert()` method at +once. All write concern options apply to bulk inserts. + +If you insert data without write concern, the bulk insert gain might be +insignificant. But if you insert data with write concern configured, +bulk insert can bring significant performance gains by distributing the +penalty over the group of inserts. + +Bulk inserts are often used with :term:`sharded collections ` and are more effective when the collection is already +populated and MongoDB has already determined the key distribution. +Otherwise MongoDB needs time to learn and determine the distribution. + +If the collection is not populated, you can avoid the learning time by +predefining key ranges, as described in +:ref:`sharding-administration-pre-splitting`. + +When you perform bulk inserts, you can import in parallel by sending +inserts to multiple :program:`mongos` instances. + +To distribute data *during* bulk inserts or if the cluster becomes +uneven, see :ref:`Migrating Chunks +`. -.. todo:: import the best content from: http://www.mongodb.org/display/DOCS/Bulk+Inserts sl - split between this section and the sharded clusters section. +If possible, consider using bulk inserts to insert event data. + +For more information see :ref:`write-operations-sharded-clusters` and +:doc:`/administration/import-export`. Indexing -------- -.. todo:: short section on the impact of indexes and index maintenance - on write operations. +After every insert, update, or delete operation, MongoDB updates not +only a collection but *every* index associated with the collection. 
+Therefore, every index on a collection adds some amount of +write-performance penalty. + +In general, the performance gains that indexes realize for read +operations are worth the insertion penalty. But if your application is +write-heavy, be careful when creating new indexes. + +For more information, see :doc:`/source/applications/indexes`. Isolation --------- -- atomicity -- :doc:`/tutorial/perform-two-phase-commits` +All operations inside of a MongoDB document are atomic. An update +operation may modify more than one field of a document, at more than one +level of nesting, in a single operation that will either succeed or fail and +cannot leave the document in an in-between state. + +For more information see :doc:`Isolated write operations +` and +:doc:`/tutorial/perform-two-phase-commits`. Architecture ------------ @@ -67,5 +207,74 @@ Architecture Replica Sets ~~~~~~~~~~~~ +If you are performing a large data ingestion or bulk load operation that +requires a large number of writes to the primary, the secondaries may +not be able to read the oplog fast enough to keep up with changes. +Setting some level of write concern can slow the overall progress of the +batch but will prevent the secondaries from falling too far behind. + +To prevent the secondaries from lagging, use write concern so that MongoDB performs a safe +write (i.e. calls :dbcommand:`getLastError`) after every 100, 1,000, or +other designated number of operations. This provides an opportunity for +secondaries to catch up with the primary. Using safe writes, even in +batches, can impact write throughput; however, calling +:dbcommand:`getLastError` prevents the secondaries from falling too +far behind the primary. + +For more information see :ref:`replica-set-write-concern`, +:ref:`replica-set-oplog-sizing`, :ref:`replica-set-oplog`, and +:ref:`replica-set-procedure-change-oplog-size`. + +.. 
_write-operations-sharded-clusters: + Sharded Clusters ~~~~~~~~~~~~~~~~ + +In a :term:`sharded cluster`, a given write operation goes to a +particular :term:`shard` and :term:`chunk` in the cluster. Write +performance is affected by a number of factors, including the number of +writes and the key ranges for the chunks. + +If you insert many documents in rapid succession, MongoDB initially +directs writes to a single chunk, which can affect performance. + +If your shard key increases monotonically, the system +adjusts the metadata to keep the cluster balanced, but at any given time ``t`` all +inserts go to a single shard, which is undesirable if the insert rate is +extremely high. To avoid this, consider using a shard key that is not +increasing in value. For example, in some cases you could reverse all the +bits of your shard key, which preserves the information while avoiding +the increasing sequence of values. + +Note that :term:`BSON` :term:`ObjectIds ` increase in this way. You might wish at +generation time to reverse the bits of the ObjectIds, or swap the first +and last 16-bit words, to "shuffle" the inserts. Alternatively you might +use UUIDs instead (but check that your UUID generator does not generate +increasing UUIDs consistently or you would get the same behavior). + +Shard key values that are strictly increasing are fine if the insert +volume is within the range that a single shard can process at a given +point in time. + +.. example:: The following example, in C++, swaps the leading and + trailing 16-bit words of generated ObjectIds so that they are no + longer monotonically increasing. + + .. 
code-block:: none + + using namespace mongo; + OID make_an_id() { + OID x = OID::gen(); + const unsigned char *p = x.getData(); + swap( (unsigned short&) p[0], (unsigned short&) p[10] ); + return x; + } + + void foo() { + // create an object + BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" ); + // now we might insert o into a sharded collection... + } + +For more information, see :doc:`/administration/sharding` and +:ref:`write-operations-bulk-insert`. diff --git a/source/applications/replication.txt b/source/applications/replication.txt index 4b549f1071c..1e87f89fa3b 100644 --- a/source/applications/replication.txt +++ b/source/applications/replication.txt @@ -20,8 +20,8 @@ This document describes those options and their implications. .. _write-concern: .. _replica-set-write-concern: -Write Concern -------------- +Write Concern for Replica Sets +------------------------------ When a :term:`client` sends a write operation to a database server, the operation returns without waiting for the operation to succeed or @@ -125,8 +125,8 @@ commands with *no* other arguments. .. _replica-set-read-preference: .. _slaveOk: -Read Preference ---------------- +Read Preference for Replica Sets +-------------------------------- Read preference describes how MongoDB clients route read operations to :term:`secondary` members of a :term:`replica set`.