diff --git a/draft/core/document.txt b/draft/core/document.txt index 3e1a764661d..e3ea9118e1d 100644 --- a/draft/core/document.txt +++ b/draft/core/document.txt @@ -36,7 +36,7 @@ MongoDB contexts: of :dbcommand:`collStats` command, and - the :doc:`output ` of the - :dbcommand:`serverStatus` command. + :dbcommand:`serverStatus` command. Structure --------- @@ -58,23 +58,22 @@ the following structure: Having support for the full range of :term:`BSON types`, MongoDB documents may contain field and value pairs where the value can be -another document, an array, an array of documents as well -as the basic types such as ``Double``, ``String``, or ``Date``. +another document, an array, an array of documents as well as the basic +types such as ``Double``, ``String``, and ``Date``. See also +:ref:`document-bson-type-considerations`. -Consider the following document that contains values of varying -types: +Consider the following document that contains values of varying types: .. code-block:: javascript - { - _id: ObjectId("5099803df3f4948bd2f98391"), - name: { first: "Alan", last: "Turing" }, - birth: new Date('Jun 23, 1912'), - death: new Date('Jun 07, 1954'), - contribs: [ "Turing machine", "Turing test", "Turingery" ], - views : NumberLong(1250000), - update : Timestamp(1352237167000, 1) - } + var mydoc = { + _id: ObjectId("5099803df3f4948bd2f98391"), + name: { first: "Alan", last: "Turing" }, + birth: new Date('Jun 23, 1912'), + death: new Date('Jun 07, 1954'), + contribs: [ "Turing machine", "Turing test", "Turingery" ], + views : NumberLong(1250000) + } The document contains the following fields: @@ -89,20 +88,45 @@ The document contains the following fields: - ``views`` that holds a value of *NumberLong* type. -- ``update`` that holds a value of *Timestamp* type. +To determine the type of fields, the :program:`mongo` shell provides: + +- The ``instanceof`` operator to check if a field is a specific type. + +- The ``typeof`` operator to return the type of a field. 
+ +Consider the following examples that demonstrate the use of the +``instanceof`` and the ``typeof`` operators: + +- The following operation tests whether the ``_id`` field is of type + ``ObjectId``: + + .. code-block:: javascript + + mydoc._id instanceof ObjectId + + The operation returns ``true``. + +- The following operation returns the type of the ``_id`` field: + + .. code-block:: javascript -Types ------ + typeof mydoc._id + + Rather than the specific ``ObjectId`` type, the operation returns the + generic ``object`` type. + +Document Types +-------------- .. _documents-records: Record Documents ~~~~~~~~~~~~~~~~ -Most documents in MongoDB are records in :term:`collections` which -store data from users' applications. +Most documents in MongoDB in :term:`collections ` store +data from users' applications. -These documents have the following limitations: +These documents have the following attributes: - .. include:: /includes/fact-document-max-size.rst @@ -132,7 +156,7 @@ The following document specifies a record in a collection: The document contains the following fields: -- ``_id``, which must hold a unique value. +- ``_id``, which must hold a unique value and is *immutable*. - ``name`` that holds another *document*. This sub-document contains the fields ``first`` and ``last``, which both hold *strings*. @@ -143,20 +167,54 @@ The document contains the following fields: - ``awards`` that holds an *array of documents*. +Take the following considerations for the ``_id`` field: + +- In documents, the ``_id`` field is always indexed for regular + collections. + +- The ``_id`` field may contain values of any BSON data type other than + an array. + +- Although it is common to assign ``ObjectId`` values to ``_id`` + fields, if your objects have a natural unique identifier, consider + using that for the value of ``_id`` to save space and to avoid an + additional index. + +- To set the ``_id`` field to: + + - ``ObjectId``, see the :doc:`ObjectId ` + documentation. 
+ + - A sequence number, see the + :doc:`/tutorial/create-an-auto-incrementing-field` tutorial. + + - UUID, your application must generate the UUID itself. For + efficiency, store the UUID as a BSON BinData type to reduce the + UUID values and their respective keys in the _id index by half. If, + however, you know space and speed will not be an issue, you can + store as a hex string. + + .. note:: + + Different driver implementations of the UUID + serialization/deserialization logic may not be fully compatible + with each other. See your specific :api:`driver documentations + <>` for details on the level of interoperability. + .. _documents-query-selectors: -Query Specification Documents -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Query Documents +~~~~~~~~~~~~~~~ -Query selector documents specify the conditions that determine which -records to select for read, update, and delete operations. You can use -field and value expressions to specify the equality condition and -:doc:`query operator ` expressions to specify -additional conditions. Refer to :doc:`read `, -:doc:`update `, and :doc:`delete -` pages for more examples. +Query documents specify the conditions that determine which records to +select for read, update, and delete operations. You can use field and +value expressions to specify the equality condition and :doc:`query +operator ` expressions to specify additional +conditions. Refer to :doc:`read `, :doc:`update +`, and :doc:`delete ` pages +for more examples. -Consider the following examples of query selector documents: +Consider the following examples of query documents: - The following document specifies the query criteria where ``_id`` is equal to ``1``: @@ -196,24 +254,24 @@ for MongoDB to return, remove, or update, as in the following: .. 
code-block:: javascript - db.csbios.find( { _id: 1 } ) - db.csbios.remove( { _id: { $gt: 3 } } ) - db.csbios.update( { _id: 1, name: { first: 'John', last: 'Backus' } }, + db.bios.find( { _id: 1 } ) + db.bios.remove( { _id: { $gt: 3 } } ) + db.bios.update( { _id: 1, name: { first: 'John', last: 'Backus' } }, ... ) .. _documents-update-actions: -Update Specification Documents -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Update Documents +~~~~~~~~~~~~~~~~ -The update action documents specify the data modifications to perform -during an :method:`update() ` operation to -modify existing records in a collection. You can use :ref:`update -operators ` to specify the exact actions to perform -on the document fields. See the :ref:`update operators -` page for the available update operators and syntax. +The update documents specify the data modifications to perform during +an :method:`update() ` operation to modify +existing records in a collection. You can use :ref:`update operators +` to specify the exact actions to perform on the +document fields. See the :ref:`update operators ` +page for the available update operators and syntax. -Consider the update specification document example: +Consider the update document example: .. code-block:: javascript @@ -221,6 +279,8 @@ Consider the update specification document example: $push: { awards: { award: 'IBM Fellow', year: '1963', by: 'IBM' } + } + } When passed as an argument to the :method:`update() ` method, the update actions document: @@ -235,26 +295,31 @@ When passed as an argument to the :method:`update() .. code-block:: javascript - db.csbios.update( { _id: 1 }, - { $set: { 'name.middle': 'Warner' }, - $push: { awards: { award: 'IBM Fellow', - year: '1963', - by: 'IBM' } } } - ) + db.bios.update( + { _id: 1 }, + { $set: { 'name.middle': 'Warner' }, + $push: { awards: { + award: 'IBM Fellow', + year: '1963', + by: 'IBM' + } + } + } + ) .. _documents-index: .. 
_document-index-specification: -Index Specification Documents -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Index Documents +~~~~~~~~~~~~~~~ Indexes optimize a number of key :doc:`read ` -and :doc:`write ` operations. Index -specification documents describe the fields to index on during the -:ref:`index creation `. See :doc:`indexes +and :doc:`write ` operations. Index documents +describe the fields to index on during the :doc:`index creation +`. See :doc:`indexes ` for an overview of indexes. -The index specification documents contain field and value pairs, in +The index documents contain field and value pairs, in the following form: .. code-block:: javascript @@ -279,12 +344,12 @@ the index to create: .. code-block:: javascript - db.csbios.ensureIndex( { _id: 1, 'name.last': 1 } ) + db.bios.ensureIndex( { _id: 1, 'name.last': 1 } ) .. _documents-sort-order: -Sort Order Specification Documents -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Sort Order Documents +~~~~~~~~~~~~~~~~~~~~ The sort order documents specify the order of documents that a :method:`query() ` returns. Pass sort order @@ -292,7 +357,7 @@ specification documents as an argument to the :method:`sort() ` method. See the :method:`sort() ` page for more information on sorting. -The sort order specifications contain field and value pairs, in the following +The sort order documents contain field and value pairs, in the following form: .. code-block:: javascript @@ -317,4 +382,181 @@ method, the sort order document sorts the results of the .. code-block:: javascript - db.csbios.find().sort( { 'name.last': 1, 'name.first': 1 } ) + db.bios.find().sort( { 'name.last': 1, 'name.first': 1 } ) + +.. _document-mongodb-type-considerations: + +MongoDB Type Considerations +--------------------------- + +The following MongoDB types require special consideration: + +.. _document-bson-type-object-id: + +ObjectId +~~~~~~~~ + +ObjectId is small, most likely unique, fast to generate, and ordered. 
+It consists of 12 bytes, where the first 4 bytes are the timestamp of the +ObjectId's creation. Refer to the :doc:`ObjectId ` +documentation for more information regarding the type and its benefits. + +.. _document-bson-type-string: + +String +~~~~~~ + +BSON strings are UTF-8. In general, drivers for each programming language +convert from the language's string format to UTF-8 when serializing and +deserializing BSON. In most cases, this means you can effectively store +most international characters in BSON strings. +[#sort-string-internationalization]_ In addition, MongoDB +:operator:`regex` queries support UTF-8 in the regex string. + +.. [#sort-string-internationalization] With internationalization, + :method:`sort() ` on a string will be reasonably + correct; however, because internally :method:`sort() ` + uses the C++ ``strcmp`` API, the sort order will not be *fully* correct. + +.. _document-bson-type-timestamp: + +Timestamp +~~~~~~~~~ + +BSON Timestamp is a special type for *internal* MongoDB use and is +**not** associated with the regular :ref:`document-bson-type-date` +type. A BSON Timestamp value is a 64-bit value where: + +- the first 32 bits are a ``time_t`` value (seconds since the Unix epoch) + +- the second 32 bits are an incrementing ``ordinal`` for operations + within a given second. + +On a single :program:`mongod` instance, BSON Timestamp values are guaranteed to +be unique. + +In replication, the oplog's ``ts`` field, which holds the *OpTime*, or +the operation timestamp, is of type BSON Timestamp. + +Consider the following examples of creating BSON Timestamp values: + +.. note:: + + The BSON Timestamp type is for *internal* MongoDB use. The following + examples only illustrate the Timestamp constructor and do **not** + represent the typical use of the type. + +- If a BSON Timestamp value is constructed using the empty constructor + (i.e. 
``new Timestamp()``), the value depends on the position of the + field in the document: + + - If the field with the BSON Timestamp value is the *first* field of + the document or the *second* field if the ``_id`` field is the + first, the Timestamp value will automatically be set to a unique + value. + + .. code-block:: javascript + + db.bios.insert( { _id: 9, last_updated: new Timestamp() } ) + + The ``last_updated`` field has its ``time_t`` value + automatically set to ``1352874017000`` and its ``ordinal`` value set to + ``1``: + + .. code-block:: javascript + + { "_id" : 9, "last_updated" : Timestamp(1352874017000, 1) } + + - If the field with the BSON Timestamp value is neither the *first* field + of the document nor the *second* field after a leading ``_id`` field, + the Timestamp value will be the empty Timestamp + value with ``time_t`` set to ``0`` and ``ordinal`` set to ``0``: + + .. code-block:: javascript + + db.bios.insert( { views: NumberLong(0), last_updated: new Timestamp() } ) + + The Timestamp value is the empty Timestamp value (i.e. ``Timestamp(0, 0)``): + + .. code-block:: javascript + + { + "_id" : ObjectId("50a33daf88d113a4ae94a959"), + "views" : NumberLong(0), + "last_updated" : Timestamp(0, 0) + } + +.. versionchanged:: 2.1 + :program:`mongo` shell displays the Timestamp value with the wrapper: + + .. code-block:: javascript + + Timestamp(, ) + + Prior to version 2.1, the :program:`mongo` shell displayed the + Timestamp value as a document: + + .. code-block:: javascript + + { t : , i : } + +.. _document-bson-type-date: + +Date +~~~~ + +BSON Date is a 64-bit integer that represents the number of +milliseconds since the Unix epoch (Jan 1, 1970). The `official BSON +specification `_ refers to the +BSON Date type as the *UTC datetime*. + +.. versionchanged:: 2.0 BSON Date type is signed. [#unsigned-date]_ + Negative values represent dates before 1970. 
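As a small illustration of the signed millisecond count described above, the following sketch uses only the standard JavaScript ``Date`` object, which the :program:`mongo` shell also exposes. The variable names are illustrative assumptions and are not part of the original text.

```javascript
// Plain JavaScript; all variable names here are hypothetical.
// A Date wraps a signed count of milliseconds since the Unix epoch.
var epoch = new Date(0);        // the epoch itself
var justBefore = new Date(-1);  // one millisecond before the epoch

// getTime() returns the underlying millisecond count.
var epochMillis = epoch.getTime();        // 0
var beforeMillis = justBefore.getTime();  // -1

// toISOString() renders the value in UTC.
var epochIso = epoch.toISOString();       // "1970-01-01T00:00:00.000Z"
var beforeIso = justBefore.toISOString(); // "1969-12-31T23:59:59.999Z"
```

A negative count is therefore a valid value that sorts before the epoch, which matches the signed interpretation described for version 2.0 and later.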
+ +Consider the following examples of BSON Date: + +- Construct a Date using the ``new Date()`` constructor in the + :program:`mongo` shell: + + .. code-block:: javascript + + var mydate1 = new Date() + +- Construct a Date using the ISODate() constructor in the + :program:`mongo` shell: + + .. code-block:: javascript + + var mydate2 = ISODate() + +- Print the Date value as string: + + .. code-block:: javascript + + mydate1.toString() + +- Print the month portion of the Date value; months start at zero for January : + + .. code-block:: javascript + + mydate1.getMonth() + +For more Date methods, see `JavaScript Date API +`_ +documentation. + +.. [#unsigned-date] Prior to version 2.0, Date values were incorrectly + interpreted as *unsigned* integers, adversely affecting sorts, range + queries, and indexes on Date fields. Because indexes are not + recreated when upgrading, please re-index if you created an index on + Date values with earlier versions, and dates before 1970 are + relevant to your application. + +.. note:: + + :program:`mongo` shell provides help for the ``ObjectId()``, + ``BinData()``, and ``HexData()`` shell classes: + + .. code-block:: javascript + + help misc diff --git a/source/core/object-id.txt b/source/core/object-id.txt index 7b308539e6e..a66fd4c9660 100644 --- a/source/core/object-id.txt +++ b/source/core/object-id.txt @@ -25,8 +25,8 @@ additional benefits: - you can access the timestamp of the ObjectId's creation, using the :method:`getTimestamp() ` method. -- Sorting on an ``_id`` field that stores ObjectId values, is - equivalent to sorting by creation time. +- Sorting on an ``_id`` field that stores ObjectId values is equivalent + to sorting by creation time. .. _core-object-id-class: @@ -144,3 +144,9 @@ Consider the example uses of the ``ObjectId()`` class in the .. code-block:: javascript 507f191e810c19729de860ea + +The :program:`mongo` shell provides help for the ``ObjectId()``: + +.. 
code-block:: javascript + + help misc diff --git a/source/includes/fact-document-field-name-restrictions.rst b/source/includes/fact-document-field-name-restrictions.rst index 9d044fd50f8..d3c445ef3af 100644 --- a/source/includes/fact-document-field-name-restrictions.rst +++ b/source/includes/fact-document-field-name-restrictions.rst @@ -2,7 +2,8 @@ names: - The field name ``_id`` is reserved for use as a primary key; its - value must be unique in the collection + value must be unique in the collection, is immutable, and may be of + any type other than an array. - The field names **cannot** start with the ``$`` character. diff --git a/source/tutorial.txt b/source/tutorial.txt index 6a32a29bb27..0415bdc8b74 100644 --- a/source/tutorial.txt +++ b/source/tutorial.txt @@ -39,6 +39,7 @@ Development Patterns tutorial/perform-two-phase-commits tutorial/enforce-unique-keys-for-sharded-collections tutorial/aggregation-examples + tutorial/create-an-auto-incrementing-field .. index:: tutorials; application development .. index:: application tutorials diff --git a/source/tutorial/create-an-auto-incrementing-field.txt b/source/tutorial/create-an-auto-incrementing-field.txt new file mode 100644 index 00000000000..f98524bbb77 --- /dev/null +++ b/source/tutorial/create-an-auto-incrementing-field.txt @@ -0,0 +1,206 @@ +========================================== +Create an Auto-Incrementing Sequence Field +========================================== + +.. default-domain:: mongodb + +Synopsis +-------- + +In documents, the field name ``_id`` is reserved for use as a primary +key; its value must be unique in the collection. This document +describes how to create an increasing sequence number to assign to the +``_id`` field using the following: + +- :ref:`auto-increment-counters-collection` + +- :ref:`auto-increment-optimistic-loop` + +.. 
warning:: + + Generally in MongoDB, you would not use an auto-increment pattern + for the ``_id`` field, or other fields, as this does not scale up + well on large database clusters. Instead you would use an + :term:`ObjectId `. + +.. _auto-increment-counters-collection: + +A Counters Collection +~~~~~~~~~~~~~~~~~~~~~ + +A separate ``counters`` collection tracks the *last* number sequence +used. The ``_id`` field contains the sequence name and the ``seq`` +contains the last value of the sequence. + +1. Insert into the ``counters`` collection, the initial value for the ``userid``: + + .. code-block:: javascript + + db.counters.insert( + { + _id: "userid", + seq: 0 + } + ) + +#. Create a ``getNextSequence`` function that accepts a ``name`` of the + sequence. The function uses the :method:`findAndModify() + `. + + .. code-block:: javascript + + db.users.insert( + { + _id: getNextSequence("userid"), + name: "Sarah C." + } + ) + + db.users.insert( + { + _id: getNextSequence("userid"), + name: "Bob D." + } + ) + + You can verify the results with :method:`find() `: + + .. code-block:: javascript + + db.users.find() + + The ``_id`` fields contain incrementing sequence values: + + .. code-block:: javascript + + { + _id : 1, + name : "Sarah C." + } + { + _id : 2, + name : "Bob D." + } + +.. _auto-increment-optimistic-loop: + +Optimistic Loop +~~~~~~~~~~~~~~~ + +The Optimistic Loop calculates the incremented ``_id`` value and +attempts to insert a document with the calculated ``_id`` value. If the +insert is successful, end the loop. Otherwise, iterate through the loop +recalculating the ``_id`` value until the insert is successful. + +#. Create a function named ``insertDocument`` that performs the "insert + if not present" loop. The function wraps the ``insert() + `` method and takes a ``doc`` and a + ``targetCollection`` arguments. + + .. 
code-block:: javascript + + function insertDocument(doc, targetCollection) { + + while (1) { + + var cursor = targetCollection.find( {}, { _id: 1 } ).sort( { _id: -1 } ).limit(1); + + var seq = cursor.hasNext() ? cursor.next()._id + 1 : 1; + + doc._id = seq; + + targetCollection.insert(doc); + + var err = db.getLastErrorObj(); + + if( err && err.code ) { + if( err.code == 11000 /* dup key */ ) + continue; + else + print( "unexpected error inserting data: " + tojson( err ) ); + } + + break; + } + } + + The ``while (1)`` loop performs the following actions: + + - Query the ``targetCollection`` for the document with the maximum + ``_id`` value. + + - Determine the next sequence value for ``_id``: + + - Add ``1`` to the returned ``_id`` value if the returned cursor + points to a document; else + + - Set to ``1`` if the returned cursor points to no document. + + - For the ``doc`` to insert, set its ``_id`` field to the calculated + sequence value ``seq``. + + - Insert the ``doc`` into the ``targetCollection``. + + - If the insert operation fails with a duplicate key error, loop again. + Otherwise, if the insert operation encounters some other error or + if the operation succeeds, break out of the loop. + +#. Use the ``insertDocument()`` function to perform an insert: + + .. code-block:: javascript + + var myCollection = db.users2; + + insertDocument( + { + name: "Grace H." + }, + myCollection + ); + + insertDocument( + { + name: "Ted R." + }, + myCollection + ) + + You can verify the results with :method:`find() `: + + .. code-block:: javascript + + db.users2.find() + + The ``_id`` fields contain incrementing sequence values: + + .. code-block:: javascript + + { + _id: 1, + name: "Grace H." + } + { + _id : 2, + "name" : "Ted R." + } + +A high rate of concurrent inserts on the collection can cause the +while-loop to iterate many times.
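The counters-collection procedure earlier in this tutorial refers to a ``getNextSequence`` function that uses ``findAndModify()``, but the function body does not appear in this text. The following is one possible sketch, not necessarily the tutorial's own definition. It assumes the ``counters`` documents have the ``_id`` and ``seq`` fields shown earlier. The optional second parameter and the ``makeFakeCounters`` helper are hypothetical additions that let the sketch run outside the :program:`mongo` shell.

```javascript
// Sketch only. In the mongo shell, call getNextSequence("userid") and the
// function falls back to db.counters. The second parameter is a hypothetical
// addition so the logic can be exercised without a server.
function getNextSequence(name, counters) {
  var coll = counters || db.counters;

  // Atomically increment seq and return the updated document.
  var ret = coll.findAndModify({
    query: { _id: name },
    update: { $inc: { seq: 1 } },
    new: true
  });

  return ret.seq;
}

// Hypothetical in-memory stand-in for a counters collection. It only mimics
// the single findAndModify() call used above and is not a MongoDB API.
function makeFakeCounters(docs) {
  return {
    findAndModify: function (opts) {
      var doc = docs[opts.query._id];
      doc.seq += opts.update.$inc.seq;
      return { _id: doc._id, seq: doc.seq };
    }
  };
}

var fakeCounters = makeFakeCounters({ userid: { _id: "userid", seq: 0 } });
var first = getNextSequence("userid", fakeCounters);
var second = getNextSequence("userid", fakeCounters);
```

Because the increment and the read happen in one ``findAndModify()`` call, two callers should not receive the same sequence value, which is the property the optimistic loop has to obtain by retrying on duplicate key errors.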