mongodb · tychoish · Mar 15, 2013 · Feb 26, 2013
diff --git a/source/applications.txt b/source/applications.txt
@@ -57,3 +57,18 @@ The following documents provide patterns for developing application features:
    tutorial/isolate-sequence-of-operations
    tutorial/create-an-auto-incrementing-field
    tutorial/expire-data
+
+Text Search Patterns
+--------------------
+
+The following tutorials provide some patterns for
+text search usage:
+
+.. toctree::
+   :maxdepth: 1
+
+   tutorial/enable-text-search
+   tutorial/search-for-text
+   tutorial/create-text-index-on-multi-language-collection
+   tutorial/return-text-queries-using-only-text-index
+   tutorial/limit-number-of-items-scanned-for-text-search
diff --git a/source/applications/text-search.txt b/source/applications/text-search.txt
@@ -0,0 +1,111 @@
+===========
+Text Search
+===========
+
+.. default-domain:: mongodb
+
+.. versionadded:: 2.4
+
+Overview
+--------
+
+Text search supports the search of string content in documents of a
+collection. Text search introduces a new :ref:`text
+<index-feature-text>` index type and a new :dbcommand:`text` command.
+
+The text search process:
+
+- tokenizes and stems the search term(s) during both the index creation
+  and the text command execution.
+
+- assigns a score to each document that contains the search term in the
+  indexed fields. The score determines the relevance of a document to a
+  given search query.
+
+By default, the :dbcommand:`text` command returns at most the top 100
+matching documents as determined by the scores.
+
+.. _create-text-index:
+
+Create a ``text`` Index
+-----------------------
+
+To perform text search, create a ``text`` index on the field or fields
+whose value is a string or an array of string elements. To create a
+``text`` index, use the :method:`db.collection.ensureIndex()` method
+with a document that contains field and value pairs where the value is
+the string literal ``text``.
+
+.. important::
+
+   - Before you can :ref:`create a text index <create-text-index>` or
+     :ref:`run the text command <text-search-text-command>`, you must
+     manually enable text search. See
+     :doc:`/tutorial/enable-text-search` for information on how to
+     enable the text search feature.
+
+   - Text indexes have significant storage requirements and performance
+     costs. See :ref:`text index feature <index-feature-text>` for more
+     information.
+
+   - .. include:: /includes/fact-text-index-limit-one.rst
+
+The following example creates a ``text`` index on the fields
+``subject`` and ``content``:
+
+.. code-block:: javascript
+
+   db.collection.ensureIndex(
+                              {
+                                subject: "text",
+                                content: "text" 
+                              }
+                            )
+
+This ``text`` index catalogs all string data in the ``subject`` field
+and the ``content`` field, where the field value is either a string or
+an array of string elements.
+
+See :doc:`/core/text-index` for details on the options available when
+creating ``text`` indexes.
+
+You can also combine ``text`` indexes with
+ascending/descending index fields. See:
+
+- :doc:`/tutorial/limit-number-of-items-scanned-for-text-search`
+
+- :doc:`/tutorial/return-text-queries-using-only-text-index`
+
+.. _text-search-text-command:
+
+``text`` Command
+----------------
+
+The :dbcommand:`text` command can search for words and phrases. The
+command matches on the complete stemmed words. For example, if a
+document field contains the word ``blueberry``, a search on the term
+``blue`` will not match the document. However, a search on either
+``blueberry`` or ``blueberries`` will match.
+
+By default, the :dbcommand:`text` command returns the top 100 scoring
+documents in descending order, but you can specify a ``limit`` option
+to change the maximum number of documents to return.
+
+Given a collection with a ``text`` index, use the
+:method:`~db.collection.runCommand()` method to execute the
+:dbcommand:`text` command, as in:
+
+.. code-block:: javascript
+
+   db.collection.runCommand( "text" , { search: <string> } )
+
+For information and examples on various text search patterns, see
+:doc:`/tutorial/search-for-text`.
+
+Text Search Output
+------------------
+
+The :dbcommand:`text` command returns a document that contains the
+result set.
+
+See :ref:`text-search-output` for information on the output.
diff --git a/source/contents.txt b/source/contents.txt
@@ -10,6 +10,7 @@ MongoDB Manual Contents
    security
    crud
    aggregation
+   applications/text-search
    indexes
    replication
    sharding

diff --git a/source/core/indexes.txt b/source/core/indexes.txt
@@ -770,6 +770,52 @@ indexes are not suited for finding the closest documents to a
 particular location, when the closest documents are far away compared
 to bucket size.
 
+.. index:: index; text
+.. index:: text index
+.. _index-feature-text:
+
+``text`` Indexes
+~~~~~~~~~~~~~~~~
+
+.. versionadded:: 2.4
+
+MongoDB provides ``text`` indexes to support :doc:`text search
+</applications/text-search>` on a collection. You can only access the
+``text`` index with the :dbcommand:`text` command.
+
+``text`` indexes are case-insensitive and can include any field that
+contains string data. ``text`` indexes drop language-specific stop
+words (e.g. in English, “the,” “an,” “a,” “and,” etc.) and use simple
+language-specific suffix stemming. See :ref:`text-search-languages` for
+the supported languages.
+
+``text`` indexes have the following storage requirements and
+performance costs:
+
+- Text indexes can be large. They contain one index entry for each
+  unique post-stemmed word in each indexed field for each document
+  inserted.
+
+- Building a ``text`` index is very similar to building a large
+  multi-key index, and will take longer than building a simple ordered
+  (scalar) index on the same data.
+
+- When building a large ``text`` index on an existing collection,
+  ensure that you have a sufficiently-high open file descriptor limit.
+  See the :ref:`recommended settings <oom-killer>`.
+
+- ``text`` indexes will impact insertion throughput because MongoDB
+  must add an index entry for each unique post-stemmed word in each
+  indexed field of each new source document.
+
+- ``text`` indexes do not store phrases or information about the
+  proximity of words in the documents. As a result, phrase queries
+  run much more efficiently when the entire collection fits in
+  RAM.
+
+See :doc:`/applications/text-search` for more information on the text
+search feature.
+
 .. index:: index; limitations
 .. _index-limitations:
 

diff --git a/source/core/text-index.txt b/source/core/text-index.txt
@@ -0,0 +1,186 @@
+:orphan:
+
+==============
+``text`` Index
+==============
+
+.. default-domain:: mongodb
+
+This document provides details on some of the options available when
+creating ``text`` indexes.
+
+Specify a Name for the ``text`` Index
+-------------------------------------
+
+The default name for the index consists of each index field name
+concatenated with ``_text``. Consider the ``text`` index on the fields
+``content``, ``users.comments``, and ``users.profiles``.
+
+.. code-block:: javascript
+
+   db.collection.ensureIndex(
+                              {
+                                content: "text",
+                                "users.comments": "text",
+                                "users.profiles": "text"
+                              } 
+                            )
+
+The default name for the index is:
+
+.. code-block:: javascript
+
+   "content_text_users.comments_text_users.profiles_text"
+
+To avoid creating an index with a name that exceeds the :limit:`index
+name length limit <Index Name Length>`, you can pass the ``name``
+option to the :method:`db.collection.ensureIndex()` method:
+
+.. code-block:: javascript
+
+   db.collection.ensureIndex(
+                              {
+                                content: "text",
+                                "users.comments": "text",
+                                "users.profiles": "text" 
+                              },
+                              {
+                                name: "MyTextIndex"
+                              }
+                            )
+
+.. note::
+
+   To drop the ``text`` index, use the index name. To get the name of
+   an index, use :method:`db.collection.getIndexes()`.
+
+Index All Fields
+----------------
+
+To allow for text search on all fields with string content, use the
+wildcard specifier (``$**``) to index all fields that contain string
+content.  
+
+The following example indexes any string value in the data of every
+field of every document in a collection and names the index ``TextIndex``:
+
+.. code-block:: javascript
+
+   db.collection.ensureIndex(
+                              { "$**": "text" },
+                              { name: "TextIndex" }
+                            )
+
+.. _text-index-default-language:
+
+Specify Languages for Text Index
+--------------------------------
+
+The default language associated with the indexed data determines the
+list of stop words and the rules for the stemmer and tokenizer. The
+default language for the indexed data is ``english``.
+
+To specify a different language, use the ``default_language`` option
+when creating the ``text`` index. See :ref:`text-search-languages` for
+the languages available for ``default_language``.
+
+The following example creates a ``text`` index on the
+``content`` field and sets the ``default_language`` to
+``spanish``:
+
+.. code-block:: javascript
+
+   db.collection.ensureIndex(
+                              { content : "text" },
+                              { default_language: "spanish" }
+                            )
+
+.. seealso::
+
+   :doc:`/tutorial/create-text-index-on-multi-language-collection`
+
+.. _text-index-internals-weights:
+
+Control Results of Text Search with Weights
+-------------------------------------------
+
+By default, the :dbcommand:`text` command returns matching documents
+based on scores, from highest to lowest. For a ``text`` index, the
+*weight* of an indexed field denotes the significance of the field
+relative to the other indexed fields in terms of the score. The score
+calculation for a given word in a document includes the weighted sum of
+the frequency for each of the indexed fields in that document.
+
+The default weight is 1 for the indexed fields. To adjust the weights
+for the indexed fields, include the ``weights`` option in the
+:method:`db.collection.ensureIndex()` method.
+
+.. warning::
+
+   Choose the weights carefully in order to prevent the need to reindex.
+
+A collection ``blog`` has the following documents:
+
+.. code-block:: javascript
+
+   { _id: 1,
+     content: "This morning I had a cup of coffee.",
+     about: "beverage",
+     keywords: [ "coffee" ]
+   }
+
+   { _id: 2,
+     content: "Who doesn't like cake?",
+     about: "food",
+     keywords: [ "cake", "food", "dessert" ]
+   }
+
+To create a ``text`` index with different field weights for the
+``content`` field and the ``keywords`` field, include the ``weights``
+option to the :method:`~db.collection.ensureIndex()` method.
+
+.. code-block:: javascript
+
+   db.blog.ensureIndex(
+                        { 
+                          content: "text",
+                          keywords: "text",
+                          about: "text"
+                        },
+                        {
+                          weights: {
+                                     content: 10,
+                                     keywords: 5,
+                                   },
+                          name: "TextIndex"
+                        }
+                      )
+
+The ``text`` index has the following fields and weights:
+
+- ``content`` has a weight of 10,
+
+- ``keywords`` has a weight of 5, and
+
+- ``about`` has the default weight of 1.
+
+These weights denote the relative significance of the indexed fields to
+each other. For instance, a term match in the ``content`` field has:
+
+- ``2`` times (i.e. ``10:5``) the impact of a term match in the
+  ``keywords`` field and
+
+- ``10`` times (i.e. ``10:1``) the impact of a term match in the
+  ``about`` field.
+
+Tutorials
+---------
+
+The following tutorials offer additional ``text`` index creation
+patterns:
+
+- :doc:`/tutorial/create-text-index-on-multi-language-collection`
+
+- :doc:`/tutorial/limit-number-of-items-scanned-for-text-search`
+
+- :doc:`/tutorial/return-text-queries-using-only-text-index`
diff --git a/source/includes/fact-text-index-limit-one.rst b/source/includes/fact-text-index-limit-one.rst
@@ -0,0 +1 @@
+A collection can have at most **one** ``text`` index.
diff --git a/source/includes/fact-text-search-beta.rst b/source/includes/fact-text-search-beta.rst
@@ -0,0 +1,10 @@
+The :doc:`text search </applications/text-search>` is currently a
+*beta* feature. As a beta feature:
+
+- You need to explicitly enable the feature before :ref:`creating a text
+  index <create-text-index>` or using the :dbcommand:`text` command.
+
+- To enable text search on :doc:`replica sets </core/replication>` and
+  :doc:`sharded clusters </core/sharded-clusters>`, you need to
+  enable it on **each and every** :program:`mongod` for replica
+  sets and on **each and every** :program:`mongos` for sharded clusters.