diff --git a/source/core/sharding-internals.txt b/source/core/sharding-internals.txt
index c2b5ce4af5d..aaaef9e9cfa 100644
--- a/source/core/sharding-internals.txt
+++ b/source/core/sharding-internals.txt
@@ -534,6 +534,41 @@ a document with a ``msg`` field that holds the string
 If the application is instead connected to a :program:`mongod`, the
 returned document does not include the ``isdbgrid`` string.
 
+Shard GridFS Documents
+----------------------
+
+A common way to shard :term:`GridFS` is to configure the shard as follows:
+
+- Do not shard the ``files`` collection, as the keys in this collection do
+  not easily lend themselves to even distributions.
+
+  Leaving ``files`` unsharded means that all the file metadata documents
+  live on one shard. It is recommended that the shard is a replica set
+  with at least three members, for high availability.
+
+- Shard the ``chunks`` collection using a new ``files_id : 1 , n : 1``
+  index. You must create this index. Do not use the existing
+  ``files_id : 1 , n : 1`` index already created by the drivers.
+
+  The new ``files_id : 1 , n : 1`` index ensures that all chunks of a
+  given file live on the same shard, which is safer and allows FileMD5
+  hashing.
+
+  To shard the ``chunks`` collection by ``files_id : 1 , n : 1``, issue
+  commands similar to the following:
+
+  .. code-block:: javascript
+
+     db.fs.chunks.ensureIndex( { files_id : 1 , n : 1 } )
+
+     db.runCommand( { shardcollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )
+
+  The default ``files_id`` is an :term:`ObjectId`. The ``files_id`` is
+  ascending, and all GridFS chunks are sent to a single sharding chunk.
+  If your write load is too high for a single server to handle, you may
+  want to shard on a different key or use a different value for ``_id``
+  in the ``files`` collection.
+
 .. index:: config database
 .. index:: database, config
 .. _sharding-internals-config-database:
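
Note on the ascending ``files_id`` caveat in the last added paragraph: the effect can be illustrated with a small stand-alone model of range-based routing. This is only a sketch, not MongoDB's implementation. The helper names ``chunkIndexFor`` and ``distribute``, the integer keys, and the split points are all hypothetical stand-ins chosen for illustration; real shard keys here would be ``ObjectId`` values.

```javascript
// Hypothetical model of range-based routing (assumption: not MongoDB code).
// `splits` holds ascending split points; chunk i covers the keys from
// splits[i - 1] (inclusive) up to splits[i] (exclusive). The last chunk
// has no upper bound.
function chunkIndexFor(key, splits) {
  let i = 0;
  while (i < splits.length && key >= splits[i]) {
    i += 1;
  }
  return i;
}

// Count how many of the given keys land in each chunk.
function distribute(keys, splits) {
  const counts = new Array(splits.length + 1).fill(0);
  for (const key of keys) {
    counts[chunkIndexFor(key, splits)] += 1;
  }
  return counts;
}

// Made-up split points left behind by earlier data.
const splits = [100, 200, 300];

// Ascending keys, standing in for newly generated ObjectId values.
const ascendingKeys = [301, 302, 303, 304, 305, 306];

// Keys spread over the range, standing in for a non-monotonic files_id.
const spreadKeys = [42, 150, 250, 350, 120, 20];

console.log(distribute(ascendingKeys, splits)); // [ 0, 0, 0, 6 ]
console.log(distribute(spreadKeys, splits));    // [ 2, 2, 1, 1 ]
```

In this model every ascending key falls into the final, open-ended range, which mirrors the concern in the added text that new GridFS chunks go to a single sharding chunk, while the spread keys are divided among all four ranges.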