DOCS-831 shard keys and gridfs #443

Merged
merged 2 commits into from
Dec 12, 2012
35 changes: 35 additions & 0 deletions source/core/sharding-internals.txt
@@ -534,6 +534,41 @@ a document with a ``msg`` field that holds the string
If the application is instead connected to a :program:`mongod`, the
returned document does not include the ``isdbgrid`` string.

Shard GridFS Documents

Sharding GridFS

----------------------

A common way to shard :term:`GridFS` is to configure its collections as follows:

- Do not shard the ``files`` collection, as the keys in this collection do
  not easily lend themselves to even distributions.

  Leaving ``files`` unsharded means that all the file metadata documents
  live on one shard. It is recommended that the shard is a replica set
  with at least three members, for high availability.

- Shard the ``chunks`` collection using a new ``files_id : 1 , n : 1``
  index. You must create this index. Do not use the existing
  ``files_id : 1 , n : 1`` index already created by the drivers.

  The new ``files_id : 1 , n : 1`` index ensures that all chunks of a
  given file live on the same shard, which is safer and allows FileMD5
  hashing.

delete this bullet.

we should say "files collection is often smaller, you can shard or not. not required. if you shard, shard on _id (or any application field, if you want to create unique index, {link to unique index collection})"


To shard the ``chunks`` collection by ``files_id : 1 , n : 1``, issue
commands similar to the following:

.. code-block:: javascript

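   // Create the index that backs the shard key.
   db.fs.chunks.ensureIndex( { files_id : 1 , n : 1 } )

   // Switch to the admin database (use admin) before running shardCollection.
   db.runCommand( { shardCollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )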
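
For readers unfamiliar with the two shard key fields, the following plain JavaScript sketch illustrates how a file's bytes map onto chunk documents: ``files_id`` identifies the file and ``n`` is the position of the chunk within it. The ``toChunkDocs`` helper, the ``"file-A"`` identifier, and the tiny chunk size are hypothetical and chosen only for illustration; this is not a driver API.

```javascript
// Hypothetical helper, not a driver API: split a file's bytes into
// GridFS-style chunk documents keyed by (files_id, n).
function toChunkDocs(filesId, bytes, chunkSize) {
  const docs = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n += 1) {
    docs.push({
      files_id: filesId,   // same value for every chunk of one file
      n: n,                // zero-based sequence number of the chunk
      data: bytes.slice(offset, offset + chunkSize),
    });
  }
  return docs;
}

// A 10-byte "file" with an artificially small chunk size of 4 bytes.
const exampleBytes = new Uint8Array(10);
const exampleDocs = toChunkDocs("file-A", exampleBytes, 4);
console.log(exampleDocs.map((d) => [d.files_id, d.n, d.data.length]));
```

Every document produced for one file shares the same ``files_id`` prefix, so the documents sort next to each other under a ``files_id : 1 , n : 1`` key.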

The default ``files_id`` is an :term:`ObjectId`. Because ObjectId values
ascend over time, all newly inserted GridFS chunks are sent to a single
sharding chunk, and therefore to a single shard. If your write load is
too high for a single server to handle, you may want to shard on a
different key or use a different value for ``_id`` in the ``files``
collection.
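
The effect of an ascending key can be illustrated with a small simulation in plain JavaScript. This is a minimal sketch under stated assumptions: the ``bounds`` table and ``chunkIndexFor`` function are hypothetical stand-ins for range-based routing, and integers stand in for ObjectId values. It is not how :program:`mongos` is implemented.

```javascript
// Hypothetical range table: chunk i holds keys in [bounds[i], bounds[i + 1]).
// -Infinity and Infinity stand in for the minimum and maximum key bounds.
const bounds = [-Infinity, 100, 200, 300, Infinity];

// Return the index of the range that contains the given key.
function chunkIndexFor(key) {
  for (let i = 0; i < bounds.length - 1; i += 1) {
    if (key >= bounds[i] && key < bounds[i + 1]) return i;
  }
  throw new Error("key outside all ranges");
}

// Ascending integer keys, standing in for ObjectId values generated over time.
const insertsPerChunk = [0, 0, 0, 0];
for (let key = 301; key <= 400; key += 1) {
  insertsPerChunk[chunkIndexFor(key)] += 1;
}
console.log(insertsPerChunk); // [ 0, 0, 0, 100 ]
```

Because every new key is larger than all existing split points, each insert falls into the last range, which is why a high write load concentrates on one server.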

.. index:: config database
.. index:: database, config
.. _sharding-internals-config-database: