Time series updates for sharding support and updates / deletes #6085

Merged
33 changes: 5 additions & 28 deletions source/core/timeseries-collections.txt
@@ -79,6 +79,8 @@ command:
}
)

.. _time-series-fields:

When creating a time series collection, specify the following options:

.. list-table::
@@ -95,44 +97,19 @@ When creating a time series collection, specify the following options:

- string

- Required. The name of the field which contains the date in each
time series document. Documents in a time series collection must
have a valid BSON date as the value for the ``timeField``.
- .. include:: /includes/time-series/fact-time-field-description.rst

* - ``timeseries.metaField``

- string

- Optional. The name of the field which contains metadata in each
time series document. The metadata in the specified field should
be data that is used to label a unique series of documents. The
metadata should rarely, if ever, change.

The name of the specified field may not be ``_id`` or the same as
the ``timeseries.timeField``. The field can be of any type.
- .. include:: /includes/time-series/fact-meta-field-description.rst

* - ``timeseries.granularity``

- string

- Optional. Possible values are ``"seconds"``, ``"minutes"``, and
``"hours"``. By default, MongoDB sets the ``granularity`` to
``"seconds"`` for high-frequency ingestion.

Manually set the ``granularity`` parameter to improve performance
by optimizing how data in the time series collection is stored
internally. To select a value for ``granularity``, choose the
closest match to the time span between consecutive incoming
measurements.

If you specify the ``timeseries.metaField``, consider the time
span between consecutive incoming measurements that have the same
unique value for the ``metaField`` field. Measurements often have
the same unique value for the ``metaField`` field if they come
from the same source.

If you do not specify ``timeseries.metaField``, consider the time
span between all measurements that are inserted in the collection.
- .. include:: /includes/time-series/fact-granularity-description.rst

* - ``expireAfterSeconds``

5 changes: 5 additions & 0 deletions source/core/timeseries/timeseries-granularity.txt
@@ -143,3 +143,8 @@ a time. From ``"seconds"`` to ``"minutes"`` or from ``"minutes"`` to
``"hours"``. Other changes are not allowed. If you need to change the
``granularity`` from ``"seconds"`` to ``"hours"``, first increase the
``granularity`` to ``"minutes"`` and then to ``"hours"``.

.. note::

You cannot modify the ``granularity`` of a sharded time series
collection.
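
The one-level-at-a-time rule described in this hunk can be sketched as a small helper that lists the ``collMod`` command documents needed to move between two granularities. The helper name ``granularitySteps`` and the collection name ``weather`` are hypothetical. The sketch runs in plain Node.js and does not check whether the collection is sharded, so the restriction in the note still applies.

```javascript
// Granularity levels, in the only order in which they can be changed.
const LEVELS = ["seconds", "minutes", "hours"];

// Hypothetical helper: returns the collMod command documents needed to raise
// the granularity of `collection` from `from` to `to`, one level per command.
function granularitySteps(collection, from, to) {
  const start = LEVELS.indexOf(from);
  const end = LEVELS.indexOf(to);
  if (start === -1 || end === -1) {
    throw new Error("granularity must be one of: " + LEVELS.join(", "));
  }
  if (end < start) {
    throw new Error("granularity can only be increased");
  }
  // One collMod per level, for example seconds -> minutes -> hours.
  return LEVELS.slice(start + 1, end + 1).map((granularity) => ({
    collMod: collection,
    timeseries: { granularity },
  }));
}

// In mongosh, each step would be run in order, for example:
// granularitySteps("weather", "seconds", "hours").forEach((cmd) => db.runCommand(cmd));
```

Returning command documents instead of running them keeps the sketch independent of a live deployment.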
31 changes: 26 additions & 5 deletions source/core/timeseries/timeseries-limitations.txt
@@ -21,12 +21,23 @@ Constraints

The maximum size of a measurement document is 4 MB.

.. _timeseries-limitations-updates-deletes:

Updates and Deletes
~~~~~~~~~~~~~~~~~~~

:ref:`Time series collections <manual-timeseries-collection>` only
support insert operations and read queries. Updates and manual delete operations
result in an error.
:ref:`Time series collections <manual-timeseries-collection>` support update
and delete operations with limitations. The following operations are not supported
on time series collections:

- :dbcommand:`findAndModify`.
- Updates that modify the ``timeField``.
- Updates with ``multi:false`` that modify the ``metaField``.
- Updates with ``upsert:true``.
- Updates or deletes in multi-document transactions.

MongoDB supports all other update and delete operations. For more information,
see :ref:`<collection-method>`.

To automatically delete old data, :ref:`set up automatic removal (TTL)
<set-up-automatic-removal>`.
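
As an illustration of the restrictions listed above, the sketch below builds ``update`` and ``delete`` command documents that avoid them, and runs a small client-side check against the list. The checker ``findUnsupportedReasons``, the collection name ``weather``, and the field names ``timestamp`` and ``metadata`` are hypothetical. The server performs its own validation. This check handles only operator-style updates and only the per-statement items in the list, not ``findAndModify`` or transactions.

```javascript
// Hypothetical time series settings used for illustration.
const tsOptions = { timeField: "timestamp", metaField: "metadata" };

// True if an operator-style update document touches `field` or a sub-field of it.
function modifiesField(update, field) {
  return Object.values(update).some(
    (paths) =>
      typeof paths === "object" &&
      paths !== null &&
      Object.keys(paths).some((p) => p === field || p.startsWith(field + "."))
  );
}

// Client-side sketch of the per-statement restrictions listed above.
function findUnsupportedReasons(stmt, { timeField, metaField }) {
  const reasons = [];
  if (modifiesField(stmt.u, timeField)) reasons.push("modifies the timeField");
  if (!stmt.multi && modifiesField(stmt.u, metaField)) {
    reasons.push("multi:false update of the metaField");
  }
  if (stmt.upsert) reasons.push("upsert:true");
  return reasons;
}

// An update statement that avoids the listed restrictions.
const updateCommand = {
  update: "weather",
  updates: [
    {
      q: { "metadata.sensorId": 5578 },
      u: { $set: { "metadata.location": "rooftop" } },
      multi: true,
      upsert: false,
    },
  ],
};

// A delete of every measurement from one sensor (limit: 0 removes all matches).
const deleteCommand = {
  delete: "weather",
  deletes: [{ q: { "metadata.sensorId": 5578 }, limit: 0 }],
};

console.log(findUnsupportedReasons(updateCommand.updates[0], tsOptions)); // prints []
```

In ``mongosh``, the two documents would be passed to ``db.runCommand()``.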
@@ -117,11 +128,21 @@ Client-Side Field Level Encryption
</core/security-client-side-encryption>` is not supported for
:ref:`time series collections <manual-timeseries-collection>`.

.. _time-series-limitations-sharding:

Sharding
~~~~~~~~

:ref:`Time series collections <manual-timeseries-collection>` cannot
currently be sharded.
Starting in MongoDB 5.1, sharded time series collections are supported.
Review comment by @banarun, Nov 8, 2021:

We should probably add more details here. Some of the key things to highlight would be,

  1. The changes to the shardCollection command, which now accepts timeseries parameter.
  2. The sharding admin commands (like split, moveChunk) are not supported for time-series collection.
  3. The limitations on the shard key pattern (Can only use metaField and/or timeField in the shard key). Details here: https://docs.google.com/document/d/1ljVx7gni5dg6vuLSL2li14lq13T2RqENWEGKKeliCzw/edit#heading=h.1qvrx81umcnx

When using sharded time series collections, you cannot:

- Modify the ``granularity`` of a sharded time series
collection.

- Run sharding administration commands, including:

- :dbcommand:`moveChunk`
- :dbcommand:`splitChunk`

Aggregation $out and $merge
~~~~~~~~~~~~~~~~~~~~~~~~~~~
7 changes: 7 additions & 0 deletions source/includes/5.1/5.1-release-notes-sharded-time-series.rst
@@ -0,0 +1,7 @@
MongoDB 5.1 provides support for sharded :ref:`time series collections
<manual-timeseries-collection>`.

See:

- :dbcommand:`shardCollection`
- :ref:`Time Series Limitations <time-series-limitations-sharding>`
23 changes: 23 additions & 0 deletions source/includes/time-series/fact-granularity-description.rst
@@ -0,0 +1,23 @@
Optional. Possible values are:

- ``"seconds"``
- ``"minutes"``
- ``"hours"``

By default, MongoDB sets the ``granularity`` to ``"seconds"`` for
high-frequency ingestion.

Manually set the ``granularity`` parameter to improve performance
by optimizing how data in the time series collection is stored
internally. To select a value for ``granularity``, choose the
closest match to the time span between consecutive incoming
measurements.

If you specify the ``timeseries.metaField``, consider the time
span between consecutive incoming measurements that have the same
unique value for the ``metaField`` field. Measurements often have
the same unique value for the ``metaField`` field if they come
from the same source.

If you do not specify ``timeseries.metaField``, consider the time
span between all measurements that are inserted in the collection.
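
One way to read "closest match" is sketched below. It compares the typical gap between measurements, in seconds, with the nominal length of each unit and picks the nearest. The helper name ``suggestGranularity`` and the use of absolute difference as the distance are assumptions made for illustration, not a rule stated in this text.

```javascript
// Nominal length of each granularity unit, in seconds.
const UNIT_SECONDS = { seconds: 1, minutes: 60, hours: 3600 };

// Hypothetical helper: picks the granularity whose unit is closest (by absolute
// difference, an assumption) to the typical gap between measurements.
function suggestGranularity(intervalSeconds) {
  if (!(intervalSeconds > 0)) {
    throw new Error("interval must be a positive number of seconds");
  }
  let best = "seconds";
  for (const [name, unit] of Object.entries(UNIT_SECONDS)) {
    const currentGap = Math.abs(intervalSeconds - UNIT_SECONDS[best]);
    if (Math.abs(intervalSeconds - unit) < currentGap) best = name;
  }
  return best;
}

// A sensor that reports every five minutes.
console.log(suggestGranularity(300)); // prints minutes
```

If a ``metaField`` is set, the interval passed in would be the gap between measurements that share the same ``metaField`` value. Otherwise it would be the gap between all inserted measurements.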
Review comment (collaborator, author): Same as existing copy; shouldn't need review.

7 changes: 7 additions & 0 deletions source/includes/time-series/fact-meta-field-description.rst
@@ -0,0 +1,7 @@
Optional. The name of the field which contains metadata in each
time series document. The metadata in the specified field should
be data that is used to label a unique series of documents. The
metadata should rarely, if ever, change.

The name of the specified field may not be ``_id`` or the same as
the ``timeseries.timeField``. The field can be of any type.
Review comment (collaborator, author): Same as existing copy; shouldn't need review.

3 changes: 3 additions & 0 deletions source/includes/time-series/fact-time-field-description.rst
@@ -0,0 +1,3 @@
Required. The name of the field which contains the date in each
time series document. Documents in a time series collection must
have a valid BSON date as the value for the ``timeField``.
Review comment (collaborator, author): Same as existing copy; shouldn't need review.

2 changes: 2 additions & 0 deletions source/reference/command/findAndModify.txt
@@ -1,3 +1,5 @@
.. _find-and-modify:

=============
findAndModify
=============
77 changes: 76 additions & 1 deletion source/reference/command/shardCollection.txt
@@ -35,7 +35,8 @@ Definition
unique: <boolean>,
numInitialChunks: <integer>,
presplitHashedZones: <boolean>,
collation: { locale: "simple" }
collation: { locale: "simple" },
timeseries: <object>
}

:dbcommand:`shardCollection` has the following fields:
@@ -170,6 +171,64 @@ Definition

.. versionadded:: 4.4

* - :ref:`timeseries <cmd-shard-collection-timeseries>`

- object

- .. _cmd-shard-collection-timeseries:

Optional. Specify this option to create a new sharded
:ref:`time series collection <manual-timeseries-collection>`.

To shard an existing time series collection, omit this
parameter. When you omit this parameter and specify a time
series collection in the ``shardCollection`` parameter,
MongoDB automatically uses the values from the time series
collection as the values for the ``timeseries`` field.

For detailed syntax, see
:ref:`sharded-time-series-collection-options`.

.. versionadded:: 5.1

.. _sharded-time-series-collection-options:

Time Series Options
~~~~~~~~~~~~~~~~~~~

.. versionadded:: 5.1

Specify the :ref:`timeseries <cmd-shard-collection-timeseries>` option
to :dbcommand:`shardCollection` to create a new sharded
:ref:`time series collection <manual-timeseries-collection>`.

The :ref:`timeseries <cmd-shard-collection-timeseries>` option takes
the following fields:

.. list-table::
:header-rows: 1
:widths: 20 20 80

* - Field
- Type
- Description

* - ``timeField``
- string
- .. include:: /includes/time-series/fact-time-field-description.rst

* - ``metaField``
- string
- .. include:: /includes/time-series/fact-meta-field-description.rst

* - ``granularity``
- string
- .. include:: /includes/time-series/fact-granularity-description.rst

* - ``bucketMaxSpanSeconds``
- integer
- Optional. The maximum range of time values for a bucket,
in seconds.
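
A minimal sketch of a ``shardCollection`` command document that uses the ``timeseries`` option is shown below. The helper ``buildShardTimeSeriesCommand``, the namespace ``test.weather``, and the field names ``timestamp`` and ``metadata`` are hypothetical. The helper checks only the required ``timeField`` and the documented ``granularity`` values; the server performs the real validation.

```javascript
const ALLOWED_GRANULARITIES = ["seconds", "minutes", "hours"];

// Hypothetical helper: assembles a shardCollection command document that
// creates a new sharded time series collection.
function buildShardTimeSeriesCommand(namespace, key, timeseries) {
  if (typeof timeseries.timeField !== "string") {
    throw new Error("timeseries.timeField is required");
  }
  if (
    timeseries.granularity !== undefined &&
    !ALLOWED_GRANULARITIES.includes(timeseries.granularity)
  ) {
    throw new Error("granularity must be one of: " + ALLOWED_GRANULARITIES.join(", "));
  }
  return { shardCollection: namespace, key, timeseries };
}

const shardCommand = buildShardTimeSeriesCommand(
  "test.weather",
  { "metadata.sensorId": 1, timestamp: 1 },
  { timeField: "timestamp", metaField: "metadata", granularity: "hours" }
);

// In mongosh, connected to a sharded cluster: db.adminCommand(shardCommand)
```

To shard an existing time series collection, the ``timeseries`` field would be omitted, as described in the field description above.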

Considerations
--------------
@@ -194,6 +253,21 @@ avoid scalability and performance issues.
- :ref:`sharding-shard-key-selection`
- :ref:`sharding-shard-key`

Shard Keys on Time Series Collections
`````````````````````````````````````

When sharding time series collections, you can only specify the
``metaField`` (or sub-fields of ``metaField``), ``timeField``, or both
in the shard key. No other fields, including ``_id``, are allowed in the
shard key pattern.

- ``metaField`` can be either a :ref:`hashed shard key
<sharding-hashed-sharding>` or a :ref:`ranged shard key
<sharding-ranged>`.

- ``timeField`` can only be a :ref:`ranged shard key
<sharding-ranged>` and must be at the end of the shard key pattern.

Review comment: The timeField can only be at the end of the shard key pattern.
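
The shard key rules above can be sketched as a client-side check. The function ``checkTimeSeriesShardKey`` and the field names ``timestamp`` and ``metadata`` are hypothetical, and the check assumes that ``1`` denotes a ranged key and ``"hashed"`` a hashed key. It is an illustration of the rules, not the server's validation logic.

```javascript
// Hypothetical client-side check of the shard key rules described above.
// Returns a list of problems; an empty list means the pattern follows the rules.
function checkTimeSeriesShardKey(key, { timeField, metaField }) {
  const problems = [];
  const fields = Object.keys(key);
  if (fields.length === 0) problems.push("shard key is empty");

  fields.forEach((field, index) => {
    const isTime = field === timeField;
    const isMeta =
      typeof metaField === "string" &&
      (field === metaField || field.startsWith(metaField + "."));

    if (!isTime && !isMeta) {
      problems.push(`"${field}" is neither the timeField nor the metaField`);
      return;
    }
    const kind = key[field]; // 1 means ranged, "hashed" means hashed
    if (kind !== 1 && kind !== "hashed") {
      problems.push(`"${field}" must be 1 (ranged) or "hashed"`);
    } else if (isTime && kind === "hashed") {
      problems.push(`timeField "${field}" must be a ranged shard key`);
    }
    if (isTime && index !== fields.length - 1) {
      problems.push(`timeField "${field}" must be the last field in the pattern`);
    }
  });
  return problems;
}

console.log(
  checkTimeSeriesShardKey(
    { "metadata.sensorId": "hashed", timestamp: 1 },
    { timeField: "timestamp", metaField: "metadata" }
  )
); // prints []
```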

.. _hashed-shard-keys:

Hashed Shard Keys
@@ -274,6 +348,7 @@ in the ``records`` database and uses the ``zipcode`` field as the
.. code-block:: javascript

db.adminCommand( { shardCollection: "records.people", key: { zipcode: 1 } } )


.. seealso::

4 changes: 4 additions & 0 deletions source/reference/method/db.collection.update.txt
@@ -1,3 +1,5 @@
.. _collection-update:

======================
db.collection.update()
======================
@@ -51,6 +53,8 @@ The :method:`db.collection.update()` method has the following form:
}
)

.. _update-parameters:

Parameters
~~~~~~~~~~

2 changes: 2 additions & 0 deletions source/reference/method/js-collection.txt
@@ -1,3 +1,5 @@
.. _collection-method:

==================
Collection Methods
==================
29 changes: 22 additions & 7 deletions source/release-notes/5.1.txt
@@ -59,6 +59,28 @@ Starting in MongoDB 5.1, :expression:`$dateSubtract` and
:expression:`$dateAdd` report an error when an overflow is detected for
``amount`` values.

Time Series Collections
-----------------------

Geo Indexing for Time Series collections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starting in MongoDB 5.1, you can use :ref:`geo indexes
<timeseries-limitations-secondary-indexes>` on the ``metaField`` of time
series collections.

Updates and Deletes
~~~~~~~~~~~~~~~~~~~

Starting in MongoDB 5.1, time series collections support
:ref:`update and delete operations
<timeseries-limitations-updates-deletes>` with limitations.

Sharded Time Series Collections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. include:: /includes/5.1/5.1-release-notes-sharded-time-series.rst

.. _5.1-rel-notes-sbe:

Slot-Based Query Execution Engine
@@ -135,13 +157,6 @@ Opcode Counters
Resharding Statistics
- :serverstatus:`shardingStatistics.resharding.lastOpEndingChunkImbalance`

Geo Indexing for Time Series collections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starting in MongoDB 5.1, you can use :ref:`geo indexes
<timeseries-limitations-secondary-indexes>` on the ``metaField`` of time
series collections.

Schema Validation Errors Contain Description Field
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
