diff --git a/snooty.toml b/snooty.toml index d0ba468e..408a78bd 100644 --- a/snooty.toml +++ b/snooty.toml @@ -6,7 +6,7 @@ toc_landing_pages = [ "/write", "/indexes", "/databases-collection", - "/security/authentication" + "/security/authentication", ] intersphinx = ["https://www.mongodb.com/docs/manual/objects.inv"] @@ -19,9 +19,13 @@ language = "Ruby" mdb-server = "MongoDB Server" mongo-community = "MongoDB Community Edition" mongo-enterprise = "MongoDB Enterprise Edition" -docs-branch = "master" # always set this to the docs branch (i.e. master, 1.7, 1.8, etc.) +# always set this to the docs branch (i.e. master, v1.7, v1.8, etc.) +docs-branch = "master" +# always set this to the driver branch (i.e. 1.7 1.8, etc.) version-number = "2.21" -patch-version-number = "{+version-number+}.0" # always set this to the driver branch (i.e. 1.7.0, 1.8.0, etc.) +patch-version-number = "{+version-number+}.0" version = "v{+version-number+}" stable-api = "Stable API" api-root = "https://www.mongodb.com/docs/ruby-driver/current/api/" +bson-version = "5.0.2" +avs = "Atlas Vector Search" diff --git a/source/aggregation.txt b/source/aggregation.txt index d68cba3b..7c6c70c9 100644 --- a/source/aggregation.txt +++ b/source/aggregation.txt @@ -18,13 +18,6 @@ Transform Your Data with Aggregation :depth: 2 :class: singlecol -.. TODO: - .. toctree:: - :titlesonly: - :maxdepth: 1 - - /aggregation/aggregation-tutorials - Overview -------- @@ -125,7 +118,7 @@ following stages: .. io-code-block:: :copyable: - .. input:: /includes/aggregation.rb + .. input:: /includes/aggregation/aggregation.rb :start-after: start-aggregation :end-before: end-aggregation :language: ruby @@ -160,7 +153,7 @@ from the preceding :ref:`ruby-aggregation-example`: .. io-code-block:: :copyable: - .. input:: /includes/aggregation.rb + .. 
input:: /includes/aggregation/aggregation.rb :start-after: start-explain-aggregation :end-before: end-explain-aggregation :language: ruby @@ -205,7 +198,7 @@ This example creates pipeline stages to perform the following actions: .. io-code-block:: :copyable: - .. input:: /includes/aggregation.rb + .. input:: /includes/aggregation/aggregation.rb :start-after: start-search-aggregation :end-before: end-search-aggregation :language: ruby diff --git a/source/data-formats/bson.txt b/source/data-formats/bson.txt index 5285bec9..c159206b 100644 --- a/source/data-formats/bson.txt +++ b/source/data-formats/bson.txt @@ -1,10 +1,17 @@ .. _ruby-bson-tutorial: +.. _ruby-bson: -============= -BSON Tutorial -============= +========================== +Document Data Format: BSON +========================== -.. default-domain:: mongodb +.. facet:: + :name: genre + :values: reference + +.. meta:: + :keywords: code example, serialization, representation + :description: Learn how to use BSON types in the MongoDB Ruby Driver. .. contents:: On this page :local: @@ -12,146 +19,191 @@ BSON Tutorial :depth: 2 :class: twocols -In this tutorial, you can learn how to use the Ruby BSON library. +Overview +-------- -Installation ------------- +In this guide, you can learn about the BSON data format, how MongoDB +uses BSON to organize and store data, and how to install the BSON +library independently of the {+driver-short+}. + +BSON Data Format +---------------- + +**BSON**, or Binary JSON, is the data format that MongoDB uses to organize +and store data. This data format includes all JSON data structure types and +adds support for types including dates, differently-sized integers (32-bit and 64-bit), +ObjectIds, and binary data. For a complete list of supported types, see the +:manual:`BSON Types ` in the {+mdb-server+} documentation. -The BSON library can be installed from `Rubygems `_ -manually or with bundler. 
+BSON is not human-readable, but you can use the +{+language+} BSON library to convert it to the human-readable JSON +representation. You can read more about the relationship between these +formats in the :website:`JSON and BSON ` guide on the +MongoDB website. -To install the gem manually: +Install the BSON Library +------------------------ + +You can install the BSON library (``bson``) from `Rubygems +`__ manually or by using the bundler. + +Run the following command to install the ``bson`` gem: .. code-block:: sh - gem install bson + gem install bson -To install the gem with bundler, include the following in your ``Gemfile``: +To install the gem by using bundler, include the following line in your +application's ``Gemfile``: .. code-block:: ruby - gem 'bson' + gem 'bson' -The BSON library is compatible with MRI >= 2.5 and JRuby >= 9.2. +The BSON library is compatible with MRI v2.5 and later and JRuby v9.2 +and later. -Use With ActiveSupport ----------------------- +ActiveSupport +------------- -Serialization for ActiveSupport-defined classes, such as TimeWithZone, is -not loaded by default to avoid a hard dependency of BSON on ActiveSupport. -When using BSON in an application that also uses ActiveSupport, the -ActiveSupport-related code must be explicitly required: +Serialization for classes defined in Active Support, such as +``TimeWithZone``, is not loaded by default to avoid a hard dependency of +BSON on Active Support. When using BSON in an application that also uses +Active Support, you must require the Active Support code support: .. code-block:: ruby - require 'bson' - require 'bson/active_support' + require 'bson' + require 'bson/active_support' BSON Serialization ------------------ -Getting a Ruby object's raw BSON representation is done by calling ``to_bson`` -on the Ruby object, which will return a ``BSON::ByteBuffer``. For example: +You can retrieve a {+language+} object's raw BSON representation by +calling ``to_bson`` on the object. 
The ``to_bson`` method returns a +``BSON::ByteBuffer``. + +The following code demonstrates how to call the ``to_bson`` method on +{+language+} objects: .. code-block:: ruby - "Shall I compare thee to a summer's day".to_bson - 1024.to_bson + "Shall I compare thee to a summer's day".to_bson + 1024.to_bson -Generating an object from BSON is done via calling ``from_bson`` on the class -you wish to instantiate and passing it a ``BSON::ByteBuffer`` instance. +You can generate a {+language+} object from BSON by calling +``from_bson`` on the class you wish to instantiate and passing it a +``BSON::ByteBuffer`` instance: .. code-block:: ruby - String.from_bson(byte_buffer) - BSON::Int32.from_bson(byte_buffer) - + String.from_bson(byte_buffer) + BSON::Int32.from_bson(byte_buffer) Byte Buffers ------------ -BSON library 4.0 introduces the use of native byte buffers in MRI and JRuby -instead of using ``StringIO``, for improved performance. +``bson`` v4.0 introduces the use of native byte buffers in MRI and JRuby +instead of using ``StringIO`` for improved performance. -Writing -~~~~~~~ +Write to a Byte Buffer +~~~~~~~~~~~~~~~~~~~~~~ -To create a ``ByteBuffer`` for writing (i.e. serializing to BSON), -instantiate ``BSON::ByteBuffer`` with no arguments: +To create a ``ByteBuffer`` for writing, instantiate a +``BSON::ByteBuffer`` with no arguments: .. code-block:: ruby - buffer = BSON::ByteBuffer.new + buffer = BSON::ByteBuffer.new + +Raw Bytes +````````` + +To write raw bytes to the byte buffer with no transformations, use the +``put_byte`` and ``put_bytes`` methods. Each method takes a byte string +as its argument and copies this string into the buffer. The ``put_byte`` +method enforces that the argument is a string of length ``1``. ``put_bytes`` +accepts any length of strings. The strings can contain null bytes. -To write raw bytes to the byte buffer with no transformations, use -``put_byte`` and ``put_bytes`` methods. 
They take a byte string as the argument -and copy this string into the buffer. ``put_byte`` enforces that the argument -is a string of length 1; ``put_bytes`` accepts any length strings. -The strings can contain null bytes. +The following code demonstrates how to write raw bytes to a byte buffer: .. code-block:: ruby - buffer.put_byte("\x00") + buffer.put_byte("\x00") - buffer.put_bytes("\xff\xfe\x00\xfd") + buffer.put_bytes("\xff\xfe\x00\xfd") .. note:: - ``put_byte`` and ``put_bytes`` do not write a BSON type byte prior to - writing the argument to the byte buffer. + ``put_byte`` and ``put_bytes`` do not write a BSON type byte to the + buffer before writing the byte string. This means that the buffer + does not contain information about the type of data that the raw byte string encodes. + +Typed Byte Write Methods +```````````````````````` + +The write methods described in the following sections write objects of +particular types in the `BSON specification +`__. The type indicated by the method +name takes precedence over the type of the argument. For example, if a +floating-point value is passed to ``put_int32``, it is coerced into an +integer, and the driver writes the resulting integer to the byte buffer. -Subsequent write methods write objects of particular types in the -`BSON spec `_. Note that the type indicated -by the method name takes precedence over the type of the argument - -for example, if a floating-point value is given to ``put_int32``, it is -coerced into an integer and the resulting integer is written to the byte -buffer. +Strings +``````` -To write a UTF-8 string (BSON type 0x02) to the byte buffer, use ``put_string``: +To write a UTF-8 string (BSON type 0x02) to the byte buffer, use the +``put_string`` method: .. code-block:: ruby - buffer.put_string("hello, world") + buffer.put_string("hello, world") -Note that BSON strings are always encoded in UTF-8. Therefore, the -argument must be either in UTF-8 or in an encoding convertable to UTF-8 -(i.e. 
not binary). If the argument is in an encoding other than UTF-8, -the string is first converted to UTF-8 and the UTF-8 encoded version is -written to the buffer. The string must be valid in its claimed encoding, -including being valid UTF-8 if the encoding is UTF-8. -The string may contain null bytes. +BSON strings are always encoded in UTF-8. This means that the +argument to ``put_string`` must be either in UTF-8 or in an encoding +convertible to UTF-8 (not binary). If the argument is in an encoding +other than UTF-8, the string is first converted to UTF-8 and then the +UTF-8 encoded version is written to the buffer. The string must be valid +in its claimed encoding. The string can contain null bytes. -The BSON specification also defines a CString type, which is used for -example for document keys. To write CStrings to the buffer, use ``put_cstring``: +The BSON specification also defines a CString type, which is used, for +example, for document keys. To write CStrings to the buffer, use +``put_cstring``: .. code-block:: ruby - buffer.put_cstring("hello, world") + buffer.put_cstring("hello, world") As with regular strings, CStrings in BSON must be UTF-8 encoded. If the argument is not in UTF-8, it is converted to UTF-8 and the resulting string is written to the buffer. Unlike ``put_string``, the UTF-8 encoding of the argument given to ``put_cstring`` cannot have any null bytes, since the -CString serialization format in BSON is null terminated. +CString serialization format in BSON is null-terminated. Unlike ``put_string``, ``put_cstring`` also accepts symbols and integers. -In all cases the argument is stringified prior to being written: +In all cases, the argument is stringified prior to being written to the buffer: .. 
code-block:: ruby - buffer.put_cstring(:hello) - buffer.put_cstring(42) + buffer.put_cstring(:hello) + buffer.put_cstring(42) + +Numbers +``````` To write a 32-bit or a 64-bit integer to the byte buffer, use -``put_int32`` and ``put_int64`` methods respectively. Note that Ruby +``put_int32`` and ``put_int64`` methods, respectively. Note that {+language+} integers can be arbitrarily large; if the value being written exceeds the range of a 32-bit or a 64-bit integer, ``put_int32`` and ``put_int64`` -raise ``RangeError``. +raise a ``RangeError`` error. + +The following code demonstrates how to write integer values to a byte +buffer: .. code-block:: ruby - buffer.put_int32(12345) - buffer.put_int64(123456789012345) + buffer.put_int32(12345) + buffer.put_int64(123456789012345) .. note:: @@ -163,77 +215,95 @@ To write a 64-bit floating point value to the byte buffer, use ``put_double``: .. code-block:: ruby - buffer.put_double(3.14159) + buffer.put_double(3.14159) + +Convert Bytes to Strings +```````````````````````` -To obtain the serialized data as a byte string (for example, to send the data -over a socket), call ``to_s`` on the buffer: +To retrieve the serialized data as a byte string, call ``to_s`` on the +buffer: .. code-block:: ruby - buffer = BSON::ByteBuffer.new - buffer.put_string('testing') - socket.write(buffer.to_s) + buffer = BSON::ByteBuffer.new + buffer.put_string('testing') + socket.write(buffer.to_s) .. note:: - ``ByteBuffer`` keeps track of read and write positions separately. - There is no way to rewind the buffer for writing - ``rewind`` only affects - the read position. + ``ByteBuffer`` keeps track of read and write positions separately. + There is no way to rewind the buffer for writing. The ``rewind`` + method affects only the read position. -Reading -~~~~~~~ +Read from a Byte Buffer +~~~~~~~~~~~~~~~~~~~~~~~ -To create a ``ByteBuffer`` for reading (i.e. 
deserializing from BSON), +To create a ``ByteBuffer`` for reading, or deserializing from BSON, instantiate ``BSON::ByteBuffer`` with a byte string as the argument: .. code-block:: ruby - buffer = BSON::ByteBuffer.new(string) # a read mode buffer. + buffer = BSON::ByteBuffer.new(string) -Reading from the buffer is done via the following API: +You can read from the buffer by using the following methods that correspond +to different data types: .. code-block:: ruby - buffer.get_byte # Pulls a single byte from the buffer. - buffer.get_bytes(value) # Pulls n number of bytes from the buffer. - buffer.get_cstring # Pulls a null-terminated string from the buffer. - buffer.get_double # Pulls a 64-bit floating point from the buffer. - buffer.get_int32 # Pulls a 32-bit integer (4 bytes) from the buffer. - buffer.get_int64 # Pulls a 64-bit integer (8 bytes) from the buffer. - buffer.get_string # Pulls a UTF-8 string from the buffer. + buffer.get_byte # Pulls a single byte from the buffer + buffer.get_bytes(value) # Pulls n number of bytes from the buffer + buffer.get_cstring # Pulls a null-terminated string from the buffer + buffer.get_double # Pulls a 64-bit floating point from the buffer + buffer.get_int32 # Pulls a 32-bit integer (4 bytes) from the buffer + buffer.get_int64 # Pulls a 64-bit integer (8 bytes) from the buffer + buffer.get_string # Pulls a UTF-8 string from the buffer To restart reading from the beginning of a buffer, use ``rewind``: .. code-block:: ruby - buffer.rewind + buffer.rewind .. note:: - ``ByteBuffer`` keeps track of read and write positions separately. - ``rewind`` only affects the read position. + ``ByteBuffer`` keeps track of read and write positions separately. + The ``rewind`` method affects only the read position.
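To make the byte layout behind the typed write and read methods more concrete, the following sketch reproduces the BSON specification's encodings for 32-bit integers, doubles, strings, and CStrings by using only the ``pack`` and ``unpack`` methods in core {+language+}. It does not call the ``bson`` gem, and the helper names ``encode_bson_string`` and ``encode_bson_cstring`` are hypothetical, introduced only for illustration:

```ruby
# Pure-Ruby sketch of BSON wire encodings. This does not use the bson gem,
# and the helper method names below are hypothetical.

# int32: 4 bytes, signed, little-endian
int32_bytes = [12345].pack('l<')

# double: 8 bytes, IEEE 754, little-endian
double_bytes = [3.14159].pack('E')

# string: int32 byte length (counting the trailing null byte),
# then the UTF-8 bytes, then a null byte
def encode_bson_string(str)
  utf8 = str.encode('UTF-8').b
  [utf8.bytesize + 1].pack('l<') + utf8 + "\x00".b
end

# cstring: UTF-8 bytes followed by a null terminator, so the value
# itself cannot contain null bytes. Symbols and integers are stringified.
def encode_bson_cstring(value)
  utf8 = value.to_s.encode('UTF-8').b
  raise ArgumentError, 'cstring cannot contain null bytes' if utf8.include?("\x00".b)
  utf8 + "\x00".b
end

# Read the string back: take the length prefix, then the bytes before the null
string_bytes = encode_bson_string('hello, world')
length, = string_bytes.unpack('l<')
decoded = string_bytes.byteslice(4, length - 1).force_encoding('UTF-8')
```

Because the length prefix makes the string self-delimiting, a BSON string can contain null bytes, while a CString relies on the terminator and cannot.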
Supported Classes ----------------- -Core Ruby classes that have representations in the BSON specification and -will have a ``to_bson`` method defined for them are: ``Object``, ``Array``, -``FalseClass``, ``Float``, ``Hash``, ``Integer``, ``BigDecimal``, ``NilClass``, -``Regexp``, ``String``, ``Symbol`` (deprecated), ``Time``, ``TrueClass``. - -In addition to the core Ruby objects, BSON also provides some special types -specific to the specification: - -``BSON::Binary`` -~~~~~~~~~~~~~~~~ +The following list provides the {+language+} classes that have +representations in the BSON specification and have a ``to_bson`` method +defined: + +- ``Object`` +- ``Array`` +- ``FalseClass`` +- ``Float`` +- ``Hash`` +- ``Integer`` +- ``BigDecimal`` +- ``NilClass`` +- ``Regexp`` +- ``String`` +- ``Symbol`` (deprecated) +- ``Time`` +- ``TrueClass`` + +In addition to the core {+language+} objects, BSON also provides some special types +specific to the specification. The following sections describe other +types that are supported in the driver. + +BSON::Binary +~~~~~~~~~~~~ Use ``BSON::Binary`` objects to store arbitrary binary data. You can construct ``Binary`` objects from binary strings, as shown in the following code: .. code-block:: ruby - BSON::Binary.new("binary_string") - # => + BSON::Binary.new("binary_string") + # => By default, ``Binary`` objects are created with BSON binary subtype 0 (``:generic``). You can explicitly specify the subtype to indicate that @@ -241,385 +311,317 @@ the bytes encode a particular type of data: .. code-block:: ruby - BSON::Binary.new("binary_string", :user) - # => + BSON::Binary.new("binary_string", :user) + # => -Valid subtypes are ``:generic``, ``:function``, ``:old``, ``:uuid_old``, -``:uuid``, ``:md5`` and ``:user``. +The valid subtype specifications are ``:generic``, ``:function``, +``:old``, ``:uuid_old``, ``:uuid``, ``:md5``, and ``:user``. 
You can use the ``data`` and ``type`` attributes to retrieve a ``Binary`` object's data and the subtype, as shown in the following code: .. code-block:: ruby - binary = BSON::Binary.new("binary_string", :user) - binary.data - => "binary_string" - binary.type - => :user + binary = BSON::Binary.new("binary_string", :user) + binary.data + # => "binary_string" + binary.type + # => :user You can compare ``Binary`` objects by using the ``<=>`` operator, which allows you to sort objects that have the same binary subtype. To compare ``Binary`` objects, ensure that you install v5.0.2 or later of the BSON library. -.. note:: +.. note:: BINARY Encoding - ``BSON::Binary`` objects always store the data in ``BINARY`` encoding, - regardless of the encoding that the string passed to the constructor - was in: + ``BSON::Binary`` objects always store the data in ``BINARY`` encoding, + regardless of the encoding of the string passed to the constructor: - .. code-block:: ruby + .. code-block:: ruby - str = "binary_string" - str.encoding - # => # - binary = BSON::Binary.new(str) - binary.data - # => "binary_string" - binary.data.encoding - # => # + str = "binary_string" + str.encoding + # => # + binary = BSON::Binary.new(str) + binary.data + # => "binary_string" + binary.data.encoding + # => # UUID Methods ```````````` -To create a UUID BSON::Binary (binary subtype 4) from its RFC 4122-compliant +To create a UUID ``BSON::Binary`` (binary subtype 4) from its RFC 4122-compliant string representation, use the ``from_uuid`` method: .. code-block:: ruby - uuid_str = "00112233-4455-6677-8899-aabbccddeeff" - BSON::Binary.from_uuid(uuid_str) - # => + uuid_str = "00112233-4455-6677-8899-aabbccddeeff" + BSON::Binary.from_uuid(uuid_str) + # => -To stringify a UUID BSON::Binary to an RFC 4122-compliant representation, +To stringify a UUID ``BSON::Binary`` to an RFC 4122-compliant representation, use the ``to_uuid`` method: .. 
code-block:: ruby - binary = BSON::Binary.new("\x00\x11\x22\x33\x44\x55\x66\x77\x88\x99\xAA\xBB\xCC\xDD\xEE\xFF".force_encoding('BINARY'), :uuid) - => - binary.to_uuid - => "00112233-4455-6677-8899aabbccddeeff" + binary = BSON::Binary.new("\x00\x11\x22\x33\x44\x55\x66\x77\x88\x99\xAA\xBB\xCC\xDD\xEE\xFF".force_encoding('BINARY'), :uuid) + # => + binary.to_uuid + # => "00112233-4455-6677-8899aabbccddeeff" -The standard representation may be explicitly specified when invoking both -``from_uuid`` and ``to_uuid`` methods: +You can explicitly specify standard UUID representation in +the ``from_uuid`` and ``to_uuid`` methods: .. code-block:: ruby - binary = BSON::Binary.from_uuid(uuid_str, :standard) - binary.to_uuid(:standard) + binary = BSON::Binary.from_uuid(uuid_str, :standard) + binary.to_uuid(:standard) -Note that the ``:standard`` representation can only be used with a Binary -of subtype ``:uuid`` (not ``:uuid_old``). +You can use the ``:standard`` representation only with a ``Binary`` +value of subtype ``:uuid``, not ``:uuid_old``. Legacy UUIDs ```````````` -Data stored in BSON::Binary objects of subtype 3 (``:uuid_old``) may be -persisted in one of three different byte orders depending on which driver -created the data. The byte orders are CSharp legacy, Java legacy and Python +Data stored in ``BSON::Binary`` objects of subtype 3 (``:uuid_old``) can be +persisted in one of three different byte orders depending on the driver +that created the data. The byte orders are CSharp legacy, Java legacy, and Python legacy. The Python legacy byte order is the same as the standard RFC 4122 -byte order; CSharp legacy and Java legacy byte orders have some of the bytes -swapped. +byte order. The CSharp legacy and Java legacy byte orders have some of +the bytes in different locations. -The Binary object containing a legacy UUID does not encode *which* format +The ``Binary`` object containing a legacy UUID does not encode *which* format the UUID is stored in. 
Therefore, methods that convert to and from the legacy UUID format take the desired format, or representation, as their argument. -An application may copy legacy UUID Binary objects without knowing which byte +An application may copy legacy UUID ``Binary`` objects without knowing which byte order they store their data in. The following methods for working with legacy UUIDs are provided for interoperability with existing deployments storing data in legacy UUID formats. -It is recommended that new applications use the ``:uuid`` (subtype 4) format +In new applications, use the ``:uuid`` (subtype 4) format only, which is compliant with RFC 4122. -To stringify a legacy UUID BSON::Binary, use the ``to_uuid`` method specifying -the desired representation. Accepted representations are ``:csharp_legacy``, -``:java_legacy`` and ``:python_legacy``. Note that a legacy UUID BSON::Binary +To stringify a legacy UUID ``BSON::Binary``, use the ``to_uuid`` method +and specify the desired representation. Accepted representations are ``:csharp_legacy``, +``:java_legacy`` and ``:python_legacy``. A legacy UUID ``BSON::Binary`` cannot be stringified without specifying a representation. .. 
code-block:: ruby - binary = BSON::Binary.new("\x00\x11\x22\x33\x44\x55\x66\x77\x88\x99\xAA\xBB\xCC\xDD\xEE\xFF".force_encoding('BINARY'), :uuid_old) - => + binary = BSON::Binary.new("\x00\x11\x22\x33\x44\x55\x66\x77\x88\x99\xAA\xBB\xCC\xDD\xEE\xFF".force_encoding('BINARY'), :uuid_old) + # => - binary.to_uuid - # => ArgumentError (Representation must be specified for BSON::Binary objects of type :uuid_old) + binary.to_uuid + # => ArgumentError (Representation must be specified for BSON::Binary objects of type :uuid_old) - binary.to_uuid(:csharp_legacy) - # => "33221100-5544-7766-8899aabbccddeeff" + binary.to_uuid(:csharp_legacy) + # => "33221100-5544-7766-8899aabbccddeeff" - binary.to_uuid(:java_legacy) - # => "77665544-3322-1100-ffeeddccbbaa9988" + binary.to_uuid(:java_legacy) + # => "77665544-3322-1100-ffeeddccbbaa9988" - binary.to_uuid(:python_legacy) - # => "00112233-4455-6677-8899aabbccddeeff" + binary.to_uuid(:python_legacy) + # => "00112233-4455-6677-8899aabbccddeeff" -To create a legacy UUID BSON::Binary from the string representation of the -UUID, use the ``from_uuid`` method specifying the desired representation: +To create a legacy UUID ``BSON::Binary`` from the string representation of the +UUID, use the ``from_uuid`` method and specify the desired representation: .. code-block:: ruby - uuid_str = "00112233-4455-6677-8899-aabbccddeeff" - - BSON::Binary.from_uuid(uuid_str, :csharp_legacy) - # => - - BSON::Binary.from_uuid(uuid_str, :java_legacy) - # => - - BSON::Binary.from_uuid(uuid_str, :python_legacy) - # => + uuid_str = "00112233-4455-6677-8899-aabbccddeeff" -These methods can be used to convert from one representation to another: + BSON::Binary.from_uuid(uuid_str, :csharp_legacy) + # => -.. 
code-block:: ruby - - BSON::Binary.from_uuid('77665544-3322-1100-ffeeddccbbaa9988',:java_legacy).to_uuid(:csharp_legacy) - # => "33221100-5544-7766-8899aabbccddeeff" + BSON::Binary.from_uuid(uuid_str, :java_legacy) + # => -``BSON::Code`` -~~~~~~~~~~~~~~ + BSON::Binary.from_uuid(uuid_str, :python_legacy) + # => -Represents a string of JavaScript code. +You can use these methods to convert from one representation to another: .. code-block:: ruby - BSON::Code.new("this.value = 5;") + BSON::Binary.from_uuid('77665544-3322-1100-ffeeddccbbaa9988',:java_legacy).to_uuid(:csharp_legacy) + # => "33221100-5544-7766-8899aabbccddeeff" -``BSON::CodeWithScope`` -~~~~~~~~~~~~~~~~~~~~~~~ - -.. note:: - - The ``CodeWithScope`` type is deprecated as of MongoDB 4.2.1. Starting - with MongoDB 4.4, support from ``CodeWithScope`` is being removed from - various server commands and operators such as ``$where``. Please use - other BSON types and operators when working with MongoDB 4.4 and newer. +BSON::Code +~~~~~~~~~~ -Represents a string of JavaScript code with a hash of values. +This type represents a string of JavaScript code: .. code-block:: ruby - BSON::CodeWithScope.new("this.value = age;", age: 5) + BSON::Code.new("this.value = 5;") -``BSON::DBRef`` -~~~~~~~~~~~~~~~ +BSON::DBRef +~~~~~~~~~~~ This is a subclass of ``BSON::Document`` that provides accessors for the -collection, id, and database of the DBRef. +collection, identifier, and database of the ``DBRef``. .. code-block:: ruby - BSON::DBRef.new({"$ref" => "collection", "$id" => "id"}) - BSON::DBRef.new({"$ref" => "collection", "$id" => "id", "database" => "db"}) + BSON::DBRef.new({"$ref" => "collection", "$id" => "id"}) + BSON::DBRef.new({"$ref" => "collection", "$id" => "id", "database" => "db"}) .. note:: - The BSON::DBRef constructor will validate the given hash and will raise an ArgumentError - if it is not a valid DBRef. 
``BSON::ExtJSON.parse_obj`` and ``Hash.from_bson`` will not - raise an error if given an invalid DBRef, and will parse a Hash or deserialize a - BSON::Document instead. + The ``BSON::DBRef`` constructor validates the given hash and raises an ``ArgumentError`` + if it is not a valid ``DBRef``. The ``BSON::ExtJSON.parse_obj`` and + ``Hash.from_bson`` methods do not raise an error if passed an invalid + ``DBRef``, and parse a ``Hash`` or deserialize a ``BSON::Document`` instead. .. note:: - All BSON documents are deserialized into instances of BSON::DBRef if they are - valid DBRefs, otherwise they are deserialized into instances of BSON::Document. - This is true even when the invocation is made from the ``Hash`` class: - - .. code-block:: ruby - - bson = {"$ref" => "collection", "$id" => "id"}.to_bson.to_s - loaded = Hash.from_bson(BSON::ByteBuffer.new(bson)) - => {"$ref"=>"collection", "$id"=>"id"} - loaded.class - => BSON::DBRef + All BSON documents are deserialized into instances of ``BSON::DBRef`` if they are + valid ``DBRef`` instances, otherwise they are deserialized into + instances of ``BSON::Document``. This is true even when the + invocation is made from the ``Hash`` class: -For backwards compatibility with the MongoDB Ruby driver versions 2.17 and -earlier, ``BSON::DBRef`` also can be constructed using the legacy driver API. -This API is deprecated and will be removed in a future version of ``bson-ruby``: + .. code-block:: ruby -.. 
code-block:: ruby - - BSON::DBRef.new("collection", BSON::ObjectId('61eeb760a15d5d0f9f1e401d')) - BSON::DBRef.new("collection", BSON::ObjectId('61eeb760a15d5d0f9f1e401d'), "db") + bson = {"$ref" => "collection", "$id" => "id"}.to_bson.to_s + loaded = Hash.from_bson(BSON::ByteBuffer.new(bson)) + # => {"$ref"=>"collection", "$id"=>"id"} + loaded.class + # => BSON::DBRef -``BSON::Document`` -~~~~~~~~~~~~~~~~~~ +BSON::Document +~~~~~~~~~~~~~~ -This is a subclass of ``Hash`` that stores all keys as strings, but allows -access to them with symbol keys. +``BSON::Document`` is a subclass of ``Hash`` that stores all keys as +strings, but allows access to them by using symbol keys. .. code-block:: ruby - BSON::Document[:key, "value"] - BSON::Document.new + BSON::Document[:key, "value"] + BSON::Document.new .. note:: - All BSON documents are deserialized into instances of BSON::Document - (or BSON::DBRef, if they happen to be a valid DBRef), even when the - invocation is made from the ``Hash`` class: + All BSON documents are deserialized into instances of ``BSON::Document``, + or ``BSON::DBRef``, if they are valid ``DBRef`` instances, even when the + invocation is made from the ``Hash`` class: - .. code-block:: ruby + .. code-block:: ruby - bson = {test: 1}.to_bson.to_s - loaded = Hash.from_bson(BSON::ByteBuffer.new(bson)) - => {"test"=>1} - loaded.class - => BSON::Document + bson = {test: 1}.to_bson.to_s + loaded = Hash.from_bson(BSON::ByteBuffer.new(bson)) + # => {"test"=>1} + loaded.class + # => BSON::Document -``BSON::MaxKey`` -~~~~~~~~~~~~~~~~ +BSON::MaxKey +~~~~~~~~~~~~ -Represents a value in BSON that will always compare higher to another value. +``BSON::MaxKey`` represents a value in BSON that always compares +higher than any other value: .. code-block:: ruby - BSON::MaxKey.new + BSON::MaxKey.new -``BSON::MinKey`` -~~~~~~~~~~~~~~~~ +BSON::MinKey +~~~~~~~~~~~~ -Represents a value in BSON that will always compare lower to another value. 
+``BSON::MinKey`` represents a value in BSON that always compares +lower than any other value: .. code-block:: ruby - BSON::MinKey.new + BSON::MinKey.new -``BSON::ObjectId`` -~~~~~~~~~~~~~~~~~~ +BSON::ObjectId +~~~~~~~~~~~~~~ -Represents a 12 byte unique identifier for an object on a given machine. +``BSON::ObjectId`` represents a 12 byte unique identifier for an object: .. code-block:: ruby - BSON::ObjectId.new + BSON::ObjectId.new -``BSON::Timestamp`` -~~~~~~~~~~~~~~~~~~~ +BSON::Timestamp +~~~~~~~~~~~~~~~ -Represents a special time with a start and increment value. +``BSON::Timestamp`` represents a time with a start and increment value: .. code-block:: ruby - BSON::Timestamp.new(5, 30) + BSON::Timestamp.new(5, 30) -``BSON::Undefined`` -~~~~~~~~~~~~~~~~~~~ +BSON::Undefined +~~~~~~~~~~~~~~~ -Represents a placeholder for a value that was not provided. +``BSON::Undefined`` represents a placeholder for a value that is undefined: .. code-block:: ruby - BSON::Undefined.new + BSON::Undefined.new -``BSON::Decimal128`` -~~~~~~~~~~~~~~~~~~~~ +BSON::Decimal128 +~~~~~~~~~~~~~~~~ -Represents a 128-bit decimal-based floating-point value capable of emulating -decimal rounding with exact precision. +``BSON::Decimal128`` represents a 128-bit decimal-based floating-point +value that can emulate decimal rounding with exact precision: .. code-block:: ruby - # Instantiate with a String - BSON::Decimal128.new("1.28") + # Instantiate with a String + BSON::Decimal128.new("1.28") + + # Instantiate with a BigDecimal + d = BigDecimal(1.28, 3) + BSON::Decimal128.new(d) - # Instantiate with a BigDecimal - d = BigDecimal(1.28, 3) - BSON::Decimal128.new(d) +BSON::Decimal128 and BigDecimal +``````````````````````````````` -BSON::Decimal128 vs BigDecimal -`````````````````````````````` -The ``BigDecimal`` ``from_bson`` and ``to_bson`` methods use the same -``BSON::Decimal128`` methods under the hood. 
This leads to some limitations -that are imposed on the ``BigDecimal`` values that can be serialized to BSON +The ``BigDecimal#from_bson`` and ``BigDecimal#to_bson`` methods use the +equivalent ``BSON::Decimal128`` methods internally. This leads to some limitations +on ``BigDecimal`` values that can be serialized to BSON and those that can be deserialized from existing ``decimal128`` BSON -values. This change was made because serializing ``BigDecimal`` instances as -``BSON::Decimal128`` instances allows for more flexibility in terms of querying -and aggregation in MongoDB. The limitations imposed on ``BigDecimal`` are as -follows: +values. -- ``decimal128`` has a limited range and precision, while ``BigDecimal`` has no - restrictions in terms of range and precision. ``decimal128`` has a max value +Serializing ``BigDecimal`` instances as ``BSON::Decimal128`` instances +allows for more flexibility when querying and performing aggregations in +MongoDB. The following list describes the limitations on ``BigDecimal``: + +- ``Decimal128`` has a limited range and precision, while ``BigDecimal`` has no + restrictions in terms of range and precision. ``Decimal128`` has a max value of approximately ``10^6145`` and a min value of approximately ``-10^6145``, and has a maximum of 34 decimal digits of precision. -- ``decimal128`` is able to accept signed ``NaN`` values, while ``BigDecimal`` +- ``Decimal128`` is able to accept signed ``NaN`` values, while ``BigDecimal`` is not. All signed ``NaN`` values that are deserialized into ``BigDecimal`` instances will be unsigned. -- ``decimal128`` maintains trailing zeroes when serializing to and +- ``Decimal128`` maintains trailing zeroes when serializing to and deserializing from BSON. ``BigDecimal``, however, does not maintain trailing zeroes and therefore using ``BigDecimal`` may result in a lack of precision. .. note:: - In BSON 5.0, ``decimal128`` is deserialized into ``BigDecimal`` by - default. 
In order to have ``decimal128`` values in BSON documents - deserialized into ``BSON::Decimal128``, the ``mode: :bson`` option can be set - on ``from_bson``. - -``Symbol`` -~~~~~~~~~~ - -The BSON specification defines a symbol type which allows round-tripping -Ruby ``Symbol`` values (i.e., a Ruby ``Symbol``is encoded into a BSON symbol -and a BSON symbol is decoded into a Ruby ``Symbol``). However, since most -programming langauges do not have a native symbol type, to promote -interoperabilty, MongoDB deprecated the BSON symbol type and encourages -strings to be used instead. - -.. note:: - - In BSON, hash *keys* are always strings. Non-string values will be - stringified when used as hash keys: - - .. code-block:: ruby - - Hash.from_bson({foo: 'bar'}.to_bson) - # => {"foo"=>"bar"} - - Hash.from_bson({1 => 2}.to_bson) - # => {"1"=>2} - -By default, the BSON library encodes ``Symbol`` hash values as strings and -decodes BSON symbols into Ruby ``Symbol`` values: - -.. code-block:: ruby - - {foo: :bar}.to_bson.to_s - # => "\x12\x00\x00\x00\x02foo\x00\x04\x00\x00\x00bar\x00\x00" - - # 0x02 is the string type - Hash.from_bson(BSON::ByteBuffer.new("\x12\x00\x00\x00\x02foo\x00\x04\x00\x00\x00bar\x00\x00".force_encoding('BINARY'))) - # => {"foo"=>"bar"} - - # 0x0E is the symbol type - Hash.from_bson(BSON::ByteBuffer.new("\x12\x00\x00\x00\x0Efoo\x00\x04\x00\x00\x00bar\x00\x00".force_encoding('BINARY'))) - # => {"foo"=>:bar} - -To force encoding of Ruby symbols to BSON symbols, wrap the Ruby symbols in -``BSON::Symbol::Raw``: - -.. code-block:: ruby - - {foo: BSON::Symbol::Raw.new(:bar)}.to_bson.to_s - # => "\x12\x00\x00\x00\x0Efoo\x00\x04\x00\x00\x00bar\x00\x00" + In BSON library v5.0, ``Decimal128`` is deserialized into ``BigDecimal`` by + default. In order to have ``Decimal128`` values in BSON documents + deserialized into ``BSON::Decimal128``, you can set the ``mode: :bson`` option + when calling ``from_bson``. 
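The trailing-zero limitation described above can be observed with the ``bigdecimal`` library on its own. The following is a minimal sketch that does not call the BSON library at all; the variable names are illustrative and not part of any documented API.

```ruby
require 'bigdecimal'

# Two decimal strings that differ only in their trailing zeroes.
with_zeroes = BigDecimal("1.2300")
without_zeroes = BigDecimal("1.23")

# BigDecimal normalizes the value, so the trailing zeroes are not kept.
plain = with_zeroes.to_s("F")
same_value = (with_zeroes == without_zeroes)

puts plain
puts same_value
```

If the trailing zeroes of a stored ``Decimal128`` value matter to your application, this suggests keeping the value as a ``BSON::Decimal128`` by using the ``mode: :bson`` option that the preceding note describes.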
JSON Serialization ------------------ -Some BSON types have special representations in JSON. These are as follows -and will be automatically serialized in the form when calling ``to_json`` on -them. +Some BSON types have special representations in JSON. The following table +describes the serialization behavior for the specified types when +you call ``to_json`` on them. .. list-table:: :header-rows: 1 - :widths: 40 105 + :widths: 35 65 - * - Object - - JSON + * - {+language+} BSON Object + - JSON Representation * - ``BSON::Binary`` - ``{ "$binary" : "\x01", "$type" : "md5" }`` @@ -648,304 +650,309 @@ them. * - ``Regexp`` - ``{ "$regex" : "[abc]", "$options" : "i" }`` - Time Instances -------------- -Times in Ruby can have nanosecond precision. Times in BSON (and MongoDB) -can only have millisecond precision. When Ruby ``Time`` instances are -serialized to BSON or Extended JSON, the times are floored to the nearest -millisecond. +Times in {+language+} have nanosecond precision. Times in BSON +have millisecond precision. When you serialize {+language+} ``Time`` instances +to BSON or Extended JSON, the times are rounded down to the nearest millisecond. .. note:: - The time as always rounded down. If the time precedes the Unix epoch - (January 1, 1970 00:00:00 UTC), the absolute value of the time would - increase: + Time values are rounded down. If the time precedes the Unix epoch + (January 1, 1970 00:00:00 UTC), the absolute value of the time + increases: - .. code-block:: ruby + .. code-block:: ruby - time = Time.utc(1960, 1, 1, 0, 0, 0, 999_999) - time.to_f - # => -315619199.000001 - time.floor(3).to_f - # => -315619199.001 + time = Time.utc(1960, 1, 1, 0, 0, 0, 999_999) + time.to_f + # => -315619199.000001 + time.floor(3).to_f + # => -315619199.001 -.. note:: +Because of this rounding behavior, we recommend that you perform +all time calculations by using integer math, as inexactness of floating point +calculations might produce unexpected results.
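One way to follow the integer-math recommendation is sketched below. The ``epoch_millis`` helper is hypothetical and is not part of the BSON library; it relies only on the core ``Time#to_i`` and ``Time#nsec`` methods and reuses the pre-epoch time from the preceding note.

```ruby
# Hypothetical helper (not part of the BSON library): converts a Time to
# whole milliseconds since the Unix epoch by using integer arithmetic only.
def epoch_millis(time)
  # Time#to_i floors the seconds and Time#nsec is never negative,
  # so the sum is the floored millisecond value even before 1970.
  time.to_i * 1000 + time.nsec / 1_000_000
end

time = Time.utc(1960, 1, 1, 0, 0, 0, 999_999)
millis = epoch_millis(time)
puts millis
```

Because every intermediate value is an ``Integer``, the result does not depend on floating-point representation, and it agrees with the floored value that ``time.floor(3)`` produces in the note.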
- JRuby as of version 9.2.11.0 `rounds pre-Unix epoch times up rather than - down `_. bson-ruby works around - this and correctly floors the times when serializing on JRuby. +.. note:: -Because of this flooring, applications are strongly recommended to perform -all time calculations using integer math, as inexactness of floating point -calculations may produce unexpected results. + JRuby 9.2.11.0 rounds pre-Unix epoch times up rather than + down. To learn more about this behavior, see the :github:`related + GitHub issue `. The BSON library corrects + this behavior and floors the times when serializing on JRuby. DateTime Instances ------------------ -BSON only supports storing the time as the number of seconds since the -Unix epoch. Ruby's ``DateTime`` instances can be serialized to BSON, +BSON supports storing time values as the number of milliseconds since the +Unix epoch. {+language+} ``DateTime`` instances can be serialized to BSON, but when the BSON is deserialized the times will be returned as ``Time`` instances. -``DateTime`` class in Ruby supports non-Gregorian calendars. When non-Gregorian -``DateTime`` instances are serialized, they are first converted to Gregorian -calendar, and the respective date in the Gregorian calendar is stored in the -database. - +The ``DateTime`` class in Ruby supports non-Gregorian calendars. When +non-Gregorian ``DateTime`` instances are serialized, they are first +converted to the Gregorian calendar, and the respective date in the +Gregorian calendar is stored in the database. Date Instances -------------- -BSON only supports storing the time as the number of seconds since the -Unix epoch. Ruby's ``Date`` instances can be serialized to BSON, but when -the BSON is deserialized the times will be returned as ``Time`` instances. +BSON supports storing time values as the number of milliseconds since the +Unix epoch.
{+language+} ``Date`` instances can be serialized to BSON, +but when the BSON is deserialized the times will be returned as ``Time`` +instances. When ``Date`` instances are serialized, the time value used is midnight -of the day that the ``Date`` refers to in UTC. - +on the ``Date`` in UTC. Regular Expressions ------------------- -Both MongoDB and Ruby provide facilities for working with regular expressions, +Both MongoDB and {+language+} provide support for working with regular expressions, but they use different regular expression engines. The following subsections detail the -differences between Ruby regular expressions and MongoDB regular expressions -and describe how to work with both. +differences between {+language+} regular expressions and MongoDB regular +expressions. -Ruby vs MongoDB Regular Expressions -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +MongoDB Regular Expressions +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +MongoDB uses `Perl-compatible regular expressions implemented by using +the PCRE library `__. `{+language+} regular expressions +`__ are implemented by using the +`Onigmo regular expression engine `__, +which is a fork of the `Oniguruma `__ +library. -MongoDB server uses `Perl-compatible regular expressions implemented using -the PCRE library `_ and `Ruby regular expressions -`_ are implemented using the -`Onigmo regular expression engine `_, -which is a fork of `Oniguruma `_. The two regular expression implementations generally provide equivalent -functionality but have several important syntax differences, as described -below. +functionality but have several important syntax differences, which are described +in the following sections. + +There is no simple way to programmatically convert a PCRE +regular expression into the equivalent {+language+} regular expression, +as there are currently no {+language+} bindings for PCRE.
-Unfortunately, there is no simple way to programmatically convert a PCRE -regular expression into the equivalent Ruby regular expression, -and there are currently no Ruby bindings for PCRE. +Options, Flags, and Modifiers +````````````````````````````` -Options / Flags / Modifiers -``````````````````````````` +Both {+language+} and PCRE regular expressions support modifiers. These are +also called "options" in {+language+} contexts and "flags" in PCRE contexts. +The meaning of ``s`` and ``m`` modifiers differs in {+language+} and +PCRE in the following ways: -Both Ruby and PCRE regular expressions support modifiers. These are -also called "options" in Ruby parlance and "flags" in PCRE parlance. -The meaning of ``s`` and ``m`` modifiers differs in Ruby and PCRE: +- {+language+} does not have the ``s`` modifier. Instead, the {+language+} ``m`` modifier + performs the same function as the PCRE ``s`` modifier, which is to make the + period (``.``) match any character including newlines. The + {+language+} documentation refers to the ``m`` modifier as enabling multi-line mode. -- Ruby does not have the ``s`` modifier, instead the Ruby ``m`` modifier - performs the same function as the PCRE ``s`` modifier which is to make the - period (``.``) match any character including newlines. Confusingly, the - Ruby documentation refers to the ``m`` modifier as "enabling multi-line mode". -- Ruby always operates in the equivalent of PCRE's multi-line mode, enabled by +- {+language+} always operates in the equivalent of PCRE's multi-line mode, enabled by the ``m`` modifier in PCRE regular expressions. In Ruby the ``^`` anchor always refers to the beginning of line and the ``$`` anchor always refers to the end of line. -When writing regular expressions intended to be used in both Ruby and -PCRE environments (including MongoDB server and most other MongoDB drivers), -henceforth referred to as "portable regular expressions", avoid using -the ``^`` and ``$`` anchors. 
The following sections provide workarounds and -recommendations for authoring portable regular expressions. +When writing regular expressions intended to be used in both {+language+} and +PCRE environments, including {+mdb-server+} and most other MongoDB drivers, +avoid using the ``^`` and ``$`` anchors. The following sections provide +workarounds and recommendations for authoring portable regular +expressions that can be used in multiple contexts. -``^`` Anchor -```````````` +^ Anchor +```````` -In Ruby regular expressions, the ``^`` anchor always refers to the beginning +In {+language+} regular expressions, the ``^`` anchor always refers to the beginning of line. In PCRE regular expressions, the ``^`` anchor refers to the beginning -of input by default and the ``m`` flag changes its meaning to the beginning +of input by default, and the ``m`` flag changes its meaning to the beginning of line. -Both Ruby and PCRE regular expressions support the ``\A`` anchor to refer to -the beginning of input, regardless of modifiers. - -When writing portable regular expressions: +Both {+language+} and PCRE regular expressions support the ``\A`` anchor to refer to +the beginning of input, regardless of modifiers. The following +suggestions allow you to write portable regular expressions: - Use the ``\A`` anchor to refer to the beginning of input. -- Use the ``^`` anchor to refer to the beginning of line (this requires - setting the ``m`` flag in PCRE regular expressions). Alternatively use + +- Use the ``^`` anchor to refer to the beginning of line if you set the + ``m`` flag in PCRE regular expressions. 
Alternatively, use one of the following constructs which work regardless of modifiers: - - ``(?:\A|(?<=\n))`` (handles LF and CR+LF line ends) - - ``(?:\A|(?<=[\r\n]))`` (handles CR, LF and CR+LF line ends) -``$`` Anchor -```````````` + - ``(?:\A|(?<=\n))``: handles ``LF`` and ``CR+LF`` line ends + + - ``(?:\A|(?<=[\r\n]))``: handles ``CR``, ``LF`` and ``CR+LF`` line ends -In Ruby regular expressions, the ``$`` anchor always refers to the end +$ Anchor +```````` + +In {+language+} regular expressions, the ``$`` anchor always refers to the end of line. In PCRE regular expressions, the ``$`` anchor refers to the end of input by default and the ``m`` flag changes its meaning to the end of line. -Both Ruby and PCRE regular expressions support the ``\z`` anchor to refer to +Both {+language+} and PCRE regular expressions support the ``\z`` anchor to refer to the end of input, regardless of modifiers. -When writing portable regular expressions: +The following suggestions allow you to write portable regular expressions: - Use the ``\z`` anchor to refer to the end of input. -- Use the ``$`` anchor to refer to the beginning of line (this requires - setting the ``m`` flag in PCRE regular expressions). Alternatively use + +- Use the ``$`` anchor to refer to the end of line if you set the + ``m`` flag in PCRE regular expressions.
Alternatively, use one of the following constructs which work regardless of modifiers: - - ``(?:\z|(?=\n))`` (handles LF and CR+LF line ends) - - ``(?:\z|(?=[\n\n]))`` (handles CR, LF and CR+LF line ends) -``BSON::Regexp::Raw`` Class -~~~~~~~~~~~~~~~~~~~~~~~~~~~ + - ``(?:\z|(?=\n))``: handles ``LF`` and ``CR+LF`` line ends + + - ``(?:\z|(?=[\r\n]))``: handles ``CR``, ``LF`` and ``CR+LF`` line ends + +BSON::Regexp::Raw +~~~~~~~~~~~~~~~~~ Since there is no simple way to programmatically convert a PCRE -regular expression into the equivalent Ruby regular expression, -bson-ruby provides the ``BSON::Regexp::Raw`` class for holding MongoDB/PCRE -regular expressions. Instances of this class are called "BSON regular -expressions" in this documentation. +regular expression into the equivalent {+language+} regular expression, +the BSON library provides the ``BSON::Regexp::Raw`` class for storing PCRE +regular expressions. -Instances of this class can be created using the regular expression text -as a string and optional PCRE modifiers: +You can create instances of ``BSON::Regexp::Raw`` by using the regular +expression text as a string and optional PCRE modifiers: .. code-block:: ruby - BSON::Regexp::Raw.new("^b403158") - # => # + BSON::Regexp::Raw.new("^b403158") + # => # - BSON::Regexp::Raw.new("^Hello.world$", "s") - # => # + BSON::Regexp::Raw.new("^Hello.world$", "s") + # => # -The ``BSON::Regexp`` module is included in the Ruby ``Regexp`` class, such that -the ``BSON::`` prefix may be omitted: +The ``BSON::Regexp`` module is included in the {+language+} ``Regexp`` +class, such that the ``BSON::`` prefix can be omitted: ..
code-block:: ruby - Regexp::Raw.new("^b403158") - # => # + Regexp::Raw.new("^b403158") + # => # - Regexp::Raw.new("^Hello.world$", "s") - # => # + Regexp::Raw.new("^Hello.world$", "s") + # => # Regular Expression Conversion ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -To convert a Ruby regular expression to a BSON regular expression, -instantiate a ``BSON::Regexp::Raw`` object as follows: +The following code converts a {+language+} regular expression to a +``BSON::Regexp::Raw`` instance: .. code-block:: ruby - regexp = /^Hello.world/ - bson_regexp = BSON::Regexp::Raw.new(regexp.source, regexp.options) - # => # + regexp = /^Hello.world/ + bson_regexp = BSON::Regexp::Raw.new(regexp.source, regexp.options) + # => # -Note that the ``BSON::Regexp::Raw`` constructor accepts both the Ruby numeric +The ``BSON::Regexp::Raw`` constructor accepts both the {+language+} numeric options and the PCRE modifier strings. -To convert a BSON regular expression to a Ruby regular expression, call the +To convert a BSON regular expression to a {+language+} regular expression, call the ``compile`` method on the BSON regular expression: .. 
code-block:: ruby - bson_regexp = BSON::Regexp::Raw.new("^hello.world", "s") - bson_regexp.compile - # => /^hello.world/m - - bson_regexp = BSON::Regexp::Raw.new("^hello", "") - bson_regexp.compile - # => /^hello.world/ + bson_regexp = BSON::Regexp::Raw.new("^hello.world", "s") + bson_regexp.compile + # => /^hello.world/m + + bson_regexp = BSON::Regexp::Raw.new("^hello.world", "") + bson_regexp.compile + # => /^hello.world/ + + bson_regexp = BSON::Regexp::Raw.new("^hello.world", "m") + bson_regexp.compile + # => /^hello.world/ - bson_regexp = BSON::Regexp::Raw.new("^hello.world", "m") - bson_regexp.compile - # => /^hello.world/ - -Note that the ``s`` PCRE modifier was converted to the ``m`` Ruby modifier -in the first example, and the last two examples were converted to the same -regular expression even though the original BSON regular expressions had -different meanings. +The ``s`` PCRE modifier was converted to the ``m`` Ruby modifier +in the first example in the preceding code, and the last two examples +were converted to the same regular expression even though the original +BSON regular expressions had different meanings. When a BSON regular expression uses the non-portable ``^`` and ``$`` -anchors, its conversion to a Ruby regular expression can change its meaning: +anchors, its conversion to a {+language+} regular expression can change +its meaning: .. code-block:: ruby - BSON::Regexp::Raw.new("^hello.world", "").compile =~ "42\nhello world" - # => 3 + BSON::Regexp::Raw.new("^hello.world", "").compile =~ "42\nhello world" + # => 3 -When a Ruby regular expression is converted to a BSON regular expression -(for example, to send to the server as part of a query), the BSON regular -expression always has the ``m`` modifier set reflecting the behavior of -``^`` and ``$`` anchors in Ruby regular expressions. 
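The anchor and modifier behavior on the {+language+} side can be checked without the BSON library. The following sketch uses only core ``Regexp`` literals and the same input string as the preceding example; the variable names are illustrative.

```ruby
text = "42\nhello world"

# In Ruby, ^ matches at the start of every line, so this matches on line 2.
line_anchor = /^hello.world/ =~ text

# \A matches only at the start of the input, so this does not match.
input_anchor = /\Ahello.world/ =~ text

# Without the Ruby m modifier, the period does not match a newline.
dot_default = /42.hello/ =~ text

# With the Ruby m modifier (the PCRE s behavior), the period matches a newline.
dot_multiline = /42.hello/m =~ text

p line_anchor, input_anchor, dot_default, dot_multiline
```

A pattern written with ``\A`` keeps the same meaning under PCRE's default settings, which is why the portable form is preferable when a pattern might be evaluated by both engines.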
+When a {+language+} regular expression is converted to a BSON regular +expression, for example as part of a query, the BSON regular +expression always has the ``m`` modifier set, reflecting the behavior of +``^`` and ``$`` anchors in {+language+} regular expressions. -Reading and Writing -~~~~~~~~~~~~~~~~~~~ +Read and Write Regex +~~~~~~~~~~~~~~~~~~~~ -Both Ruby and BSON regular expressions implement the ``to_bson`` method -for serialization to BSON: +Both {+language+} and BSON regular expressions implement the ``to_bson`` +method for serializing to BSON: .. code-block:: ruby - regexp_ruby = /^b403158/ - # => /^b403158/ - regexp_ruby.to_bson - # => # - _.to_s - # => "^b403158\x00m\x00" + regexp_ruby = /^b403158/ + # => /^b403158/ + regexp_ruby.to_bson + # => # + _.to_s + # => "^b403158\x00m\x00" - regexp_raw = Regexp::Raw.new("^b403158") - # => # - regexp_raw.to_bson - # => # - _.to_s - # => "^b403158\x00\x00" + regexp_raw = Regexp::Raw.new("^b403158") + # => # + regexp_raw.to_bson + # => # + _.to_s + # => "^b403158\x00\x00" Both ``Regexp`` and ``BSON::Regexp::Raw`` classes implement the ``from_bson`` class method that deserializes a regular expression from a BSON byte buffer. Methods of both classes return a ``BSON::Regexp::Raw`` instance that -must be converted to a Ruby regular expression using the ``compile`` method -as described above. +must be converted to a {+language+} regular expression by using the +``compile`` method as described in the preceding code. -.. code-block:: ruby +The following code demonstrates how to use the ``from_bson`` method to +deserialize a regular expression: - byte_buffer = BSON::ByteBuffer.new("^b403158\x00\x00") - regex = Regexp.from_bson(byte_buffer) - # => # - regex.pattern - # => "^b403158" - regex.options - # => "" - regex.compile - # => /^b403158/ +.. 
code-block:: ruby + byte_buffer = BSON::ByteBuffer.new("^b403158\x00\x00") + regex = Regexp.from_bson(byte_buffer) + # => # + regex.pattern + # => "^b403158" + regex.options + # => "" + regex.compile + # => /^b403158/ Key Order --------- -BSON documents preserve the order of keys, because the documents are stored -as lists of key-value pairs. Hashes in Ruby also preserve key order; thus -the order of keys specified in Ruby will be respected when serializing a -hash to a BSON document, and when deserializing a BSON document into a hash -the order of keys in the document will match the order of keys in the hash. - +BSON documents preserve the order of keys, because documents are stored +as lists of key-value pairs. Hashes in {+language+} also preserve key order, +so the order of keys specified in your application is preserved when +you serialize a hash to a BSON document, and when you deserialize a BSON +document into a hash. Duplicate Keys -------------- -BSON specification allows BSON documents to have duplicate keys, because the -documents are stored as lists of key-value pairs. Applications should refrain -from generating such documents, because MongoDB server behavior is undefined -when a BSON document contains duplicate keys. +The BSON specification allows BSON documents to have duplicate keys, because +documents are stored as lists of key-value pairs. Avoid creating +documents that contain duplicate keys, because {+mdb-server+} behavior +is undefined when a BSON document contains duplicate keys. -Since in Ruby hashes cannot have duplicate keys, when serializing Ruby hashes -to BSON documents no duplicate keys will be generated. (It is still possible -to hand-craft a BSON document that would have duplicate keys in Ruby, and -some of the other MongoDB BSON libraries may permit creating BSON documents -with duplicate keys.) +In {+language+}, hashes cannot have duplicate keys.
When you serialize +{+language+} hashes to BSON documents, no duplicate keys are generated. -Note that, since keys in BSON documents are always stored as strings, -specifying the same key as as string and a symbol in Ruby only retains the +Because keys in BSON documents are always stored as strings, +specifying the same key as a string and a symbol in Ruby retains only the most recent specification: .. code-block:: ruby - BSON::Document.new(test: 1, 'test' => 2) - => {"test"=>2} + BSON::Document.new(test: 1, 'test' => 2) + # => {"test"=>2} When loading a BSON document with duplicate keys, the last value for a duplicated key overwrites previous values for the same key. diff --git a/source/includes/aggregation.rb b/source/includes/aggregation/aggregation.rb similarity index 100% rename from source/includes/aggregation.rb rename to source/includes/aggregation/aggregation.rb diff --git a/source/includes/aggregation/vector_search.rb b/source/includes/aggregation/vector_search.rb new file mode 100644 index 00000000..d8995f97 --- /dev/null +++ b/source/includes/aggregation/vector_search.rb @@ -0,0 +1,70 @@ +require 'bundler/inline' +gemfile do + source 'https://rubygems.org' + gem 'mongo' +end + +uri = '' + +Mongo::Client.new(uri) do |client| + + database = client.use('sample_mflix') + collection = database[:embedded_movies] + + # start-basic-query + query_vector = [ -0.0016261312, -0.028070757, -0.011342932, -0.012775794, -0.0027440966, 0.008683807, -0.02575152, -0.02020668, -0.010283281, -0.0041719596, 0.021392956, 0.028657231, -0.006634482, 0.007490867, 0.018593878, 0.0038187427, 0.029590257, -0.01451522, 0.016061379, 0.00008528442, -0.008943722, 0.01627464, 0.024311995, -0.025911469, 0.00022596726, -0.008863748, 0.008823762, -0.034921836, 0.007910728, -0.01515501, 0.035801545, -0.0035688248, -0.020299982, -0.03145631, -0.032256044, -0.028763862, -0.0071576433, -0.012769129, 0.012322609, -0.006621153, 0.010583182, 0.024085402, -0.001623632, 0.007864078, -0.021406285,
0.002554159, 0.012229307, -0.011762793, 0.0051682983, 0.0048484034, 0.018087378, 0.024325324, -0.037694257, -0.026537929, -0.008803768, -0.017767483, -0.012642504, -0.0062712682, 0.0009771782, -0.010409906, 0.017754154, -0.004671795, -0.030469967, 0.008477209, -0.005218282, -0.0058480743, -0.020153364, -0.0032805866, 0.004248601, 0.0051449724, 0.006791097, 0.007650814, 0.003458861, -0.0031223053, -0.01932697, -0.033615597, 0.00745088, 0.006321252, -0.0038154104, 0.014555207, 0.027697546, -0.02828402, 0.0066711367, 0.0077107945, 0.01794076, 0.011349596, -0.0052715978, 0.014755142, -0.019753495, -0.011156326, 0.011202978, 0.022126047, 0.00846388, 0.030549942, -0.0041386373, 0.018847128, -0.00033655585, 0.024925126, -0.003555496, -0.019300312, 0.010749794, 0.0075308536, -0.018287312, -0.016567878, -0.012869096, -0.015528221, 0.0078107617, -0.011156326, 0.013522214, -0.020646535, -0.01211601, 0.055928253, 0.011596181, -0.017247654, 0.0005939711, -0.026977783, -0.003942035, -0.009583511, -0.0055248477, -0.028737204, 0.023179034, 0.003995351, 0.0219661, -0.008470545, 0.023392297, 0.010469886, -0.015874773, 0.007890735, -0.009690142, -0.00024970944, 0.012775794, 0.0114762215, 0.013422247, 0.010429899, -0.03686786, -0.006717788, -0.027484283, 0.011556195, -0.036068123, -0.013915418, -0.0016327957, 0.0151016945, -0.020473259, 0.004671795, -0.012555866, 0.0209531, 0.01982014, 0.024485271, 0.0105431955, -0.005178295, 0.033162415, -0.013795458, 0.007150979, 0.010243294, 0.005644808, 0.017260984, -0.0045618312, 0.0024725192, 0.004305249, -0.008197301, 0.0014203656, 0.0018460588, 0.005015015, -0.011142998, 0.01439526, 0.022965772, 0.02552493, 0.007757446, -0.0019726837, 0.009503538, -0.032042783, 0.008403899, -0.04609149, 0.013808787, 0.011749465, 0.036388017, 0.016314628, 0.021939443, -0.0250051, -0.017354285, -0.012962398, 0.00006107364, 0.019113706, 0.03081652, -0.018114036, -0.0084572155, 0.009643491, -0.0034721901, 0.0072642746, -0.0090636825, 0.01642126, 0.013428912, 
0.027724205, 0.0071243206, -0.6858542, -0.031029783, -0.014595194, -0.011449563, 0.017514233, 0.01743426, 0.009950057, 0.0029706885, -0.015714826, -0.001806072, 0.011856096, 0.026444625, -0.0010663156, -0.006474535, 0.0016161345, -0.020313311, 0.0148351155, -0.0018393943, 0.0057347785, 0.018300641, -0.018647194, 0.03345565, -0.008070676, 0.0071443142, 0.014301958, 0.0044818576, 0.003838736, -0.007350913, -0.024525259, -0.001142124, -0.018620536, 0.017247654, 0.007037683, 0.010236629, 0.06046009, 0.0138887605, -0.012122675, 0.037694257, 0.0055081863, 0.042492677, 0.00021784494, -0.011656162, 0.010276617, 0.022325981, 0.005984696, -0.009496873, 0.013382261, -0.0010563189, 0.0026507939, -0.041639622, 0.008637156, 0.026471283, -0.008403899, 0.024858482, -0.00066686375, -0.0016252982, 0.027590916, 0.0051449724, 0.0058647357, -0.008743787, -0.014968405, 0.027724205, -0.011596181, 0.0047650975, -0.015381602, 0.0043718936, 0.002159289, 0.035908177, -0.008243952, -0.030443309, 0.027564257, 0.042625964, -0.0033688906, 0.01843393, 0.019087048, 0.024578573, 0.03268257, -0.015608194, -0.014128681, -0.0033538956, -0.0028757197, -0.004121976, -0.032389335, 0.0034322033, 0.058807302, 0.010943064, -0.030523283, 0.008903735, 0.017500903, 0.00871713, -0.0029406983, 0.013995391, -0.03132302, -0.019660193, -0.00770413, -0.0038853872, 0.0015894766, -0.0015294964, -0.006251275, -0.021099718, -0.010256623, -0.008863748, 0.028550599, 0.02020668, -0.0012962399, -0.003415542, -0.0022509254, 0.0119360695, 0.027590916, -0.046971202, -0.0015194997, -0.022405956, 0.0016677842, -0.00018535563, -0.015421589, -0.031802863, 0.03814744, 0.0065411795, 0.016567878, -0.015621523, 0.022899127, -0.011076353, 0.02841731, -0.002679118, -0.002342562, 0.015341615, 0.01804739, -0.020566562, -0.012989056, -0.002990682, 0.01643459, 0.00042527664, 0.008243952, -0.013715484, -0.004835075, -0.009803439, 0.03129636, -0.021432944, 0.0012087687, -0.015741484, -0.0052016205, 0.00080890034, -0.01755422, 0.004811749, 
-0.017967418, -0.026684547, -0.014128681, 0.0041386373, -0.013742141, -0.010056688, -0.013268964, -0.0110630235, -0.028337335, 0.015981404, -0.00997005, -0.02424535, -0.013968734, -0.028310679, -0.027750863, -0.020699851, 0.02235264, 0.001057985, 0.00081639783, -0.0099367285, 0.013522214, -0.012016043, -0.00086471526, 0.013568865, 0.0019376953, -0.019020405, 0.017460918, -0.023045745, 0.008503866, 0.0064678704, -0.011509543, 0.018727167, -0.003372223, -0.0028690554, -0.0027024434, -0.011902748, -0.012182655, -0.015714826, -0.0098634185, 0.00593138, 0.018753825, 0.0010146659, 0.013029044, 0.0003521757, -0.017620865, 0.04102649, 0.00552818, 0.024485271, -0.009630162, -0.015608194, 0.0006718621, -0.0008418062, 0.012395918, 0.0057980907, 0.016221326, 0.010616505, 0.004838407, -0.012402583, 0.019900113, -0.0034521967, 0.000247002, -0.03153628, 0.0011038032, -0.020819811, 0.016234655, -0.00330058, -0.0032289368, 0.00078973995, -0.021952773, -0.022459272, 0.03118973, 0.03673457, -0.021472929, 0.0072109587, -0.015075036, 0.004855068, -0.0008151483, 0.0069643734, 0.010023367, -0.010276617, -0.023019087, 0.0068244194, -0.0012520878, -0.0015086699, 0.022046074, -0.034148756, -0.0022192693, 0.002427534, -0.0027124402, 0.0060346797, 0.015461575, 0.0137554705, 0.009230294, -0.009583511, 0.032629255, 0.015994733, -0.019167023, -0.009203636, 0.03393549, -0.017274313, -0.012042701, -0.0009930064, 0.026777849, -0.013582194, -0.0027590916, -0.017594207, -0.026804507, -0.0014236979, -0.022032745, 0.0091236625, -0.0042419364, -0.00858384, -0.0033905501, -0.020739838, 0.016821127, 0.022539245, 0.015381602, 0.015141681, 0.028817179, -0.019726837, -0.0051283115, -0.011489551, -0.013208984, -0.0047017853, -0.0072309524, 0.01767418, 0.0025658219, -0.010323267, 0.012609182, -0.028097415, 0.026871152, -0.010276617, 0.021912785, 0.0022542577, 0.005124979, -0.0019710176, 0.004518512, -0.040360045, 0.010969722, -0.0031539614, -0.020366628, -0.025778178, -0.0110030435, -0.016221326, 0.0036587953, 
0.016207997, 0.003007343, -0.0032555948, 0.0044052163, -0.022046074, -0.0008822095, -0.009363583, 0.028230704, -0.024538586, 0.0029840174, 0.0016044717, -0.014181997, 0.031349678, -0.014381931, -0.027750863, 0.02613806, 0.0004136138, -0.005748107, -0.01868718, -0.0010138329, 0.0054348772, 0.010703143, -0.003682121, 0.0030856507, -0.004275259, -0.010403241, 0.021113047, -0.022685863, -0.023032416, 0.031429652, 0.001792743, -0.005644808, -0.011842767, -0.04078657, -0.0026874484, 0.06915057, -0.00056939584, -0.013995391, 0.010703143, -0.013728813, -0.022939114, -0.015261642, -0.022485929, 0.016807798, 0.007964044, 0.0144219175, 0.016821127, 0.0076241563, 0.005461535, -0.013248971, 0.015301628, 0.0085171955, -0.004318578, 0.011136333, -0.0059047225, -0.010249958, -0.018207338, 0.024645219, 0.021752838, 0.0007614159, -0.013648839, 0.01111634, -0.010503208, -0.0038487327, -0.008203966, -0.00397869, 0.0029740208, 0.008530525, 0.005261601, 0.01642126, -0.0038753906, -0.013222313, 0.026537929, 0.024671877, -0.043505676, 0.014195326, 0.024778508, 0.0056914594, -0.025951454, 0.017620865, -0.0021359634, 0.008643821, 0.021299653, 0.0041686273, -0.009017031, 0.04044002, 0.024378639, -0.027777521, -0.014208655, 0.0028623908, 0.042119466, 0.005801423, -0.028124074, -0.03129636, 0.022139376, -0.022179363, -0.04067994, 0.013688826, 0.013328944, 0.0046184794, -0.02828402, -0.0063412455, -0.0046184794, -0.011756129, -0.010383247, -0.0018543894, -0.0018593877, -0.00052024535, 0.004815081, 0.014781799, 0.018007403, 0.01306903, -0.020433271, 0.009043689, 0.033189073, -0.006844413, -0.019766824, -0.018767154, 0.00533491, -0.0024575242, 0.018727167, 0.0058080875, -0.013835444, 0.0040719924, 0.004881726, 0.012029372, 0.005664801, 0.03193615, 0.0058047553, 0.002695779, 0.009290274, 0.02361889, 0.017834127, 0.0049017193, -0.0036388019, 0.010776452, -0.019793482, 0.0067777685, -0.014208655, -0.024911797, 0.002385881, 0.0034988478, 0.020899786, -0.0025858153, -0.011849431, 0.033189073, 
-0.021312982, 0.024965113, -0.014635181, 0.014048708, -0.0035921505, -0.003347231, 0.030869836, -0.0017161017, -0.0061346465, 0.009203636, -0.025165047, 0.0068510775, 0.021499587, 0.013782129, -0.0024475274, -0.0051149824, -0.024445284, 0.006167969, 0.0068844, -0.00076183246, 0.030150073, -0.0055948244, -0.011162991, -0.02057989, -0.009703471, -0.020646535, 0.008004031, 0.0066378145, -0.019900113, -0.012169327, -0.01439526, 0.0044252095, -0.004018677, 0.014621852, -0.025085073, -0.013715484, -0.017980747, 0.0071043274, 0.011456228, -0.01010334, -0.0035321703, -0.03801415, -0.012036037, -0.0028990454, -0.05419549, -0.024058744, -0.024272008, 0.015221654, 0.027964126, 0.03182952, -0.015354944, 0.004855068, 0.011522872, 0.004771762, 0.0027874154, 0.023405626, 0.0004242353, -0.03132302, 0.007057676, 0.008763781, -0.0027057757, 0.023005757, -0.0071176565, -0.005238275, 0.029110415, -0.010989714, 0.013728813, -0.009630162, -0.029137073, -0.0049317093, -0.0008630492, -0.015248313, 0.0043219104, -0.0055681667, -0.013175662, 0.029723546, 0.025098402, 0.012849103, -0.0009996708, 0.03118973, -0.0021709518, 0.0260181, -0.020526575, 0.028097415, -0.016141351, 0.010509873, -0.022965772, 0.002865723, 0.0020493253, 0.0020509914, -0.0041419696, -0.00039695262, 0.017287642, 0.0038987163, 0.014795128, -0.014661839, -0.008950386, 0.004431874, -0.009383577, 0.0012604183, -0.023019087, 0.0029273694, -0.033135757, 0.009176978, -0.011023037, -0.002102641, 0.02663123, -0.03849399, -0.0044152127, 0.0004527676, -0.0026924468, 0.02828402, 0.017727496, 0.035135098, 0.02728435, -0.005348239, -0.001467017, -0.019766824, 0.014715155, 0.011982721, 0.0045651635, 0.023458943, -0.0010046692, -0.0031373003, -0.0006972704, 0.0019043729, -0.018967088, -0.024311995, 0.0011546199, 0.007977373, -0.004755101, -0.010016702, -0.02780418, -0.004688456, 0.013022379, -0.005484861, 0.0017227661, -0.015394931, -0.028763862, -0.026684547, 0.0030589928, -0.018513903, 0.028363993, 0.0044818576, -0.009270281, 
0.038920518, -0.016008062, 0.0093902415, 0.004815081, -0.021059733, 0.01451522, -0.0051583014, 0.023765508, -0.017874114, -0.016821127, -0.012522544, -0.0028390652, 0.0040886537, 0.020259995, -0.031216389, -0.014115352, -0.009176978, 0.010303274, 0.020313311, 0.0064112223, -0.02235264, -0.022872468, 0.0052449396, 0.0005723116, 0.0037321046, 0.016807798, -0.018527232, -0.009303603, 0.0024858483, -0.0012662497, -0.007110992, 0.011976057, -0.007790768, -0.042999174, -0.006727785, -0.011829439, 0.007024354, 0.005278262, -0.017740825, -0.0041519664, 0.0085905045, 0.027750863, -0.038387362, 0.024391968, 0.00087721116, 0.010509873, -0.00038508154, -0.006857742, 0.0183273, -0.0037054466, 0.015461575, 0.0017394272, -0.0017944091, 0.014181997, -0.0052682655, 0.009023695, 0.00719763, -0.013522214, 0.0034422, 0.014941746, -0.0016711164, -0.025298337, -0.017634194, 0.0058714002, -0.005321581, 0.017834127, 0.0110630235, -0.03369557, 0.029190388, -0.008943722, 0.009363583, -0.0034222065, -0.026111402, -0.007037683, -0.006561173, 0.02473852, -0.007084334, -0.010110005, -0.008577175, 0.0030439978, -0.022712521, 0.0054582027, -0.0012620845, -0.0011954397, -0.015741484, 0.0129557345, -0.00042111133, 0.00846388, 0.008930393, 0.016487904, 0.010469886, -0.007917393, -0.011762793, -0.0214596, 0.000917198, 0.021672864, 0.010269952, -0.007737452, -0.010243294, -0.0067244526, -0.015488233, -0.021552904, 0.017127695, 0.011109675, 0.038067464, 0.00871713, -0.0025591573, 0.021312982, -0.006237946, 0.034628596, -0.0045251767, 0.008357248, 0.020686522, 0.0010696478, 0.0076708077, 0.03772091, -0.018700508, -0.0020676525, -0.008923728, -0.023298996, 0.018233996, -0.010256623, 0.0017860786, 0.009796774, -0.00897038, -0.01269582, -0.018527232, 0.009190307, -0.02372552, -0.042119466, 0.008097334, -0.0066778013, -0.021046404, 0.0019593548, 0.011083017, -0.0016028056, 0.012662497, -0.000059095124, 0.0071043274, -0.014675168, 0.024831824, -0.053582355, 0.038387362, 0.0005698124, 0.015954746, 
0.021552904, 0.031589597, -0.009230294, -0.0006147976, 0.002625802, -0.011749465, -0.034362018, -0.0067844326, -0.018793812, 0.011442899, -0.008743787, 0.017474247, -0.021619547, 0.01831397, -0.009037024, -0.0057247817, -0.02728435, 0.010363255, 0.034415334, -0.024032086, -0.0020126705, -0.0045518344, -0.019353628, -0.018340627, -0.03129636, -0.0034038792, -0.006321252, -0.0016161345, 0.033642255, -0.000056075285, -0.005005019, 0.004571828, -0.0024075406, -0.00010215386, 0.0098634185, 0.1980148, -0.003825407, -0.025191706, 0.035161756, 0.005358236, 0.025111731, 0.023485601, 0.0023342315, -0.011882754, 0.018287312, -0.0068910643, 0.003912045, 0.009243623, -0.001355387, -0.028603915, -0.012802451, -0.030150073, -0.014795128, -0.028630573, -0.0013487226, 0.002667455, 0.00985009, -0.0033972147, -0.021486258, 0.009503538, -0.017847456, 0.013062365, -0.014341944, 0.005078328, 0.025165047, -0.015594865, -0.025924796, -0.0018177348, 0.010996379, -0.02993681, 0.007324255, 0.014475234, -0.028577257, 0.005494857, 0.00011725306, -0.013315615, 0.015941417, 0.009376912, 0.0025158382, 0.008743787, 0.023832154, -0.008084005, -0.014195326, -0.008823762, 0.0033455652, -0.032362677, -0.021552904, -0.0056081535, 0.023298996, -0.025444955, 0.0097301295, 0.009736794, 0.015274971, -0.0012937407, -0.018087378, -0.0039387033, 0.008637156, -0.011189649, -0.00023846315, -0.011582852, 0.0066411467, -0.018220667, 0.0060846633, 0.0376676, -0.002709108, 0.0072776037, 0.0034188742, -0.010249958, -0.0007747449, -0.00795738, -0.022192692, 0.03910712, 0.032122757, 0.023898797, 0.0076241563, -0.007397564, -0.003655463, 0.011442899, -0.014115352, -0.00505167, -0.031163072, 0.030336678, -0.006857742, -0.022259338, 0.004048667, 0.02072651, 0.0030156737, -0.0042119464, 0.00041861215, -0.005731446, 0.011103011, 0.013822115, 0.021512916, 0.009216965, -0.006537847, -0.027057758, -0.04054665, 0.010403241, -0.0056281467, -0.005701456, -0.002709108, -0.00745088, -0.0024841821, 0.009356919, -0.022659205, 
0.004061996, -0.013175662, 0.017074378, -0.006141311, -0.014541878, 0.02993681, -0.00028448965, -0.025271678, 0.011689484, -0.014528549, 0.004398552, -0.017274313, 0.0045751603, 0.012455898, 0.004121976, -0.025458284, -0.006744446, 0.011822774, -0.015035049, -0.03257594, 0.014675168, -0.0039187097, 0.019726837, -0.0047251107, 0.0022825818, 0.011829439, 0.005391558, -0.016781142, -0.0058747325, 0.010309938, -0.013049036, 0.01186276, -0.0011246296, 0.0062112883, 0.0028190718, -0.021739509, 0.009883412, -0.0073175905, -0.012715813, -0.017181009, -0.016607866, -0.042492677, -0.0014478565, -0.01794076, 0.012302616, -0.015194997, -0.04433207, -0.020606548, 0.009696807, 0.010303274, -0.01694109, -0.004018677, 0.019353628, -0.001991011, 0.000058938927, 0.010536531, -0.17274313, 0.010143327, 0.014235313, -0.024152048, 0.025684876, -0.0012504216, 0.036601283, -0.003698782, 0.0007310093, 0.004165295, -0.0029157067, 0.017101036, -0.046891227, -0.017460918, 0.022965772, 0.020233337, -0.024072073, 0.017220996, 0.009370248, 0.0010363255, 0.0194336, -0.019606877, 0.01818068, -0.020819811, 0.007410893, 0.0019326969, 0.017887443, 0.006651143, 0.00067394477, -0.011889419, -0.025058415, -0.008543854, 0.021579562, 0.0047484366, 0.014062037, 0.0075508473, -0.009510202, -0.009143656, 0.0046817916, 0.013982063, -0.0027990784, 0.011782787, 0.014541878, -0.015701497, -0.029350337, 0.021979429, 0.01332228, -0.026244693, -0.0123492675, -0.003895384, 0.0071576433, -0.035454992, -0.00046984528, 0.0033522295, 0.039347045, 0.0005119148, 0.00476843, -0.012995721, 0.0024042083, -0.006931051, -0.014461905, -0.0127558, 0.0034555288, -0.0074842023, -0.030256703, -0.007057676, -0.00807734, 0.007804097, -0.006957709, 0.017181009, -0.034575284, -0.008603834, -0.005008351, -0.015834786, 0.02943031, 0.016861115, -0.0050849924, 0.014235313, 0.0051449724, 0.0025924798, -0.0025741523, 0.04289254, -0.002104307, 0.012969063, -0.008310596, 0.00423194, 0.0074975314, 0.0018810473, -0.014248641, -0.024725191, 
0.0151016945, -0.017527562, 0.0018727167, 0.0002830318, 0.015168339, 0.0144219175, -0.004048667, -0.004358565, 0.011836103, -0.010343261, -0.005911387, 0.0022825818, 0.0073175905, 0.00403867, 0.013188991, 0.03334902, 0.006111321, 0.008597169, 0.030123414, -0.015474904, 0.0017877447, -0.024551915, 0.013155668, 0.023525586, -0.0255116, 0.017220996, 0.004358565, -0.00934359, 0.0099967085, 0.011162991, 0.03092315, -0.021046404, -0.015514892, 0.0011946067, -0.01816735, 0.010876419, -0.10124666, -0.03550831, 0.0056348112, 0.013942076, 0.005951374, 0.020419942, -0.006857742, -0.020873128, -0.021259667, 0.0137554705, 0.0057880944, -0.029163731, -0.018767154, -0.021392956, 0.030896494, -0.005494857, -0.0027307675, -0.006801094, -0.014821786, 0.021392956, -0.0018110704, -0.0018843795, -0.012362596, -0.0072176233, -0.017194338, -0.018713837, -0.024272008, 0.03801415, 0.00015880188, 0.0044951867, -0.028630573, -0.0014070367, -0.00916365, -0.026537929, -0.009576847, -0.013995391, -0.0077107945, 0.0050016865, 0.00578143, -0.04467862, 0.008363913, 0.010136662, -0.0006268769, -0.006591163, 0.015341615, -0.027377652, -0.00093136, 0.029243704, -0.020886457, -0.01041657, -0.02424535, 0.005291591, -0.02980352, -0.009190307, 0.019460259, -0.0041286405, 0.004801752, 0.0011787785, -0.001257086, -0.011216307, -0.013395589, 0.00088137644, -0.0051616337, 0.03876057, -0.0033455652, 0.00075850025, -0.006951045, -0.0062112883, 0.018140694, -0.006351242, -0.008263946, 0.018154023, -0.012189319, 0.0075508473, -0.044358727, -0.0040153447, 0.0093302615, -0.010636497, 0.032789204, -0.005264933, -0.014235313, -0.018393943, 0.007297597, -0.016114693, 0.015021721, 0.020033404, 0.0137688, 0.0011046362, 0.010616505, -0.0039453674, 0.012109346, 0.021099718, -0.0072842683, -0.019153694, -0.003768759, 0.039320387, -0.006747778, -0.0016852784, 0.018154023, 0.0010963057, -0.015035049, -0.021033075, -0.04345236, 0.017287642, 0.016341286, -0.008610498, 0.00236922, 0.009290274, 0.028950468, -0.014475234, 
-0.0035654926, 0.015434918, -0.03372223, 0.004501851, -0.012929076, -0.008483873, -0.0044685286, -0.0102233, 0.01615468, 0.0022792495, 0.010876419, -0.0059647025, 0.01895376, -0.0069976957, -0.0042952523, 0.017207667, -0.00036133936, 0.0085905045, 0.008084005, 0.03129636, -0.016994404, -0.014915089, 0.020100048, -0.012009379, -0.006684466, 0.01306903, 0.00015765642, -0.00530492, 0.0005277429, 0.015421589, 0.015528221, 0.032202728, -0.003485519, -0.0014286962, 0.033908837, 0.001367883, 0.010509873, 0.025271678, -0.020993087, 0.019846799, 0.006897729, -0.010216636, -0.00725761, 0.01818068, -0.028443968, -0.011242964, -0.014435247, -0.013688826, 0.006101324, -0.0022509254, 0.013848773, -0.0019077052, 0.017181009, 0.03422873, 0.005324913, -0.0035188415, 0.014128681, -0.004898387, 0.005038341, 0.0012320944, -0.005561502, -0.017847456, 0.0008538855, -0.0047884234, 0.011849431, 0.015421589, -0.013942076, 0.0029790192, -0.013702155, 0.0001199605, -0.024431955, 0.019926772, 0.022179363, -0.016487904, -0.03964028, 0.0050849924, 0.017487574, 0.022792496, 0.0012504216, 0.004048667, -0.00997005, 0.0076041627, -0.014328616, -0.020259995, 0.0005598157, -0.010469886, 0.0016852784, 0.01716768, -0.008990373, -0.001987679, 0.026417969, 0.023792166, 0.0046917885, -0.0071909656, -0.00032051947, -0.023259008, -0.009170313, 0.02071318, -0.03156294, -0.030869836, -0.006324584, 0.013795458, -0.00047151142, 0.016874444, 0.00947688, 0.00985009, -0.029883493, 0.024205362, -0.013522214, -0.015075036, -0.030603256, 0.029270362, 0.010503208, 0.021539574, 0.01743426, -0.023898797, 0.022019416, -0.0068777353, 0.027857494, -0.021259667, 0.0025758184, 0.006197959, 0.006447877, -0.00025200035, -0.004941706, -0.021246338, -0.005504854, -0.008390571, -0.0097301295, 0.027244363, -0.04446536, 0.05216949, 0.010243294, -0.016008062, 0.0122493, -0.0199401, 0.009077012, 0.019753495, 0.006431216, -0.037960835, -0.027377652, 0.016381273, -0.0038620618, 0.022512587, -0.010996379, -0.0015211658, -0.0102233, 
0.007071005, 0.008230623, -0.009490209, -0.010083347, 0.024431955, 0.002427534, 0.02828402, 0.0035721571, -0.022192692, -0.011882754, 0.010056688, 0.0011904413, -0.01426197, -0.017500903, -0.00010985966, 0.005591492, -0.0077707744, -0.012049366, 0.011869425, 0.00858384, -0.024698535, -0.030283362, 0.020140035, 0.011949399, -0.013968734, 0.042732596, -0.011649498, -0.011982721, -0.016967745, -0.0060913274, -0.007130985, -0.013109017, -0.009710136 ] + + pipeline = [ + { + '$vectorSearch' => { + 'queryVector' => query_vector, + 'index' => 'vs_idx', + 'path' => 'plot_embedding', + 'numCandidates' => 150, + 'limit' => 5 + } + }, + { + '$project' => { + '_id' => 0, + 'plot' => 1, + 'title' => 1, + } + } + ] + + results = collection.aggregate(pipeline) + + results.each do |doc| + puts doc + end + # end-basic-query + + # start-score-query + score_pipeline = [ + { + '$vectorSearch' => { + 'queryVector' => query_vector, + 'index' => 'vs_idx', + 'path' => 'plot_embedding', + 'numCandidates' => 150, + 'limit' => 5 + } + }, + { + '$project' => { + '_id' => 0, + 'title' => 1, + 'score' => { '$meta' => 'vectorSearchScore' } + } + } + ] + + results = collection.aggregate(score_pipeline) + + results.each do |document| + puts document + end + # end-score-query + +end diff --git a/source/index.txt b/source/index.txt index 72662bd0..1c0498c8 100644 --- a/source/index.txt +++ b/source/index.txt @@ -24,6 +24,7 @@ Indexes Monitor Your Application Data Aggregation + {+avs+} Security Data Formats View the Source @@ -92,6 +93,13 @@ Transform Your Data with Aggregation Learn how to use the {+driver-short+} to perform aggregation operations in the :ref:`ruby-aggregation` section. +{+avs+} +------------------- + +Learn how to perform similarity searches on vector embeddings by using the +{+avs+} feature. To learn more, see the +:ref:`ruby-atlas-vector-search` guide. 
+ Secure Your Data ---------------- diff --git a/source/vector-search.txt b/source/vector-search.txt new file mode 100644 index 00000000..208f8df5 --- /dev/null +++ b/source/vector-search.txt @@ -0,0 +1,212 @@ +.. _ruby-atlas-vector-search: + +================================ +Run an {+avs+} Query +================================ + +.. facet:: + :name: genre + :values: reference + +.. meta:: + :keywords: code example, semantic, text, embeddings + :description: Learn how to use the Ruby driver to perform Atlas Vector Search queries. + +.. contents:: On this page + :local: + :backlinks: none + :depth: 2 + :class: singlecol + +Overview +-------- + +In this guide, you can learn how to perform searches on your documents +by using the {+avs+} feature. The {+driver-short+} allows you to +perform {+avs+} queries by using the aggregation framework. +To learn more about performing aggregations, see the +:ref:`ruby-aggregation` guide. + +.. note:: Deployment Compatibility + + You can use the {+avs+} feature only when + you connect to MongoDB Atlas clusters. This feature is not + available for self-managed deployments. + +To learn more about {+avs+}, see the :atlas:`{+avs+} Overview +` in the Atlas +documentation. + +.. note:: Atlas Search + + To perform advanced full-text search on your documents, you can use the + Atlas Search feature. To learn about this feature, see the + :atlas:`Atlas Search Overview `. + +{+avs+} Index +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Before you can perform {+avs+} queries, you must create an +{+avs+} index on your collection. To learn more about +creating this index, see the :ref:`ruby-atlas-search-index` guide. + +Vector Search Aggregation Stage +------------------------------- + +To create a ``$vectorSearch`` stage in your aggregation pipeline, perform the +following actions: + +1. Create an array to store the pipeline stages. + +#. Specify the ``$vectorSearch`` operator and provide details about the + vector search query. 
+ +Then, include the stage in an aggregation pipeline and pass the pipeline +to the ``aggregate`` method. + +You must define the following fields in your ``$vectorSearch`` stage: + +.. list-table:: + :header-rows: 1 + + * - Parameter + - Type + - Description + + * - ``index`` + - string + - Name of the vector search index + + * - ``path`` + - string + - Field that contains vector embeddings + + * - ``queryVector`` + - array of numbers + - Vector representation of your query + + * - ``limit`` + - number + - Number of results to return + +{+avs+} Query Examples +---------------------------------- + +In this section, you can learn how to perform Atlas Vector +Search queries. The examples in this section use sample data from the +``sample_mflix.embedded_movies`` collection. To learn how to load this +sample data, see the :atlas:`Load Data into Atlas ` +tutorial in the Atlas documentation. + +Basic Vector Search Query +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following code performs an {+avs+} query on the +``plot_embedding`` vector field by using a query vector that is a vector +embedding of the phrase "time travel": + +.. io-code-block:: + :copyable: true + + .. input:: /includes/aggregation/vector_search.rb + :language: ruby + :dedent: + :start-after: start-basic-query + :end-before: end-basic-query + + .. output:: + :language: none + :visible: false + + {"plot"=>"A reporter, learning of time travelers visiting 20th century disasters, tries to change the history they know by averting upcoming disasters.", "title"=>"Thrill Seekers"} + {"plot"=>"At the age of 21, Tim discovers he can travel in time and change what happens and has happened in his own life. 
His decision to make his world a better place by getting a girlfriend turns out not to be as easy as you might think.", "title"=>"About Time"} + {"plot"=>"Hoping to alter the events of the past, a 19th century inventor instead travels 800,000 years into the future, where he finds humankind divided into two warring races.", "title"=>"The Time Machine"} + {"plot"=>"An officer for a security agency that regulates time travel, must fend for his life against a shady politician who has a tie to his past.", "title"=>"Timecop"} + {"plot"=>"After using his mother's newly built time machine, Dolf gets stuck involuntary in the year 1212. He ends up in a children's crusade where he confronts his new friends with modern techniques...", "title"=>"Crusade in Jeans"} + +Vector Search Score +~~~~~~~~~~~~~~~~~~~ + +The following code performs the same query as in the preceding example, +but outputs only the ``title`` field and ``vectorSearchScore`` meta +field, which describes how well the document matches the query vector: + +.. io-code-block:: + :copyable: true + + .. input:: /includes/aggregation/vector_search.rb + :language: ruby + :dedent: + :start-after: start-score-query + :end-before: end-score-query + :emphasize-lines: 15 + + .. output:: + :language: none + :visible: false + + {"title"=>"Thrill Seekers", "score"=>0.9253387451171875} + {"title"=>"About Time", "score"=>0.9246978759765625} + {"title"=>"The Time Machine", "score"=>0.9229583740234375} + {"title"=>"Timecop", "score"=>0.9228057861328125} + {"title"=>"Crusade in Jeans", "score"=>0.9222259521484375} + +Vector Search Options +--------------------- + +You can use a ``$vectorSearch`` stage to perform many types of Atlas +Vector Search queries. Depending on your desired query, you can specify the +following options in the stage definition: + +.. 
list-table:: + :widths: 20 20 40 20 + :header-rows: 1 + + * - Optional Parameter + - Type + - Description + - Default Value + + * - ``exact`` + - boolean + - Specifies whether to run an Exact Nearest Neighbor (``true``) or + Approximate Nearest Neighbor (``false``) search + - ``false`` + + * - ``filter`` + - document + - Specifies a pre-filter for documents to search on + - No filtering + + * - ``numCandidates`` + - number + - Specifies the number of nearest neighbors to use during the + search + - None. Required if ``exact`` is ``false`` or omitted + +To learn more about these options, see the :atlas:`Fields +` section of the +``$vectorSearch`` operator reference in the Atlas documentation. + +.. _ruby-avs-addtl-info: + +Additional Information +---------------------- + +To learn more about the concepts mentioned in this guide, see the +following Atlas and Server manual entries: + +- :atlas:`Run Vector Search Queries ` +- :manual:`Aggregation Pipeline ` +- :manual:`Aggregation Stages ` + +To learn more about the behavior of the ``aggregate`` method, see the +:ref:`ruby-aggregation` guide. + +API Documentation +~~~~~~~~~~~~~~~~~ + +To learn more about the methods and types mentioned in this +guide, see the following API documentation: + +- :ruby-api:`aggregate `
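The optional ``exact``, ``filter``, and ``numCandidates`` fields described in the Vector Search Options table can be combined with the required fields in a single stage definition. The following is a minimal sketch rather than code from the guide: ``vector_search_stage`` is a hypothetical helper and not part of the Ruby driver, the three-element query vector is a placeholder, and the ``year`` filter assumes that field is indexed as a filter field in the ``vs_idx`` index. The example only builds and prints the pipeline, so it runs without a database connection.

```ruby
# Hypothetical helper (not a driver API): builds a $vectorSearch stage from the
# required fields and adds each optional field only when the caller supplies it.
def vector_search_stage(query_vector:, index:, path:, limit:,
                        exact: nil, filter: nil, num_candidates: nil)
  stage = {
    'index' => index,
    'path' => path,
    'queryVector' => query_vector,
    'limit' => limit
  }
  stage['exact'] = exact unless exact.nil?
  stage['filter'] = filter unless filter.nil?
  stage['numCandidates'] = num_candidates unless num_candidates.nil?
  { '$vectorSearch' => stage }
end

# Placeholder vector for illustration only. A real query vector must have the
# same number of dimensions as the embeddings stored in the indexed field.
query_vector = [0.01, -0.02, 0.03]

pipeline = [
  vector_search_stage(
    query_vector: query_vector,
    index: 'vs_idx',
    path: 'plot_embedding',
    limit: 5,
    num_candidates: 150,
    # Assumption: 'year' is defined as a filter field in the vector search index.
    filter: { 'year' => { '$gt' => 2000 } }
  ),
  {
    '$project' => {
      '_id' => 0,
      'title' => 1,
      'score' => { '$meta' => 'vectorSearchScore' }
    }
  }
]

# Print the first stage to inspect the generated definition.
puts pipeline.first
```

With a connected ``Mongo::Collection`` instance, you would pass this array to ``collection.aggregate(pipeline)`` in the same way as the preceding examples. Omitting an optional keyword leaves the corresponding field out of the stage so that the server-side default applies.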