Commit 6a62f0a

RUBY-1977 clarify more edge cases and update tutorial (#144)
1 parent 8285332 commit 6a62f0a

1 file changed

+79
-11
lines changed

docs/tutorials/bson-v4.txt

Lines changed: 79 additions & 11 deletions
@@ -83,21 +83,89 @@ instantiate ``BSON::ByteBuffer`` with no arguments:
83 83

84 84
.. code-block:: ruby
85 85

86-
buffer = BSON::ByteBuffer.new # a write mode buffer.
86+
buffer = BSON::ByteBuffer.new
87+
88+
To write raw bytes to the byte buffer with no transformations, use the
89+
``put_byte`` and ``put_bytes`` methods. They take a byte string as the argument
90+
and copy this string into the buffer. ``put_byte`` enforces that the argument
91+
is a string of length 1; ``put_bytes`` accepts strings of any length.
92+
The strings can contain null bytes.
93+
94+
.. code-block:: ruby
95+
96+
buffer.put_byte("\x00")
97+
98+
buffer.put_bytes("\xff\xfe\x00\xfd")
99+
100+
.. note::
101+
102+
``put_byte`` and ``put_bytes`` do not write a BSON type byte prior to
103+
writing the argument to the byte buffer.
104+
105+
The remaining write methods write values of particular types defined in the
106+
`BSON spec <http://bsonspec.org/spec.html>`_. Note that the type indicated
107+
by the method name takes precedence over the type of the argument;
108+
for example, if a floating-point value is given to ``put_int32``, it is
109+
coerced into an integer and the resulting integer is written to the byte
110+
buffer.
111+
112+
To write a UTF-8 string (BSON type 0x02) to the byte buffer, use ``put_string``:
113+
114+
.. code-block:: ruby
115+
116+
buffer.put_string("hello, world")
117+
118+
Note that BSON strings are always encoded in UTF-8. Therefore, the
119+
argument must be either in UTF-8 or in an encoding convertible to UTF-8
120+
(i.e., not binary). If the argument is in an encoding other than UTF-8,
121+
the string is first converted to UTF-8 and the UTF-8 encoded version is
122+
written to the buffer. The string must be valid in its claimed encoding,
123+
including being valid UTF-8 if the encoding is UTF-8.
124+
The string may contain null bytes.
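The encoding rules above can be sketched in plain Ruby, without the bson gem. This is only an illustration under stated assumptions: ``bson_string_bytes`` is a hypothetical helper, and the byte layout (little-endian int32 length, UTF-8 bytes, trailing null) is taken from the BSON specification rather than from the gem's source, whose implementation and error classes may differ.

```ruby
# Hypothetical helper for illustration only; it is not part of the bson gem.
# Builds the bytes that the BSON specification defines for a string value:
# a little-endian int32 byte count (including the trailing null), the UTF-8
# bytes, and a null terminator.
def bson_string_bytes(str)
  # Transcode to UTF-8; this raises for binary strings with non-ASCII bytes.
  utf8 = str.encode(Encoding::UTF_8)
  # encode does not validate strings already tagged UTF-8, so check explicitly.
  raise ArgumentError, "string is not valid UTF-8" unless utf8.valid_encoding?

  payload = utf8.b
  [payload.bytesize + 1].pack("l<") + payload + "\x00".b
end

# An ISO-8859-1 string is converted to UTF-8 before being serialized.
latin1 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
bson_string_bytes(latin1)   # length prefix, "caf\xC3\xA9", then "\x00"
```

Embedded null bytes are allowed in this format because the reader relies on the length prefix, not on the terminator, to find the end of the string.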
125+
126+
The BSON specification also defines a CString type, which is used, for
127+
example, for document keys. To write CStrings to the buffer, use ``put_cstring``:
128+
129+
.. code-block:: ruby
130+
131+
buffer.put_cstring("hello, world")
132+
133+
As with regular strings, CStrings in BSON must be UTF-8 encoded. If the
134+
argument is not in UTF-8, it is converted to UTF-8 and the resulting string
135+
is written to the buffer. Unlike ``put_string``, the UTF-8 encoding of
136+
the argument given to ``put_cstring`` cannot have any null bytes, since the
137+
CString serialization format in BSON is null-terminated.
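To see why null bytes cannot appear in a CString, here is a minimal plain-Ruby sketch of the format, assuming the layout in the BSON specification (UTF-8 bytes followed by a single null, with no length prefix). ``bson_cstring_bytes`` is a hypothetical helper, not a bson gem method, and the gem may raise a different error class.

```ruby
# Hypothetical helper for illustration only; it is not part of the bson gem.
# A BSON CString is the UTF-8 bytes followed by one null terminator.
def bson_cstring_bytes(str)
  utf8 = str.encode(Encoding::UTF_8)
  raise ArgumentError, "string is not valid UTF-8" unless utf8.valid_encoding?
  # Without a length prefix, an embedded null would end the string early.
  raise ArgumentError, "CStrings cannot contain null bytes" if utf8.include?("\x00")

  utf8.b + "\x00".b
end

bson_cstring_bytes("hello, world").bytesize   # => 13 (12 bytes plus the null)
```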
138+
139+
Unlike ``put_string``, ``put_cstring`` also accepts symbols and integers.
140+
In all cases the argument is stringified prior to being written:
141+
142+
.. code-block:: ruby
143+
144+
buffer.put_cstring(:hello)
145+
buffer.put_cstring(42)
146+
147+
To write a 32-bit or a 64-bit integer to the byte buffer, use the
148+
``put_int32`` and ``put_int64`` methods, respectively. Note that Ruby
149+
integers can be arbitrarily large; if the value being written exceeds the
150+
range of a 32-bit or a 64-bit integer, ``put_int32`` and ``put_int64``
151+
raise ``RangeError``.
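The range check can be illustrated in plain Ruby. The sketch below assumes the signed, little-endian integer layouts given in the BSON specification; ``int32_bytes`` and ``int64_bytes`` are hypothetical helpers and do not show how the bson gem implements the check.

```ruby
# Hypothetical helpers for illustration only; they are not part of the bson gem.
def int32_bytes(value)
  unless (-(2**31)..(2**31 - 1)).cover?(value)
    raise RangeError, "#{value} does not fit in a signed 32-bit integer"
  end
  [value].pack("l<")   # 4 bytes, little-endian
end

def int64_bytes(value)
  unless (-(2**63)..(2**63 - 1)).cover?(value)
    raise RangeError, "#{value} does not fit in a signed 64-bit integer"
  end
  [value].pack("q<")   # 8 bytes, little-endian
end

int32_bytes(12345).bytesize             # => 4
int64_bytes(123456789012345).bytesize   # => 8
```

Calling ``int32_bytes(2**31)`` in this sketch raises ``RangeError``, since the largest signed 32-bit value is ``2**31 - 1``.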
152+
153+
.. code-block:: ruby
154+
155+
buffer.put_int32(12345)
156+
buffer.put_int64(123456789012345)
157+
158+
.. note::
159+
160+
If ``put_int32`` or ``put_int64`` is given a floating-point argument,
160+
the argument is first coerced into an integer and the integer is
161+
written to the byte buffer.
87 163

88-
Writing to the buffer is done via the following API:
164+
To write a 64-bit floating-point value to the byte buffer, use ``put_double``:
89 165

90 166
.. code-block:: ruby
91 167

92-
buffer.put_byte(value) # Appends a single byte.
93-
buffer.put_double(value) # Appends a 64-bit floating point.
94-
buffer.put_int32(value) # Appends a 32-bit integer (4 bytes).
95-
buffer.put_int64(value) # Appends a 64-bit integer (8 bytes).
96-
buffer.put_string(value) # Appends a UTF-8 string.
97-
98-
# Converts value to string, which must not contain any null bytes, and
99-
# writes the string to the buffer.
100-
buffer.put_cstring(value)
168+
buffer.put_double(3.14159)
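For reference, the BSON specification defines a double as 8 bytes in IEEE 754 binary64 format, little-endian. The plain-Ruby sketch below shows that layout with ``Array#pack``; it does not use the bson gem and is not a statement about how the gem encodes the value internally.

```ruby
# Illustration only: the 8-byte little-endian layout of a 64-bit double.
encoded = [3.14159].pack("E")   # "E" is double-precision, little-endian
encoded.bytesize                # => 8
encoded.unpack1("E")            # => 3.14159
```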
101 169

102 170
To obtain the serialized data as a byte string (for example, to send the data
103 171
over a socket), call ``to_s`` on the buffer:
