Commit 4faee16

Documentation updates for working with DataBuffers
Issue: SPR-17409
1 parent 8223ed3 commit 4faee16

3 files changed

+173
-146
lines changed

Lines changed: 137 additions & 144 deletions
Original file line numberDiff line numberDiff line change
@@ -1,176 +1,169 @@
11
[[databuffers]]
22
= Data Buffers and Codecs
33

4-
The `DataBuffer` interface defines an abstraction over byte buffers.
5-
The main reason for introducing it (and not using the standard `java.nio.ByteBuffer` instead) is Netty.
6-
Netty does not use `ByteBuffer` but instead offers `ByteBuf` as an alternative.
7-
Spring's `DataBuffer` is a simple abstraction over `ByteBuf` that can also be used on non-Netty
8-
platforms (that is, Servlet 3.1+).
4+
Java NIO provides `ByteBuffer` but many libraries build their own byte buffer API on top,
5+
especially for network operations where reusing buffers and/or using direct buffers is
6+
beneficial for performance. For example Netty has the `ByteBuf` hierarchy, Undertow uses
7+
XNIO, Jetty uses pooled byte buffers with a callback to be released, and so on.
8+
The `spring-core` module provides a set of abstractions to work with various byte buffer
9+
APIs as follows:
910

11+
* <<databuffers-factory>> abstracts the creation of a data buffer.
12+
* <<databuffers-buffer>> represents a byte buffer, which may be
13+
<<databuffers-buffer-pooled,pooled>>.
14+
* <<databuffers-utils>> offers utility methods for data buffers.
15+
* <<codecs>> decode data buffer streams into higher level objects, or encode such objects into data buffer streams.
1016

1117

1218

19+
20+
[[databuffers-factory]]
1321
== `DataBufferFactory`
1422

15-
The `DataBufferFactory` offers functionality to allocate new data buffers as well as to wrap
16-
existing data.
17-
The `allocateBuffer` methods allocate a new data buffer with a default or given capacity.
18-
Though `DataBuffer` implementations grow and shrink on demand, it is more efficient to give the
19-
capacity upfront, if known.
20-
The `wrap` methods decorate an existing `ByteBuffer` or byte array.
21-
Wrapping does not involve allocation. It decorates the given data with a `DataBuffer`
22-
implementation.
23+
`DataBufferFactory` is used to create data buffers in one of two ways:
2324

24-
There are two implementation of `DataBufferFactory`: the `NettyDataBufferFactory`
25-
(for Netty platforms, such as Reactor Netty) and `DefaultDataBufferFactory`
26-
(for other platforms, such as Servlet 3.1+ servers).
25+
. Allocate a new data buffer, optionally specifying capacity upfront, if known, which is
26+
more efficient even though implementations of `DataBuffer` can grow and shrink on demand.
27+
. Wrap an existing `byte[]` or `java.nio.ByteBuffer`, which decorates the given data with
28+
a `DataBuffer` implementation and does not involve allocation.
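
To illustrate what "wrapping without allocation" means, here is a minimal sketch that uses only the JDK's `java.nio.ByteBuffer` rather than a Spring `DataBufferFactory`. The class and method names (`WrapDemo`, `firstByteAfterArrayWrite`) are hypothetical and exist only for this illustration. The point is that a wrapped buffer shares the original array, so a later write to the array is visible through the buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; plain JDK, not Spring's DataBufferFactory.
class WrapDemo {

    // Wraps an array, mutates the array afterwards, and reads through the buffer.
    static byte firstByteAfterArrayWrite() {
        byte[] data = {1, 2, 3};
        ByteBuffer wrapped = ByteBuffer.wrap(data); // shares 'data', no copy is made
        data[0] = 42;                               // write to the original array
        return wrapped.get(0);                      // absolute read, sees the new value
    }

    public static void main(String[] args) {
        System.out.println(firstByteAfterArrayWrite());
    }
}
```

Because the storage is shared, code that wraps existing data should not modify that data while the buffer is still in use.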
2729

30+
Note that WebFlux applications do not create a `DataBufferFactory` directly but instead
31+
access it through the `ServerHttpResponse` or the `ClientHttpRequest` on the client side.
32+
The type of factory depends on the underlying client or server, e.g.
33+
`NettyDataBufferFactory` for Reactor Netty, `DefaultDataBufferFactory` for others.
2834

2935

3036

31-
== The `DataBuffer` Interface
3237

33-
The `DataBuffer` interface is similar to `ByteBuffer` but offers a number of advantages.
34-
Similar to Netty's `ByteBuf`, the `DataBuffer` abstraction offers independent read and write
35-
positions.
36-
This is different from the JDK's `ByteBuffer`, which exposes only one position for both reading and
37-
writing and a separate `flip()` operation to switch between the two I/O operations.
38-
In general, the following invariant holds for the read position, write position, and the capacity:
38+
[[databuffers-buffer]]
39+
== `DataBuffer`
3940

40-
====
41-
[literal]
42-
[subs="verbatim,quotes"]
43-
--
44-
0 <= read position <= write position <= capacity
45-
--
46-
====
41+
The `DataBuffer` interface offers operations similar to those of `java.nio.ByteBuffer` but also
42+
brings a few additional benefits, some of which are inspired by the Netty `ByteBuf`.
43+
Below is a partial list of benefits:
4744

48-
When reading bytes from the `DataBuffer`, the read position is automatically updated in accordance with
49-
the amount of data read from the buffer.
50-
Similarly, when writing bytes to the `DataBuffer`, the write position is updated with the amount of
51-
data written to the buffer.
52-
Also, when writing data, the capacity of a `DataBuffer` is automatically expanded, in the same fashion as `StringBuilder`,
53-
`ArrayList`, and similar types.
54-
55-
Besides the reading and writing functionality mentioned above, the `DataBuffer` also has methods to
56-
view a (slice of a) buffer as a `ByteBuffer`, an `InputStream`, or an `OutputStream`.
57-
Additionally, it offers methods to determine the index of a given byte.
58-
59-
As mentioned earlier, there are two implementation of `DataBufferFactory`: the `NettyDataBufferFactory`
60-
(for Netty platforms, such as Reactor Netty) and
61-
`DefaultDataBufferFactory` (for other platforms, such as
62-
Servlet 3.1+ servers).
63-
64-
65-
66-
=== `PooledDataBuffer`
67-
68-
The `PooledDataBuffer` is an extension to `DataBuffer` that adds methods for reference counting.
69-
The `retain` method increases the reference count by one.
70-
The `release` method decreases the count by one and releases the buffer's memory when the count
71-
reaches 0.
72-
Both of these methods are related to reference counting, a mechanism that we explain <<databuffer-reference-counting,later>>.
73-
74-
Note that `DataBufferUtils` offers useful utility methods for releasing and retaining pooled data
75-
buffers.
76-
These methods take a plain `DataBuffer` as a parameter but only call `retain` or `release` if the
77-
passed data buffer is an instance of `PooledDataBuffer`.
78-
79-
80-
[[databuffer-reference-counting]]
81-
==== Reference Counting
82-
83-
Reference counting is not a common technique in Java. It is much more common in other programming
84-
languages, such as Object C and C++.
85-
In and of itself, reference counting is not complex. It basically involves tracking the number of
86-
references that apply to an object.
87-
The reference count of a `PooledDataBuffer` starts at 1, is incremented by calling `retain`,
88-
and is decremented by calling `release`.
89-
As long as the buffer's reference count is larger than 0, the buffer is not released.
90-
When the number decreases to 0, the instance is released.
91-
In practice, this means that the reserved memory captured by the buffer is returned back to
92-
the memory pool, ready to be used for future allocations.
93-
94-
In general, the last component to access a `DataBuffer` is responsible for releasing it.
95-
Within Spring, there are two sorts of components that release buffers: decoders and transports.
96-
Decoders are responsible for transforming a stream of buffers into other types (see <<codecs>>),
97-
and transports are responsible for sending buffers across a network boundary, typically as an HTTP message.
98-
This means that, if you allocate data buffers for the purpose of putting them into an outbound HTTP
99-
message (that is, a client-side request or server-side response), they do not have to be released.
100-
The other consequence of this rule is that if you allocate data buffers that do not end up in the
101-
body (for instance, because of a thrown exception), you have to release them yourself.
102-
The following snippet shows a typical `DataBuffer` usage scenario when dealing with methods that
103-
throw exceptions:
45+
* Read and write with independent positions, i.e. not requiring a call to `flip()` to
46+
alternate between read and write.
47+
* Capacity expanded on demand as with `java.lang.StringBuilder`.
48+
* Pooled buffers and reference counting via <<databuffers-buffer-pooled>>.
49+
* View a buffer as `java.nio.ByteBuffer`, `InputStream`, or `OutputStream`.
50+
* Determine the index, or the last index, for a given byte.
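
For contrast with the first benefit in the list above, the following sketch shows the single-position model of the JDK's `java.nio.ByteBuffer`, where `flip()` is needed to switch from writing to reading. It uses only JDK classes. `FlipDemo` and its methods are hypothetical names for this illustration and are not part of Spring.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; shows why ByteBuffer needs flip() between write and read.
class FlipDemo {

    // Writes the text into a ByteBuffer, flips it, and reads the bytes back.
    static String writeThenRead(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(bytes.length);
        buffer.put(bytes);   // position advances to bytes.length
        buffer.flip();       // limit = position, position = 0: ready for reading
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    // Without flip(), position equals limit after the write, so nothing is readable.
    static int remainingWithoutFlip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(bytes.length);
        buffer.put(bytes);
        return buffer.remaining();
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("hello"));
        System.out.println(remainingWithoutFlip("hello"));
    }
}
```

A buffer with independent read and write positions avoids this mode switch, since reads and writes each advance their own index.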
10451

105-
====
106-
[source,java,indent=0]
107-
[subs="verbatim,quotes"]
108-
----
109-
DataBufferFactory factory = ...
110-
DataBuffer buffer = factory.allocateBuffer(); <1>
111-
boolean release = true; <2>
112-
try {
113-
writeDataToBuffer(buffer); <3>
114-
putBufferInHttpBody(buffer);
115-
release = false; <4>
116-
}
117-
finally {
118-
if (release) {
119-
DataBufferUtils.release(buffer); <5>
120-
}
121-
}
12252

123-
private void writeDataToBuffer(DataBuffer buffer) throws IOException { <3>
124-
...
125-
}
126-
----
12753

128-
<1> A new buffer is allocated.
129-
<2> A boolean flag indicates whether the allocated buffer should be released.
130-
<3> This example method loads data into the buffer. Note that the method can throw an `IOException`.
131-
Therefore, a `finally` block to release the buffer is required.
132-
<4> If no exception occurred, we switch the `release` flag to `false` as the buffer is now
133-
released as part of sending the HTTP body across the wire.
134-
<5> If an exception did occur, the flag is still set to `true`, and the buffer is released
135-
here.
136-
====
13754

55+
[[databuffers-buffer-pooled]]
56+
== `PooledDataBuffer`
57+
58+
As explained in the Javadoc for
59+
https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html[ByteBuffer],
60+
byte buffers can be direct or non-direct. Direct buffers may reside outside the Java heap
61+
which eliminates the need for copying for native I/O operations. That makes direct buffers
62+
particularly useful for receiving and sending data over a socket, but they're also more
63+
expensive to create and release, which leads to the idea of pooling buffers.
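
As a small illustration of the direct versus non-direct distinction, the sketch below allocates one buffer of each kind with the JDK and reports which is which. `DirectBufferDemo` is a hypothetical name, and the 1024-byte capacity is an arbitrary choice for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; plain JDK only.
class DirectBufferDemo {

    // Heap (non-direct) buffer, backed by a byte[] on the Java heap.
    static boolean heapIsDirect() {
        return ByteBuffer.allocate(1024).isDirect();
    }

    // Direct buffer, which may reside outside the Java heap.
    static boolean directIsDirect() {
        return ByteBuffer.allocateDirect(1024).isDirect();
    }

    public static void main(String[] args) {
        System.out.println("heap buffer direct?   " + heapIsDirect());
        System.out.println("direct buffer direct? " + directIsDirect());
    }
}
```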
64+
65+
`PooledDataBuffer` is an extension of `DataBuffer` that helps with reference counting which
66+
is essential for byte buffer pooling. How does it work? When a `PooledDataBuffer` is
67+
allocated the reference count is at 1. Calls to `retain()` increment the count, while
68+
calls to `release()` decrement it. As long as the count is above 0, the buffer is
69+
guaranteed not to be released. When the count is decreased to 0, the pooled buffer can be
70+
released, which in practice could mean the reserved memory for the buffer is returned to
71+
the memory pool.
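
The counting rules described above can be sketched in plain Java. This is not Spring's `PooledDataBuffer` implementation. `RefCountSketch` and its `returnedToPool` flag are hypothetical, and "returning to the pool" is simulated by setting that flag.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of reference counting; not Spring's implementation.
class RefCountSketch {

    // A newly allocated buffer starts with a count of 1.
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Stand-in for "memory returned to the pool".
    private volatile boolean returnedToPool = false;

    // Increments the count; fails if the buffer was already released.
    RefCountSketch retain() {
        refCount.updateAndGet(count -> {
            if (count <= 0) {
                throw new IllegalStateException("buffer already released");
            }
            return count + 1;
        });
        return this;
    }

    // Decrements the count; returns true when the count reaches 0.
    boolean release() {
        int remaining = refCount.updateAndGet(count -> {
            if (count <= 0) {
                throw new IllegalStateException("buffer already released");
            }
            return count - 1;
        });
        if (remaining == 0) {
            returnedToPool = true; // simulated hand-back to the pool
            return true;
        }
        return false;
    }

    boolean isReturnedToPool() {
        return returnedToPool;
    }

    public static void main(String[] args) {
        RefCountSketch buffer = new RefCountSketch(); // count = 1
        buffer.retain();                              // count = 2
        System.out.println(buffer.release());         // count = 1
        System.out.println(buffer.release());         // count = 0
    }
}
```

Each `retain()` must be balanced by exactly one `release()`. A missing `release()` leaks the buffer, and an extra one fails in this sketch with an `IllegalStateException`.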
13872

73+
Note that instead of operating on `PooledDataBuffer` directly, in most cases it's better
74+
to use the convenience methods in `DataBufferUtils` that apply release or retain to a
75+
`DataBuffer` only if it is an instance of `PooledDataBuffer`.
13976

140-
=== `DataBufferUtils`
14177

142-
The `DataBufferUtils` class contains various utility methods that operate on data buffers.
143-
It contains methods for reading a `Flux` of `DataBuffer` objects from an `InputStream` or NIO
144-
`Channel` and methods for writing a data buffer `Flux` to an `OutputStream` or `Channel`.
145-
`DataBufferUtils` also exposes `retain` and `release` methods that operate on plain `DataBuffer`
146-
instances (so that casting to a `PooledDataBuffer` is not required).
14778

148-
Additionally, `DataBufferUtils` exposes `compose`, which merges a stream of data buffers into one.
149-
For instance, this method can be used to convert the entire HTTP body into a single buffer (and
150-
from that, a `String` or `InputStream`).
151-
This is particularly useful when dealing with older, blocking APIs.
152-
Note, however, that this puts the entire body in memory, and therefore uses more memory than a pure
153-
streaming solution would.
79+
80+
[[databuffers-utils]]
81+
== `DataBufferUtils`
82+
83+
`DataBufferUtils` offers a number of utility methods to operate on data buffers:
84+
85+
* Join a stream of data buffers into a single buffer possibly with zero copy, e.g. via
86+
composite buffers, if that's supported by the underlying byte buffer API.
87+
* Turn `InputStream` or NIO `Channel` into `Flux<DataBuffer>`, and vice versa a
88+
`Publisher<DataBuffer>` into `OutputStream` or NIO `Channel`.
89+
* Methods to release or retain a `DataBuffer` if the buffer is an instance of
90+
`PooledDataBuffer`.
91+
* Skip or take from a stream of bytes until a specific byte count.
92+
15493

15594

15695

15796

15897
[[codecs]]
15998
== Codecs
16099

161-
The `org.springframework.core.codec` package contains the two main abstractions for converting a
162-
stream of bytes into a stream of objects or vice-versa.
163-
The `Encoder` is a strategy interface that encodes a stream of objects into an output stream of
164-
data buffers.
165-
The `Decoder` does the reverse: It turns a stream of data buffers into a stream of objects.
166-
Note that a decoder instance needs to consider <<databuffer-reference-counting,reference counting>>.
167-
168-
Spring comes with a wide array of default codecs (to convert from and to `String`,
169-
`ByteBuffer`, and byte arrays) and codecs that support marshalling libraries such as JAXB and
170-
Jackson (with https://github.com/FasterXML/jackson-core/issues/57[Jackson 2.9+ support for non-blocking parsing]).
171-
Within the context of Spring WebFlux, codecs are used to convert the request body into a
172-
`@RequestMapping` parameter or to convert the return type into the response body that is sent back
173-
to the client.
174-
The default codecs are configured in the `WebFluxConfigurationSupport` class. You can
175-
change them by overriding the `configureHttpMessageCodecs` when you inherit from that class.
176-
For more information about using codecs in WebFlux, see <<web-reactive#webflux-codecs>>.
100+
The `org.springframework.core.codec` package provides the following strategy interfaces:
101+
102+
* `Encoder` to encode `Publisher<T>` into a stream of data buffers.
103+
* `Decoder` to decode `Publisher<DataBuffer>` into a stream of higher level objects.
104+
105+
The `spring-core` module provides `byte[]`, `ByteBuffer`, `DataBuffer`, `Resource`, and
106+
`String` encoder and decoder implementations. The `spring-web` module adds Jackson JSON,
107+
Jackson Smile, JAXB2, Protocol Buffers and other encoders and decoders. See
108+
<<web-reactive.adoc#webflux-codecs,Codecs>> in the WebFlux section.
109+
110+
111+
112+
113+
[[databuffers-using]]
114+
== Using `DataBuffer`
115+
116+
When working with data buffers, special care must be taken to ensure buffers are released
117+
since they may be <<databuffers-buffer-pooled,pooled>>. We'll use codecs to illustrate
118+
how that works but the concepts apply more generally. Let's see what codecs must do
119+
internally to manage data buffers.
120+
121+
A `Decoder` is the last to read input data buffers, before creating higher level
122+
objects, and therefore it must release them as follows:
123+
124+
. If a `Decoder` simply reads each input buffer and is ready to
125+
release it immediately, it can do so via `DataBufferUtils.release(dataBuffer)`.
126+
. If a `Decoder` is using `Flux` or `Mono` operators such as `flatMap`, `reduce`, and
127+
others that prefetch and cache data items internally, or is using operators such as
128+
`filter`, `skip`, and others that leave out items, then
129+
`doOnDiscard(PooledDataBuffer.class, DataBufferUtils::release)` must be added to the
130+
composition chain to ensure such buffers are released prior to being discarded, possibly
131+
also as a result of an error or cancellation signal.
132+
. If a `Decoder` holds on to one or more data buffers in any other way, it must
133+
ensure they are released when fully read, or in case of error or cancellation signals that
134+
take place before the cached data buffers have been read and released.
135+
136+
Note that `DataBufferUtils#join` offers a safe and efficient way to aggregate a data
137+
buffer stream into a single data buffer. Likewise `skipUntilByteCount` and
138+
`takeUntilByteCount` are additional safe methods for decoders to use.
139+
140+
An `Encoder` allocates data buffers that others must read (and release). So an `Encoder`
141+
doesn't have much to do. However an `Encoder` must take care to release a data buffer if
142+
a serialization error occurs while populating the buffer with data. For example:
143+
144+
====
145+
[source,java,indent=0]
146+
[subs="verbatim,quotes"]
147+
----
148+
DataBuffer buffer = factory.allocateBuffer();
149+
boolean release = true;
150+
try {
151+
// serialize and populate buffer..
152+
release = false;
153+
}
154+
finally {
155+
if (release) {
156+
DataBufferUtils.release(buffer);
157+
}
158+
}
159+
return buffer;
160+
----
161+
====
162+
163+
The consumer of an `Encoder` is responsible for releasing the data buffers it receives.
164+
In a WebFlux application, the output of the `Encoder` is used to write to the HTTP server
165+
response, or to the client HTTP request, in which case releasing the data buffers is the
166+
responsibility of the code writing to the server response, or to the client request.
167+
168+
Note that when running on Netty, there are debugging options for
169+
https://github.com/netty/netty/wiki/Reference-counted-objects#troubleshooting-buffer-leaks[troubleshooting buffer leaks].

src/docs/asciidoc/web/webflux-websocket.adoc

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -204,6 +204,22 @@ class ExampleHandler implements WebSocketHandler {
204204

205205

206206

207+
[[webflux-websocket-databuffer]]
208+
=== `DataBuffer`
209+
210+
`DataBuffer` is the representation for a byte buffer in WebFlux. The Spring Core part of
211+
the reference has more on that in the section on
212+
<<core#databuffers,Data Buffers and Codecs>>. The key point to understand is that on some
213+
servers like Netty, byte buffers are pooled and reference counted, and must be released
214+
when consumed to avoid memory leaks.
215+
216+
When running on Netty, applications must use `DataBufferUtils.retain(dataBuffer)` if they
217+
wish to hold on to input data buffers in order to ensure they are not released, and
218+
subsequently use `DataBufferUtils.release(dataBuffer)` when the buffers are consumed.
219+
220+
221+
222+
207223
[[webflux-websocket-server-handshake]]
208224
=== Handshake
209225
[.small]#<<web.adoc#websocket-server-handshake,Same as in the Servlet stack>>#

src/docs/asciidoc/web/webflux.adoc

Lines changed: 20 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -671,7 +671,7 @@ to encode and decode HTTP message content.
671671
application, while a `Decoder` can be wrapped with `DecoderHttpMessageReader`.
672672
* {api-spring-framework}/core/io/buffer/DataBuffer.html[`DataBuffer`] abstracts different
673673
byte buffer representations (e.g. Netty `ByteBuf`, `java.nio.ByteBuffer`, etc.) and is
674-
what all codecs work on. See <<core#databuffers, Data Buffers and Codecs>> in the
674+
what all codecs work on. See <<core#databuffers,Data Buffers and Codecs>> in the
675675
"Spring Core" section for more on this topic.
676676

677677
The `spring-core` module provides `byte[]`, `ByteBuffer`, `DataBuffer`, `Resource`, and
@@ -741,7 +741,7 @@ consistently for access to the cached form data versus reading from the raw requ
741741

742742

743743
[[webflux-codecs-multipart]]
744-
==== Multipart Data
744+
==== Multipart
745745

746746
`MultipartHttpMessageReader` and `MultipartHttpMessageWriter` support decoding and
747747
encoding "multipart/form-data" content. In turn `MultipartHttpMessageReader` delegates to
@@ -772,6 +772,24 @@ comment-only, empty SSE event or any other "no-op" data that would effectively s
772772
a heartbeat.
773773

774774

775+
[[webflux-codecs-buffers]]
776+
==== `DataBuffer`
777+
778+
`DataBuffer` is the representation for a byte buffer in WebFlux. The Spring Core part of
779+
the reference has more on that in the section on
780+
<<core#databuffers,Data Buffers and Codecs>>. The key point to understand is that on some
781+
servers like Netty, byte buffers are pooled and reference counted, and must be released
782+
when consumed to avoid memory leaks.
783+
784+
WebFlux applications generally do not need to be concerned with such issues, unless they
785+
consume or produce data buffers directly, as opposed to relying on codecs to convert to
786+
and from higher level objects, or unless they choose to create custom codecs. For such
787+
cases please review the information in <<core#databuffers,Data Buffers and Codecs>>,
788+
especially the section on <<core#databuffers-using,Using DataBuffer>>.
789+
790+
791+
792+
775793

776794
[[webflux-logging]]
777795
=== Logging
