|
1 | 1 | [[databuffers]]
|
2 | 2 | = Data Buffers and Codecs
|
3 | 3 |
|
4 |
| -The `DataBuffer` interface defines an abstraction over byte buffers. |
5 |
| -The main reason for introducing it (and not using the standard `java.nio.ByteBuffer` instead) is Netty. |
6 |
| -Netty does not use `ByteBuffer` but instead offers `ByteBuf` as an alternative. |
7 |
| -Spring's `DataBuffer` is a simple abstraction over `ByteBuf` that can also be used on non-Netty |
8 |
| -platforms (that is, Servlet 3.1+). |
| 4 | +Java NIO provides `ByteBuffer`, but many libraries build their own byte buffer API on top,
| 5 | +especially for network operations where reusing buffers and/or using direct buffers is
| 6 | +beneficial for performance. For example, Netty has the `ByteBuf` hierarchy, Undertow uses
| 7 | +XNIO, Jetty uses pooled byte buffers with a callback to be released, and so on.
| 8 | +The `spring-core` module provides a set of abstractions to work with various byte buffer |
| 9 | +APIs as follows: |
9 | 10 |
|
| 11 | +* <<databuffers-factory>> abstracts the creation of a data buffer. |
| 12 | +* <<databuffers-buffer>> represents a byte buffer, which may be |
| 13 | +<<databuffers-buffer-pooled,pooled>>. |
| 14 | +* <<databuffers-utils>> offers utility methods for data buffers. |
| 15 | +* <<Codecs>> decode or encode data buffer streams into higher level objects.
10 | 16 |
|
11 | 17 |
|
12 | 18 |
|
| 19 | + |
| 20 | +[[databuffers-factory]] |
13 | 21 | == `DataBufferFactory`
|
14 | 22 |
|
15 |
| -The `DataBufferFactory` offers functionality to allocate new data buffers as well as to wrap |
16 |
| -existing data. |
17 |
| -The `allocateBuffer` methods allocate a new data buffer with a default or given capacity. |
18 |
| -Though `DataBuffer` implementations grow and shrink on demand, it is more efficient to give the |
19 |
| -capacity upfront, if known. |
20 |
| -The `wrap` methods decorate an existing `ByteBuffer` or byte array. |
21 |
| -Wrapping does not involve allocation. It decorates the given data with a `DataBuffer` |
22 |
| -implementation. |
| 23 | +`DataBufferFactory` is used to create data buffers in one of two ways: |
23 | 24 |
|
24 |
| -There are two implementation of `DataBufferFactory`: the `NettyDataBufferFactory` |
25 |
| -(for Netty platforms, such as Reactor Netty) and `DefaultDataBufferFactory` |
26 |
| -(for other platforms, such as Servlet 3.1+ servers). |
| 25 | +. Allocate a new data buffer, optionally specifying capacity upfront, if known, which is |
| 26 | +more efficient even though implementations of `DataBuffer` can grow and shrink on demand. |
| 27 | +. Wrap an existing `byte[]` or `java.nio.ByteBuffer`, which decorates the given data with
| 28 | +a `DataBuffer` implementation and does not involve allocation.
27 | 29 |
|
| 30 | +Note that WebFlux applications do not create a `DataBufferFactory` directly but instead |
| 31 | +access it through the `ServerHttpResponse` or the `ClientHttpRequest` on the client side. |
| 32 | +The type of factory depends on the underlying client or server, e.g. |
| 33 | +`NettyDataBufferFactory` for Reactor Netty, `DefaultDataBufferFactory` for others. |
28 | 34 |
|
29 | 35 |
|
30 | 36 |
|
31 |
| -== The `DataBuffer` Interface |
32 | 37 |
|
33 |
| -The `DataBuffer` interface is similar to `ByteBuffer` but offers a number of advantages. |
34 |
| -Similar to Netty's `ByteBuf`, the `DataBuffer` abstraction offers independent read and write |
35 |
| -positions. |
36 |
| -This is different from the JDK's `ByteBuffer`, which exposes only one position for both reading and |
37 |
| -writing and a separate `flip()` operation to switch between the two I/O operations. |
38 |
| -In general, the following invariant holds for the read position, write position, and the capacity: |
| 38 | +[[databuffers-buffer]] |
| 39 | +== `DataBuffer` |
39 | 40 |
|
40 |
| -==== |
41 |
| -[literal] |
42 |
| -[subs="verbatim,quotes"] |
43 |
| --- |
44 |
| - 0 <= read position <= write position <= capacity |
45 |
| --- |
46 |
| -==== |
| 41 | +The `DataBuffer` interface offers operations similar to `java.nio.ByteBuffer` but also
| 42 | +brings a few additional benefits, some of which are inspired by the Netty `ByteBuf`.
| 43 | +Below is a partial list of benefits:
47 | 44 |
|
48 |
| -When reading bytes from the `DataBuffer`, the read position is automatically updated in accordance with |
49 |
| -the amount of data read from the buffer. |
50 |
| -Similarly, when writing bytes to the `DataBuffer`, the write position is updated with the amount of |
51 |
| -data written to the buffer. |
52 |
| -Also, when writing data, the capacity of a `DataBuffer` is automatically expanded, in the same fashion as `StringBuilder`, |
53 |
| -`ArrayList`, and similar types. |
54 |
| - |
55 |
| -Besides the reading and writing functionality mentioned above, the `DataBuffer` also has methods to |
56 |
| -view a (slice of a) buffer as a `ByteBuffer`, an `InputStream`, or an `OutputStream`. |
57 |
| -Additionally, it offers methods to determine the index of a given byte. |
58 |
| - |
59 |
| -As mentioned earlier, there are two implementation of `DataBufferFactory`: the `NettyDataBufferFactory` |
60 |
| -(for Netty platforms, such as Reactor Netty) and |
61 |
| -`DefaultDataBufferFactory` (for other platforms, such as |
62 |
| -Servlet 3.1+ servers). |
63 |
| - |
64 |
| - |
65 |
| - |
66 |
| -=== `PooledDataBuffer` |
67 |
| - |
68 |
| -The `PooledDataBuffer` is an extension to `DataBuffer` that adds methods for reference counting. |
69 |
| -The `retain` method increases the reference count by one. |
70 |
| -The `release` method decreases the count by one and releases the buffer's memory when the count |
71 |
| -reaches 0. |
72 |
| -Both of these methods are related to reference counting, a mechanism that we explain <<databuffer-reference-counting,later>>. |
73 |
| - |
74 |
| -Note that `DataBufferUtils` offers useful utility methods for releasing and retaining pooled data |
75 |
| -buffers. |
76 |
| -These methods take a plain `DataBuffer` as a parameter but only call `retain` or `release` if the |
77 |
| -passed data buffer is an instance of `PooledDataBuffer`. |
78 |
| - |
79 |
| - |
80 |
| -[[databuffer-reference-counting]] |
81 |
| -==== Reference Counting |
82 |
| - |
83 |
| -Reference counting is not a common technique in Java. It is much more common in other programming |
84 |
| -languages, such as Object C and C++. |
85 |
| -In and of itself, reference counting is not complex. It basically involves tracking the number of |
86 |
| -references that apply to an object. |
87 |
| -The reference count of a `PooledDataBuffer` starts at 1, is incremented by calling `retain`, |
88 |
| -and is decremented by calling `release`. |
89 |
| -As long as the buffer's reference count is larger than 0, the buffer is not released. |
90 |
| -When the number decreases to 0, the instance is released. |
91 |
| -In practice, this means that the reserved memory captured by the buffer is returned back to |
92 |
| -the memory pool, ready to be used for future allocations. |
93 |
| - |
94 |
| -In general, the last component to access a `DataBuffer` is responsible for releasing it. |
95 |
| -Within Spring, there are two sorts of components that release buffers: decoders and transports. |
96 |
| -Decoders are responsible for transforming a stream of buffers into other types (see <<codecs>>), |
97 |
| -and transports are responsible for sending buffers across a network boundary, typically as an HTTP message. |
98 |
| -This means that, if you allocate data buffers for the purpose of putting them into an outbound HTTP |
99 |
| -message (that is, a client-side request or server-side response), they do not have to be released. |
100 |
| -The other consequence of this rule is that if you allocate data buffers that do not end up in the |
101 |
| -body (for instance, because of a thrown exception), you have to release them yourself. |
102 |
| -The following snippet shows a typical `DataBuffer` usage scenario when dealing with methods that |
103 |
| -throw exceptions: |
| 45 | +* Read and write with independent positions, i.e. not requiring a call to `flip()` to |
| 46 | +alternate between read and write. |
| 47 | +* Capacity expanded on demand as with `java.lang.StringBuilder`. |
| 48 | +* Pooled buffers and reference counting via <<databuffers-buffer-pooled>>. |
| 49 | +* View a buffer as `java.nio.ByteBuffer`, `InputStream`, or `OutputStream`. |
| 50 | +* Determine the index, or the last index, for a given byte. |
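
To make the first benefit above concrete, the following minimal sketch shows the `flip()` step that a plain `java.nio.ByteBuffer` needs between writing and reading, because it tracks a single shared position. It uses only the JDK; the class name `FlipSketch` and the helper `roundTrip` are hypothetical names chosen for illustration and are not part of Spring.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Spring or the JDK.
class FlipSketch {

    // Writes the text into a ByteBuffer and reads it back.
    // ByteBuffer has one position for both reads and writes,
    // so flip() is required to switch from writing to reading.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(bytes.length);
        buffer.put(bytes);   // position advances to bytes.length
        buffer.flip();       // limit = old position, position = 0
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);     // reads from position 0 up to the limit
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Without the `flip()` call, `remaining()` would report zero bytes after the write, since the position already sits at the limit. A `DataBuffer` avoids this step by keeping separate read and write positions.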
104 | 51 |
|
105 |
| -==== |
106 |
| -[source,java,indent=0] |
107 |
| -[subs="verbatim,quotes"] |
108 |
| ----- |
109 |
| - DataBufferFactory factory = ... |
110 |
| - DataBuffer buffer = factory.allocateBuffer(); <1> |
111 |
| - boolean release = true; <2> |
112 |
| - try { |
113 |
| - writeDataToBuffer(buffer); <3> |
114 |
| - putBufferInHttpBody(buffer); |
115 |
| - release = false; <4> |
116 |
| - } |
117 |
| - finally { |
118 |
| - if (release) { |
119 |
| - DataBufferUtils.release(buffer); <5> |
120 |
| - } |
121 |
| - } |
122 | 52 |
|
123 |
| - private void writeDataToBuffer(DataBuffer buffer) throws IOException { <3> |
124 |
| - ... |
125 |
| - } |
126 |
| ----- |
127 | 53 |
|
128 |
| -<1> A new buffer is allocated. |
129 |
| -<2> A boolean flag indicates whether the allocated buffer should be released. |
130 |
| -<3> This example method loads data into the buffer. Note that the method can throw an `IOException`. |
131 |
| -Therefore, a `finally` block to release the buffer is required. |
132 |
| -<4> If no exception occurred, we switch the `release` flag to `false` as the buffer is now |
133 |
| -released as part of sending the HTTP body across the wire. |
134 |
| -<5> If an exception did occur, the flag is still set to `true`, and the buffer is released |
135 |
| -here. |
136 |
| -==== |
137 | 54 |
|
| 55 | +[[databuffers-buffer-pooled]] |
| 56 | +== `PooledDataBuffer` |
| 57 | + |
| 58 | +As explained in the Javadoc for |
| 59 | +https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html[ByteBuffer], |
| 60 | +byte buffers can be direct or non-direct. Direct buffers may reside outside the Java heap |
| 61 | +which eliminates the need for copying for native I/O operations. That makes direct buffers |
| 62 | +particularly useful for receiving and sending data over a socket, but they're also more |
| 63 | +expensive to create and release, which leads to the idea of pooling buffers. |
| 64 | + |
| 65 | +`PooledDataBuffer` is an extension of `DataBuffer` that helps with reference counting, which
| 66 | +is essential for byte buffer pooling. How does it work? When a `PooledDataBuffer` is
| 67 | +allocated, the reference count is 1. Calls to `retain()` increment the count, while
| 68 | +calls to `release()` decrement it. As long as the count is above 0, the buffer is |
| 69 | +guaranteed not to be released. When the count is decreased to 0, the pooled buffer can be |
| 70 | +released, which in practice could mean the reserved memory for the buffer is returned to |
| 71 | +the memory pool. |
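
The counting rules described above can be sketched with a small stand-alone class. This is only an illustration of the retain/release protocol under the stated rules (count starts at 1, `retain()` increments, `release()` decrements, the buffer goes back to the pool at 0). `RefCountSketch` and `CountedBuffer` are hypothetical names, and this is not how any particular `PooledDataBuffer` implementation is written.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of Spring.
class RefCountSketch {

    // Stand-in for a pooled buffer; it only tracks the reference count.
    static final class CountedBuffer {
        private final AtomicInteger refCount = new AtomicInteger(1); // starts at 1
        private volatile boolean returnedToPool = false;

        // Increments the count and returns this buffer for chaining.
        CountedBuffer retain() {
            refCount.incrementAndGet();
            return this;
        }

        // Decrements the count; returns true if this call brought it to 0.
        boolean release() {
            int remaining = refCount.decrementAndGet();
            if (remaining < 0) {
                throw new IllegalStateException("released more often than retained");
            }
            if (remaining == 0) {
                returnedToPool = true; // a real pool would reclaim the memory here
                return true;
            }
            return false;
        }

        boolean isReturnedToPool() {
            return returnedToPool;
        }
    }

    public static void main(String[] args) {
        CountedBuffer buffer = new CountedBuffer(); // count = 1
        buffer.retain();                            // count = 2
        System.out.println(buffer.release());       // count = 1, not yet released
        System.out.println(buffer.release());       // count = 0, returned to pool
    }
}
```

Each `retain()` must be balanced by exactly one `release()`, in addition to the release that matches the initial allocation.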
138 | 72 |
|
| 73 | +Note that instead of operating on `PooledDataBuffer` directly, in most cases it's better |
| 74 | +to use the convenience methods in `DataBufferUtils` that apply release or retain to a |
| 75 | +`DataBuffer` only if it is an instance of `PooledDataBuffer`. |
139 | 76 |
|
140 |
| -=== `DataBufferUtils` |
141 | 77 |
|
142 |
| -The `DataBufferUtils` class contains various utility methods that operate on data buffers. |
143 |
| -It contains methods for reading a `Flux` of `DataBuffer` objects from an `InputStream` or NIO |
144 |
| -`Channel` and methods for writing a data buffer `Flux` to an `OutputStream` or `Channel`. |
145 |
| -`DataBufferUtils` also exposes `retain` and `release` methods that operate on plain `DataBuffer` |
146 |
| -instances (so that casting to a `PooledDataBuffer` is not required). |
147 | 78 |
|
148 |
| -Additionally, `DataBufferUtils` exposes `compose`, which merges a stream of data buffers into one. |
149 |
| -For instance, this method can be used to convert the entire HTTP body into a single buffer (and |
150 |
| -from that, a `String` or `InputStream`). |
151 |
| -This is particularly useful when dealing with older, blocking APIs. |
152 |
| -Note, however, that this puts the entire body in memory, and therefore uses more memory than a pure |
153 |
| -streaming solution would. |
| 79 | + |
| 80 | +[[databuffers-utils]] |
| 81 | +== `DataBufferUtils` |
| 82 | + |
| 83 | +`DataBufferUtils` offers a number of utility methods to operate on data buffers: |
| 84 | + |
| 85 | +* Join a stream of data buffers into a single buffer possibly with zero copy, e.g. via |
| 86 | +composite buffers, if that's supported by the underlying byte buffer API. |
| 87 | +* Turn `InputStream` or NIO `Channel` into `Flux<DataBuffer>`, and vice versa, a
| 88 | +`Publisher<DataBuffer>` into `OutputStream` or NIO `Channel`.
| 89 | +* Release or retain a `DataBuffer` if the buffer is an instance of
| 90 | +`PooledDataBuffer`.
| 91 | +* Skip or take from a stream of bytes until a specific byte count. |
| 92 | + |
154 | 93 |
|
155 | 94 |
|
156 | 95 |
|
157 | 96 |
|
158 | 97 | [[codecs]]
|
159 | 98 | == Codecs
|
160 | 99 |
|
161 |
| -The `org.springframework.core.codec` package contains the two main abstractions for converting a |
162 |
| -stream of bytes into a stream of objects or vice-versa. |
163 |
| -The `Encoder` is a strategy interface that encodes a stream of objects into an output stream of |
164 |
| -data buffers. |
165 |
| -The `Decoder` does the reverse: It turns a stream of data buffers into a stream of objects. |
166 |
| -Note that a decoder instance needs to consider <<databuffer-reference-counting,reference counting>>. |
167 |
| - |
168 |
| -Spring comes with a wide array of default codecs (to convert from and to `String`, |
169 |
| -`ByteBuffer`, and byte arrays) and codecs that support marshalling libraries such as JAXB and |
170 |
| -Jackson (with https://github.com/FasterXML/jackson-core/issues/57[Jackson 2.9+ support for non-blocking parsing]). |
171 |
| -Within the context of Spring WebFlux, codecs are used to convert the request body into a |
172 |
| -`@RequestMapping` parameter or to convert the return type into the response body that is sent back |
173 |
| -to the client. |
174 |
| -The default codecs are configured in the `WebFluxConfigurationSupport` class. You can |
175 |
| -change them by overriding the `configureHttpMessageCodecs` when you inherit from that class. |
176 |
| -For more information about using codecs in WebFlux, see <<web-reactive#webflux-codecs>>. |
| 100 | +The `org.springframework.core.codec` package provides the following strategy interfaces:
| 101 | + |
| 102 | +* `Encoder` to encode `Publisher<T>` into a stream of data buffers. |
| 103 | +* `Decoder` to decode `Publisher<DataBuffer>` into a stream of higher level objects. |
| 104 | + |
| 105 | +The `spring-core` module provides `byte[]`, `ByteBuffer`, `DataBuffer`, `Resource`, and |
| 106 | +`String` encoder and decoder implementations. The `spring-web` module adds Jackson JSON, |
| 107 | +Jackson Smile, JAXB2, Protocol Buffers and other encoders and decoders. See |
| 108 | +<<web-reactive.adoc#webflux-codecs,Codecs>> in the WebFlux section. |
| 109 | + |
| 110 | + |
| 111 | + |
| 112 | + |
| 113 | +[[databuffers-using]] |
| 114 | +== Using `DataBuffer` |
| 115 | + |
| 116 | +When working with data buffers, special care must be taken to ensure buffers are released |
| 117 | +since they may be <<databuffers-buffer-pooled,pooled>>. We'll use codecs to illustrate |
| 118 | +how that works but the concepts apply more generally. Let's see what codecs must do |
| 119 | +internally to manage data buffers. |
| 120 | + |
| 121 | +A `Decoder` is the last to read input data buffers, before creating higher level |
| 122 | +objects, and therefore it must release them as follows: |
| 123 | + |
| 124 | +. If a `Decoder` simply reads each input buffer and is ready to |
| 125 | +release it immediately, it can do so via `DataBufferUtils.release(dataBuffer)`. |
| 126 | +. If a `Decoder` is using `Flux` or `Mono` operators such as `flatMap`, `reduce`, and |
| 127 | +others that prefetch and cache data items internally, or is using operators such as |
| 128 | +`filter`, `skip`, and others that leave out items, then |
| 129 | +`doOnDiscard(PooledDataBuffer.class, DataBufferUtils::release)` must be added to the |
| 130 | +composition chain to ensure such buffers are released prior to being discarded, possibly |
| 131 | +also as a result of an error or cancellation signal.
| 132 | +. If a `Decoder` holds on to one or more data buffers in any other way, it must
| 133 | +ensure they are released when fully read, or in case of an error or cancellation signal that
| 134 | +takes place before the cached data buffers have been read and released.
| 135 | + |
| 136 | +Note that `DataBufferUtils#join` offers a safe and efficient way to aggregate a data |
| 137 | +buffer stream into a single data buffer. Likewise `skipUntilByteCount` and |
| 138 | +`takeUntilByteCount` are additional safe methods for decoders to use. |
| 139 | + |
| 140 | +An `Encoder` allocates data buffers that others must read (and release), so an `Encoder`
| 141 | +doesn't have much to do. However, an `Encoder` must take care to release a data buffer if
| 142 | +a serialization error occurs while populating the buffer with data. For example:
| 143 | + |
| 144 | +==== |
| 145 | +[source,java,indent=0] |
| 146 | +[subs="verbatim,quotes"] |
| 147 | +---- |
| 148 | + DataBuffer buffer = factory.allocateBuffer(); |
| 149 | + boolean release = true; |
| 150 | + try { |
| 151 | + // serialize and populate buffer.. |
| 152 | + release = false; |
| 153 | + } |
| 154 | + finally { |
| 155 | + if (release) { |
| 156 | + DataBufferUtils.release(buffer); |
| 157 | + } |
| 158 | + } |
| 159 | + return buffer; |
| 160 | +---- |
| 161 | +==== |
| 162 | + |
| 163 | +The consumer of an `Encoder` is responsible for releasing the data buffers it receives. |
| 164 | +In a WebFlux application, the output of the `Encoder` is used to write to the HTTP server |
| 165 | +response, or to the client HTTP request, in which case releasing the data buffers is the |
| 166 | +responsibility of the code writing to the server response, or to the client request. |
| 167 | + |
| 168 | +Note that when running on Netty, there are debugging options for |
| 169 | +https://github.com/netty/netty/wiki/Reference-counted-objects#troubleshooting-buffer-leaks[troubleshooting buffer leaks]. |