Description
Julian Orth opened SPR-17328 and commented
Hi,
AsyncRestTemplate was deprecated in #19962 in favor of WebClient. However, WebClient does not seem to support all of the use cases that AsyncRestTemplate supports (and that RestTemplate does not support).
Example
Consider the following JSON:
{
  "a": [
    {
      "x": 2,
      "y": 1
    }
  ],
  "b": [
    {
      "x": 3,
      "y": 1
    }
  ]
}
where both arrays (a and b) have 1,000,000,000 elements each. The goal is to calculate the sum of all x - y over both arrays. (E.g. (2 - 1) + (3 - 1) = 3 in the example above.)
Solution with AsyncRestTemplate
With AsyncRestTemplate, this is easy: Call AsyncRestTemplate#execute with a ResponseExtractor, plug the InputStream into a Jackson JsonParser, use ObjectMapper to deserialize each array element ad-hoc into
class V {
    int x;
    int y;
},
update the sum, proceed to the next element. Since only one Object of type V needs to be in memory at a time, the memory requirements are constant and low.
Overall, performing this streaming processing of the JSON can probably be done in 25 lines of code using Jackson and AsyncRestTemplate.
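For illustration, here is a minimal sketch of the blocking, element-by-element processing described above. It assumes Jackson 2.x (jackson-databind) on the classpath; the class name StreamingSum and the method sum are hypothetical names made up for this example, and the sketch assumes every array element is a JSON object.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spring or Jackson.
public class StreamingSum {

    // Public fields so that Jackson can bind them without getters/setters.
    public static class V {
        public int x;
        public int y;
    }

    // Reads {"name": [ {...}, {...} ], ...} and sums x - y over all array elements,
    // keeping only one V instance in memory at a time.
    public static long sum(InputStream in, ObjectMapper mapper) throws IOException {
        long sum = 0;
        try (JsonParser parser = mapper.getFactory().createParser(in)) {
            if (parser.nextToken() != JsonToken.START_OBJECT) {
                throw new IOException("expected a JSON object at the top level");
            }
            // Iterate over the fields of the top-level object.
            while (parser.nextToken() == JsonToken.FIELD_NAME) {
                if (parser.nextToken() != JsonToken.START_ARRAY) {
                    parser.skipChildren(); // ignore non-array values
                    continue;
                }
                // Deserialize one array element at a time (assumes object elements).
                while (parser.nextToken() == JsonToken.START_OBJECT) {
                    V v = mapper.readValue(parser, V.class);
                    sum += v.x - v.y;
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"a\":[{\"x\":2,\"y\":1}],\"b\":[{\"x\":3,\"y\":1}]}";
        InputStream in = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
        System.out.println(sum(in, new ObjectMapper())); // prints 3
    }
}
```

With AsyncRestTemplate, something like response -> StreamingSum.sum(response.getBody(), mapper) could presumably be passed as the ResponseExtractor to AsyncRestTemplate#execute.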
The Problem with WebClient
With WebClient, this kind of processing seems to be practically impossible. Jackson appears to only support async parsing at the token level. Anything at a higher level (e.g. ObjectMapper) needs to have all tokens available in a blocking way to parse them.
Therefore, to implement the kind of streaming processing described above, I would have to manually keep track of the JSON tokens parsed and then plug them into an ObjectMapper all at once when I've detected the end of an array element. This is basically what Spring currently does to support streaming of top-level arrays:
WebClient.create().get().exchange().flatMapMany(r -> r.bodyToFlux(V.class))
However, even to support only this very limited streaming of top-level array elements, Spring had to re-implement about 200 lines of Jackson logic to keep track of the current depth in the token stream (Jackson2Tokenizer).
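To make the manual bookkeeping concrete, here is a rough sketch of what tracking the depth by hand might look like for the example JSON. It assumes Jackson 2.9+ (non-blocking parser and TokenBuffer); the class ChunkedSum and its methods are hypothetical names for this example, it is not how Jackson2Tokenizer is actually implemented, and the hard-coded depths only fit the document shape shown above.

```java
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.core.async.ByteArrayFeeder;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.util.TokenBuffer;

import java.io.IOException;

// Hypothetical helper: sums x - y from byte chunks that arrive asynchronously.
public class ChunkedSum {

    public static class V {
        public int x;
        public int y;
    }

    private final ObjectMapper mapper = new ObjectMapper();
    private final JsonParser parser;
    private final ByteArrayFeeder feeder;
    private TokenBuffer element; // tokens of the element being assembled, or null
    private int depth;           // current nesting depth in the token stream
    private long sum;

    public ChunkedSum() throws IOException {
        parser = mapper.getFactory().createNonBlockingByteArrayParser();
        feeder = (ByteArrayFeeder) parser.getNonBlockingInputFeeder();
    }

    // Call once per received chunk; chunks may split the JSON at any byte.
    public void feed(byte[] chunk) throws IOException {
        feeder.feedInput(chunk, 0, chunk.length);
        JsonToken t;
        // Drain every token that is complete so far.
        while ((t = parser.nextToken()) != null && t != JsonToken.NOT_AVAILABLE) {
            if (t.isStructStart()) {
                depth++;
            }
            // Root object = depth 1, array = depth 2, array element = depth 3.
            if (element == null && t == JsonToken.START_OBJECT && depth == 3) {
                element = new TokenBuffer(parser);
            }
            if (element != null) {
                element.copyCurrentEvent(parser);
            }
            if (t.isStructEnd()) {
                depth--;
                if (element != null && depth == 2) {
                    // Element complete: hand the buffered tokens to ObjectMapper at once.
                    V v = mapper.readValue(element.asParser(mapper), V.class);
                    sum += v.x - v.y;
                    element = null;
                }
            }
        }
    }

    public long sum() {
        return sum;
    }
}
```

Each call to feed would correspond to one received data buffer. Even this sketch only handles one fixed document shape, which illustrates why a general solution needs considerably more logic.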
Question
Since AsyncRestTemplate is deprecated, there no longer seems to be an encouraged and practical way in Spring 5 to do asynchronous streaming of JSON data. There are several ways to improve this situation:
- Un-deprecate AsyncRestTemplate
- Upstream complete async support in Jackson
- Provide a much expanded version of Jackson2Tokenizer to the public that handles more complicated cases such as the one described above
What are your thoughts on the matter and do you have plans to address this problem in a future release?
Thanks
Julian
PS: A similar problem exists on the server side. With web-mvc, an object returned from a REST endpoint would be streamed into the output stream via Jackson, keeping the memory requirements low. With webflux, a Mono<Object> returned from a REST endpoint will first be serialized into a String before it is written to the output stream.
Affects: 5.0.9