Description
This issue has been opened to discuss moving the internal serializer from Json.NET over to a faster JSON serialization library.
The `feature/utf8json-serializer` branch contains a minimal viable prototype of deserializing an `ISearchResponse<T>` and serializing `ISearchRequest`.
Some key observations working with utf8json whilst putting together this prototype:
- `Hit<T>` requires a custom formatter to be resolved at the `IJsonFormatterResolver` level because it contains a generic type property whose formatter, `SourceFormatter<T>`, cannot be resolved using `JsonFormatterAttribute`. If it were possible to resolve, then `Hit<T>` could be attributed with `[JsonFormatter(typeof(HitFormatter<>))]`, and the `_source` field with `[JsonFormatter(typeof(SourceFormatter<>))]`. For now, an instance of `SourceFormatter<T>` is initialized inside the `HitFormatter<T>` constructor.
- The implementation does not handle different field casings.
- `HitFormatter<T>` avoids allocating strings when reading property names by using `AutomataDictionary`. This dictionary lives outside of the generic `HitFormatter<T>` to avoid creating an instance of the dictionary for each `T`.
- Both `JsonReader` and `JsonWriter` are structs passed by ref, so they cannot be captured inside local functions or lambda expression bodies; instead they need to be passed as a ref parameter to a function. An example is `JoinFieldFormatter`'s `Serialize` method.
- utf8json does not have a concept similar to `[JsonObject(MemberSerialization.OptIn)]` to serialize only those members that have been explicitly attributed with `DataMemberAttribute`. This would ideally be needed, as it is cumbersome to set `[IgnoreDataMember]` on all properties that should be ignored.
- `ConnectionSettings` is retrieved by casting `IJsonFormatterResolver` to a known concrete implementation that exposes it as a property. Not ideal, but it works.
- utf8json does not make a distinction between an integer token and a float token as Json.NET does. This is not much of a problem, since the bytes for the token can be inspected to determine if they contain a decimal point, and utf8json's internals can then be used to deserialize accordingly. Also, this is needed only in cases where an integer/double distinction is necessary. See `FuzzinessFormatter` for an example.
- The equivalent to `JsonConverter`, `IJsonFormatter<T>`, only has a generic variant. In several places in the client, we may serialize using an interface, but deserialize using the concrete implementation. This is handled by `ConcreteInterfaceFormatter<TConcrete, TInterface>`, where the formatter is `IJsonFormatter<TInterface>`. An interesting case is when the concrete type should be serialized as the interface; in such scenarios, we end up with two formatters, one for the concrete type and one for the interface, where each formatter references the other's serialize/deserialize implementation. See `QueryContainerFormatter` and `QueryContainerInterfaceFormatter` for an example.
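
To illustrate the integer/float observation above, here is a minimal sketch of inspecting the raw bytes of a number token. It uses only the BCL and does not call into utf8json. `NumberTokens` and `LooksLikeFloat` are hypothetical names, not types from the branch, and treating an exponent (`e`/`E`) as a float marker is an assumption added on top of the decimal point check described above.

```csharp
using System;

// Hypothetical helper, not part of utf8json or the prototype branch.
public static class NumberTokens
{
    // Returns true when the UTF-8 bytes of a JSON number token contain a
    // decimal point or an exponent marker, i.e. the token is not a plain integer.
    // The exponent check is an assumption; the issue only mentions the decimal point.
    public static bool LooksLikeFloat(ArraySegment<byte> token)
    {
        for (var i = 0; i < token.Count; i++)
        {
            var b = token.Array[token.Offset + i];
            if (b == (byte)'.' || b == (byte)'e' || b == (byte)'E')
                return true;
        }
        return false;
    }
}
```

A formatter could call something like this on the segment for the current number token and then pick the integer or double read path accordingly, without allocating a string.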
Benchmarking the `feature/utf8json-serializer` branch against the 6.4.0 NuGet package, deserializing a fixed byte response of 100, 1,000 or 10,000 Stackoverflow questions, produced the following results.
BenchmarkDotNet=v0.11.2.856-nightly, OS=Windows 10.0.17134.285 (1803/April2018Update/Redstone4)
Intel Core i7-4980HQ CPU 2.80GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=2.1.500
[Host] : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT
Job-EXDGCR : .NET Core 2.1.6 (CoreCLR 4.6.27019.06, CoreFX 4.6.27019.05), 64bit RyuJIT
MinInvokeCount=30 MinIterationTime=500.0000 ms Jit=RyuJit
Platform=AnyCpu
100 Stackoverflow questions
Method | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---|---|---|---|---|---|---|---|---|---|---|
Search6x | 1,786.4 us | 15.78 us | 14.76 us | 1,785.3 us | 1.00 | 0.00 | 87.8906 | 42.9688 | - | 540.57 KB |
Search6xAsync | 1,810.0 us | 36.57 us | 50.06 us | 1,792.0 us | 1.02 | 0.03 | 87.8906 | 42.9688 | - | 541.06 KB |
Search6xJsonNetSerializer | 5,557.4 us | 86.19 us | 80.62 us | 5,547.7 us | 3.11 | 0.04 | 554.6875 | 250.0000 | 15.6250 | 3450.89 KB |
Search6xJsonNetSerializerAsync | 4,923.2 us | 101.23 us | 204.48 us | 4,929.8 us | 2.72 | 0.10 | - | - | - | 3451.38 KB |
SearchBleeding | 933.0 us | 23.24 us | 67.43 us | 911.9 us | 0.51 | 0.02 | - | - | - | 679.26 KB |
SearchBleedingAsync | 949.2 us | 24.30 us | 70.88 us | 931.7 us | 0.54 | 0.02 | - | - | - | 679.96 KB |
SearchBleedingJsonNetSerializer | 931.4 us | 22.70 us | 64.76 us | 917.0 us | 0.53 | 0.04 | - | - | - | 679.26 KB |
SearchBleedingJsonNetSerializerAsync | 926.7 us | 25.70 us | 74.97 us | 901.9 us | 0.53 | 0.05 | - | - | - | 679.96 KB |
1000 Stackoverflow questions
Method | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---|---|---|---|---|---|---|---|---|---|---|
Search6x | 19.473 ms | 0.1661 ms | 0.1554 ms | 19.511 ms | 1.00 | 0.00 | 625.0000 | 281.2500 | 62.5000 | 3.71 MB |
Search6xAsync | 15.165 ms | 0.2438 ms | 0.2162 ms | 15.121 ms | 0.78 | 0.01 | - | - | - | 3.71 MB |
Search6xJsonNetSerializer | 50.683 ms | 0.9887 ms | 1.6790 ms | 50.935 ms | 2.63 | 0.09 | 4000.0000 | 1000.0000 | - | 29.6 MB |
Search6xJsonNetSerializerAsync | 50.297 ms | 1.0050 ms | 1.6229 ms | 50.319 ms | 2.56 | 0.10 | 4000.0000 | 1000.0000 | - | 29.6 MB |
SearchBleeding | 8.276 ms | 0.1817 ms | 0.5328 ms | 7.994 ms | 0.43 | 0.03 | - | - | - | 6.38 MB |
SearchBleedingAsync | 7.887 ms | 0.1972 ms | 0.3012 ms | 7.790 ms | 0.41 | 0.02 | - | - | - | 6.38 MB |
SearchBleedingJsonNetSerializer | 8.189 ms | 0.1745 ms | 0.4565 ms | 7.954 ms | 0.42 | 0.02 | - | - | - | 6.38 MB |
SearchBleedingJsonNetSerializerAsync | 7.862 ms | 0.2369 ms | 0.4149 ms | 7.687 ms | 0.40 | 0.02 | - | - | - | 6.38 MB |
10,000 Stackoverflow questions
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---|---|---|---|---|---|---|---|---|---|
Search6x | 203.2 ms | 3.901 ms | 3.257 ms | 1.00 | 0.00 | 6000.0000 | 2000.0000 | - | 36.39 MB |
Search6xAsync | 205.6 ms | 2.221 ms | 2.078 ms | 1.01 | 0.01 | 6000.0000 | 2000.0000 | - | 36.39 MB |
Search6xJsonNetSerializer | 558.1 ms | 8.880 ms | 8.306 ms | 2.75 | 0.06 | 49000.0000 | 8000.0000 | - | 298.77 MB |
Search6xJsonNetSerializerAsync | 564.7 ms | 5.126 ms | 4.544 ms | 2.78 | 0.05 | 49000.0000 | 8000.0000 | - | 298.77 MB |
SearchBleeding | 117.4 ms | 1.359 ms | 1.271 ms | 0.58 | 0.01 | 4000.0000 | 1000.0000 | - | 90.12 MB |
SearchBleedingAsync | 114.2 ms | 1.980 ms | 1.852 ms | 0.56 | 0.01 | 4000.0000 | 1000.0000 | - | 90.12 MB |
SearchBleedingJsonNetSerializer | 118.2 ms | 1.572 ms | 1.471 ms | 0.58 | 0.01 | 4000.0000 | 1000.0000 | - | 90.12 MB |
SearchBleedingJsonNetSerializerAsync | 112.8 ms | 2.244 ms | 1.989 ms | 0.55 | 0.01 | 4000.0000 | 1000.0000 | - | 90.12 MB |
- `Search6x*` methods use the 6.4.0 NuGet package
- `SearchBleeding*` methods use the utf8json branch
A nice advantage of using utf8json as the internal serializer is that the handoff to a custom serializer can be done using a `MemoryStream` constructed from an `ArraySegment<byte>`, avoiding the need to read into a `JToken` and construct a `Stream` from the token, which greatly reduces serialization time and allocations.
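
As a rough sketch of that handoff, the bytes of a sub-document can be exposed as a read-only stream without copying them. This uses only the BCL; `SegmentStreams` and `ToStream` are hypothetical names, and how the branch actually obtains the segment and invokes the custom serializer is not shown here.

```csharp
using System;
using System.IO;

// Hypothetical helper, not taken from the prototype branch.
public static class SegmentStreams
{
    // Wraps the bytes covered by the segment in a read-only MemoryStream.
    // The underlying array is shared, not copied.
    public static MemoryStream ToStream(ArraySegment<byte> segment) =>
        new MemoryStream(segment.Array, segment.Offset, segment.Count, writable: false);
}
```

The resulting stream covers only the segment's bytes (for instance the raw `_source` object), so it could be passed to a custom serializer's stream-based deserialize method without going through an intermediate `JToken`.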
Allocated memory/op
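The allocated memory per op is higher with utf8json than with the 6.4.0 internal serializer at every response size (for example, 90.12 MB versus 36.39 MB for 10,000 questions). To determine whether this was a fixed amount of allocated memory per op, two searches were performed per benchmark method. The amount of allocated memory doubles (for example, `SearchBleeding` goes from 90.12 MB to 180.25 MB), so the extra allocation scales with the number of requests rather than being a one-off fixed cost.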
10,000 Stackoverflow questions with 2 search requests per benchmarked method
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
---|---|---|---|---|---|---|---|---|---|
Search6x | 402.3 ms | 3.155 ms | 2.951 ms | 1.00 | 0.00 | 13000.0000 | 5000.0000 | 1000.0000 | 72.77 MB |
Search6xAsync | 408.8 ms | 11.595 ms | 16.630 ms | 1.03 | 0.05 | 13000.0000 | 5000.0000 | 1000.0000 | 72.78 MB |
Search6xJsonNetSerializer | 1,100.1 ms | 7.118 ms | 6.658 ms | 2.73 | 0.03 | 101000.0000 | 19000.0000 | 2000.0000 | 597.54 MB |
Search6xJsonNetSerializerAsync | 1,037.2 ms | 5.950 ms | 5.566 ms | 2.58 | 0.02 | 100000.0000 | 19000.0000 | 1000.0000 | 597.54 MB |
SearchBleeding | 259.7 ms | 3.799 ms | 3.368 ms | 0.65 | 0.01 | 9000.0000 | 4000.0000 | 1000.0000 | 180.25 MB |
SearchBleedingAsync | 248.5 ms | 3.518 ms | 3.291 ms | 0.62 | 0.01 | 9000.0000 | 4000.0000 | 1000.0000 | 180.25 MB |
SearchBleedingJsonNetSerializer | 260.5 ms | 2.991 ms | 2.652 ms | 0.65 | 0.01 | 9000.0000 | 4000.0000 | 1000.0000 | 180.25 MB |
SearchBleedingJsonNetSerializerAsync | 246.3 ms | 3.146 ms | 2.943 ms | 0.61 | 0.01 | 9000.0000 | 4000.0000 | 1000.0000 | 180.25 MB |