fix(NODE-3451): fix performance regression from v1 #451
Description
NODE-3451 documents a performance regression in the Node.js driver v4, which is actually caused by a performance regression in the js-bson v4 deserialization method (compared to v1).
The notable culprits were:
What changed?
- `deserialize` method has been updated to check `instanceof Buffer` and skip the rewrapping in those instances; this is a temporary measure that only addresses performance for Node.js buffers
- `deserializeStream` was left untouched for scope reasons
- `deserializeObject` method was updated to check for the presence of potential DBRef keys as it goes, removing the negative performance impact for any objects that do not contain any DBRef keys; there is some further optimization that could be done to eliminate the `isDBRefLike` check altogether, but since we expect these to be pretty rare, it didn't seem worth optimizing that specific edge case
- `validateUtf8` method was updated to only run if the `\uFFFD` character is present: technically, this makes the performance worse for strings that do contain that special character; however, for all other strings, the loop over the resulting string with `charCodeAt` is faster; unfortunately there is not much else that can be done to optimize string deserialization without losing the validation (short of doing our own decoding)
- `validateUtf8` call in DBPOINTER type was left untouched for scope reasons

After these changes, there may still be a residual 5% performance degradation for the typical use case relative to v1, which can be attributed to the remaining buffer and string validation.
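For readers who want a concrete picture of the DBRef key tracking and the `\uFFFD` pre-check described above, here is a minimal sketch. The helper names `couldBeDBRef` and `containsReplacementChar` are hypothetical and are not taken from the js-bson source; the sketch also assumes the only DBRef keys of interest are `$ref`, `$id`, and `$db`.

```typescript
// Hypothetical sketch; these names are illustrative, not js-bson's API.

// Assumption: only these "$"-prefixed keys are compatible with a DBRef.
const DBREF_KEYS = /^\$(ref|id|db)$/;

/**
 * Tracks, key by key, whether an object could still be a DBRef.
 * Returns true only if at least one "$"-prefixed key was seen and
 * every "$"-prefixed key was one of $ref, $id, $db.
 * A full DBRef check would still be needed when this returns true.
 */
function couldBeDBRef(keys: string[]): boolean {
  let possible: boolean | null = null; // null: no "$" key seen so far
  for (const key of keys) {
    // Once ruled out, stay ruled out; otherwise only inspect "$" keys.
    if (possible !== false && key.charAt(0) === '$') {
      possible = DBREF_KEYS.test(key);
    }
  }
  return possible === true;
}

/**
 * Returns true if the decoded string contains U+FFFD, in which case
 * a strict byte-level UTF-8 validation would be worth running.
 */
function containsReplacementChar(decoded: string): boolean {
  for (let i = 0; i < decoded.length; i++) {
    if (decoded.charCodeAt(i) === 0xfffd) {
      return true;
    }
  }
  return false;
}
```

In this sketch, ordinary documents never reach the expensive path: an object with no `$`-prefixed keys is ruled out as a DBRef without a separate pass over its keys, and a string without U+FFFD skips byte-level validation entirely.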