Performance issues with TypedData._getX/_setX operations #33205
Rationale: Some native typed setters/getters in the typed_data library are only called from clients that perform explicit bounds checks first. Under strongly typed Dart2, there is no need to repeat these tests. Avoids overhead, and reduces the polymorphicness of these calls (preparing more inlining later).
#33205
Tests:
  ./tools/test.py --mode release --arch x64 -r vm
  tools/test.py -m release -r vm -c dartk --strong standalone_2/typed_data_test
Change-Id: I81771a4f4b41a21e344f2d50745bbc30480b2a6c
Reviewed-on: https://dart-review.googlesource.com/57460
Commit-Queue: Aart Bik
Reviewed-by: Vyacheslav Egorov
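The pattern this commit relies on can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: `Int16ViewSketch` and `_getInt16Unchecked` are hypothetical names, and `ByteData` merely stands in for the native accessor (it still performs its own check, which the real unchecked native would omit).

```dart
import 'dart:typed_data';

/// Hypothetical view class; not the actual dart:typed_data implementation.
class Int16ViewSketch {
  Int16ViewSketch(this._bytes, this._offsetInBytes, this.length);

  final ByteData _bytes; // stand-in for the underlying typed data
  final int _offsetInBytes;
  final int length;

  int operator [](int index) {
    // The client performs the bounds check exactly once, up front.
    if (index < 0 || index >= length) {
      throw RangeError.range(index, 0, length - 1, 'index');
    }
    // The callee can therefore assume the offset is valid.
    return _getInt16Unchecked(_offsetInBytes + index * 2);
  }

  // Hypothetical stand-in for a native getter that skips its own check.
  int _getInt16Unchecked(int byteOffset) =>
      _bytes.getInt16(byteOffset, Endian.host);
}

void main() {
  final data = ByteData(8)..setInt16(2, -7, Endian.host);
  final view = Int16ViewSketch(data, 0, 4);
  print(view[1]); // prints -7
}
```

Because the check lives in one place, the accessor body shrinks to a plain load, which is what makes it a better inlining candidate.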
Rationale: Always inline int convertors that don't do much more than testing and anding. For example, force inline
  v26 <- StaticCall:66( _toUint8@6027147<0> v4 T{Type: class 'int'?}) T{Type: class 'int'?}
to
  v47 <- Constant(#255)
  ..
  CheckSmi:10(v4 T{Type: class 'int'?})
  v45 <- BinarySmiOp:10(&, v4 T{_Smi}, v47 T{_Smi}) T{_Smi}
#33205
Change-Id: I595d9a64365e16ae244480b5e27f8be23c43d164
Reviewed-on: https://dart-review.googlesource.com/58061
Commit-Queue: Aart Bik
Reviewed-by: Aart Bik
Reviewed-by: Vyacheslav Egorov
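At the Dart level, the conversion being inlined amounts to masking with 255, as the `Constant(#255)` and `BinarySmiOp(&)` instructions suggest. The sketch below uses a hypothetical `toUint8Sketch` helper (the real `_toUint8` is private to the SDK) and compares it with what a `Uint8List` store actually keeps.

```dart
import 'dart:typed_data';

// Hypothetical stand-in for the private _toUint8 helper: keep the low 8 bits.
int toUint8Sketch(int value) => value & 0xFF;

void main() {
  final list = Uint8List(1);
  for (final v in [0, 255, 256, 257, -1]) {
    list[0] = v; // the store truncates to the low 8 bits
    print('$v -> ${list[0]} (mask gives ${toUint8Sketch(v)})');
  }
}
```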
Rationale: Handles remaining polymorphic reason for typed_data setters and getters (internal vs. external) during inlining. Also introduces high level flow graph utilities that can be reused throughout the compiler to reduce future code duplication. Disables type speculation for 64-bit AOT Dart2 to make all work.
Performance: About 4x speedup on micro benchmarks (AOT64).
#33205
Change-Id: I678426719e49cd8aa1e5051523da12178120b3ba
Reviewed-on: https://dart-review.googlesource.com/59000
Reviewed-by: Vyacheslav Egorov
Reviewed-by: Alexander Markov
Commit-Queue: Aart Bik
Timings of view cases on AOT 64-bit have substantially improved. On my desktop:
  user 0m0.980s
Still a few i's to dot before we can close this.
We should check whether ARM32 saw any improvement at all and, if not, see what is missing to get better code (e.g. be better at choosing the non-speculative path on 32-bit platforms?).
Rationale: Rather than forced inlining of clamped convertors (the "saturated" method _toClampedUint8() from dart:typed_data), exposing the fact that it always returns smi values (since they check fail on null inputs) yields much better Uint8ClampedListView performance (it avoids re-compilation due to a speculative CheckSmi).
Note: In the long run, we may still want them inlined and improve range analysis to deal with clamping.
Performance: About 2.8x faster than previous optimized version, about 4.5x faster than original.
#33205
Change-Id: I86a06525d2f2ea0476effd3c3d856ff8d9ab1d87
Reviewed-on: https://dart-review.googlesource.com/60201
Commit-Queue: Aart Bik
Reviewed-by: Alexander Markov
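For readers unfamiliar with the clamped conversion, here is a rough sketch of its observable behavior. `toClampedUint8Sketch` is a hypothetical stand-in for the private `_toClampedUint8()`; the point is that the result always lies in 0..255, so it always fits in a smi.

```dart
import 'dart:typed_data';

// Hypothetical stand-in for _toClampedUint8: saturate into the range 0..255.
int toClampedUint8Sketch(int value) {
  if (value < 0) return 0;
  if (value > 0xFF) return 0xFF;
  return value;
}

void main() {
  final list = Uint8ClampedList(1);
  for (final v in [-5, 0, 128, 255, 300]) {
    list[0] = v; // the store saturates instead of truncating
    print('$v -> ${list[0]} (sketch gives ${toClampedUint8Sketch(v)})');
  }
}
```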
The i's to dot:
Rationale: With limited integers, signed/unsigned 64-bit int typed data is just a bit pattern, no need to sign/zero extend these into bigger ints on load. This enables more intrinsification of indexed stores/loads.
Note: Still TBD, inline these indexed operations too.
Performance: About 10x improvement on micro benchmarks.
#33205
Change-Id: I640c324a7d91e57fb4edc025e0dd456ad34fe906
Reviewed-on: https://dart-review.googlesource.com/60403
Reviewed-by: Alexander Markov
Commit-Queue: Aart Bik
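The "just a bit pattern" observation can be checked from Dart code. The sketch below assumes the native Dart VM, where `int` is a 64-bit two's-complement value; the helper name `sameBitsInBothViews` is made up for illustration.

```dart
import 'dart:typed_data';

// Writes [value] through a signed 64-bit view and reads it back through an
// unsigned 64-bit view of the same buffer. With 64-bit ints, both views
// yield the same Dart int because no widening happens on load.
bool sameBitsInBothViews(int value) {
  final buffer = Uint8List(8).buffer;
  final signedView = Int64List.view(buffer);
  final unsignedView = Uint64List.view(buffer);
  signedView[0] = value;
  return unsignedView[0] == signedView[0];
}

void main() {
  for (final v in [0, 1, -1, 0x7FFFFFFFFFFFFFFF]) {
    print('$v: ${sameBitsInBothViews(v)}');
  }
}
```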
Rationale: This CL finalizes recent improvements by allowing inlining 64-bit and double getter/setter operations. The runtime over all data types (with and without view) now no longer has outliers.
Note: 64-bit targets only, 32-bit targets still tbd.
Performance: About 4x improvement on micro benchmarks.
#33205
Change-Id: Ic82fa24167a68e3c196edf4843f0829c7fbcf9e1
Reviewed-on: https://dart-review.googlesource.com/60451
Commit-Queue: Aart Bik
Reviewed-by: Vyacheslav Egorov
Reviewed-by: Alexander Markov
Internal benchmark shows "flat" behavior now for all typed data setters and getters on 64-bit AOT. The last item to look at is doing a bit more on 32-bit as well (AOT vs. JIT differences are tracked elsewhere).
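The internal benchmark is not shown in this thread. As a rough idea of how one might compare element access across typed data kinds, here is a hypothetical micro benchmark sketch; `fillAndSum`, the buffer size, and the round count are all assumptions. Note that passing the lists as `List<int>` makes the call site polymorphic, so this measures something coarser than the per-type cases discussed above.

```dart
import 'dart:typed_data';

// Writes and reads every element [rounds] times and returns a checksum so
// the loop cannot be optimized away entirely.
int fillAndSum(List<int> list, int rounds) {
  var sum = 0;
  for (var r = 0; r < rounds; r++) {
    for (var i = 0; i < list.length; i++) {
      list[i] = i & 0x7F;
      sum += list[i];
    }
  }
  return sum;
}

void main() {
  final backing = Uint8List(1 << 16);
  final cases = <String, List<int>>{
    'Uint8List': backing,
    'Uint8List.view': Uint8List.view(backing.buffer),
    'Int32List.view': Int32List.view(backing.buffer),
    'Int64List.view': Int64List.view(backing.buffer),
  };
  cases.forEach((name, list) {
    final watch = Stopwatch()..start();
    final checksum = fillAndSum(list, 100);
    watch.stop();
    print('$name: ${watch.elapsedMicroseconds} us (checksum $checksum)');
  });
}
```

"Flat" behavior would show up here as similar per-element timings for every case, but a single run like this is noisy and should not be read as a measurement.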
Rationale: All recent improvements, now for 32-bit too.
Performance: Many large improvements on micro benchmarks. Meteor down as expected.
#33205
Change-Id: Ie9ebcfdfe9c5e265595c95d5e943ae35c5700a97
Reviewed-on: https://dart-review.googlesource.com/63685
Commit-Queue: Aart Bik
Reviewed-by: Vyacheslav Egorov
These functions only have a full native implementation in the runtime, and the inliner only supports inlining them if the receiver type is known.
The main reason for this restriction is that they are polymorphic with respect to the receiver type:
[...] `lengthInBytes`, which makes these operations statically polymorphic, as `lengthInBytes` is not stored in `TypedData` and instead needs to be computed based on the receiver class.
[...] `TypedData`.
The inability to inline these methods is most visible in AOT code that works with typed data views:
where we always end up calling the runtime:
The list of things we could improve:
Longer term, we should investigate whether we could normalize the representation of views in such a way that a view into external typed data is no different from a view into non-external typed data.
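To illustrate why `lengthInBytes` depends on the receiver class, the sketch below recomputes it from the element count and the per-class element size using only the public `dart:typed_data` API. The helper name `computedLengthInBytes` is hypothetical, and this is only an illustration of the dependency, not how the VM implements it.

```dart
import 'dart:typed_data';

// Recomputes lengthInBytes from the element count and the element size,
// which differs per typed data class (1 for Uint8List, 8 for Float64List).
int computedLengthInBytes(List<num> list) {
  // Every list in dart:typed_data also implements TypedData.
  final data = list as TypedData;
  return list.length * data.elementSizeInBytes;
}

void main() {
  final samples = <List<num>>[
    Uint8List(3),
    Int32List(3),
    Float64List(3),
    Int16List.view(Uint8List(8).buffer, 2, 3),
  ];
  for (final list in samples) {
    final reported = (list as TypedData).lengthInBytes;
    print('${list.runtimeType}: computed ${computedLengthInBytes(list)}, '
        'reported $reported');
  }
}
```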