Skip to content

avoid checked entry points in closures #40813

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mraleph opened this issue Feb 28, 2020 · 10 comments
Closed

avoid checked entry points in closures #40813

mraleph opened this issue Feb 28, 2020 · 10 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-performance Issue relates to performance or code size vm-aot-code-size Related to improvements in AOT code size

Comments

@mraleph
Copy link
Member

mraleph commented Feb 28, 2020

Given the code like this:

void main(List<String> args) {
  // Condition here just to avoid inlining of the closure
  int Function(int) x = args.length > 0 ? (x) => x + 1 : (x) => x + 2;
  x(args.length);
}

we produce the following graph for the closure entry:

After AllocateRegisters
==== file:///tmp/x.dart_::_main_<anonymous closure>
  0: B0[graph]:0 {
      v0 <- Constant(#null) T{Null?}
      v1 <- Constant(#<optimized out>) T{_OneByteString}
      v6 <- Constant(#2) [2, 2] T{_Smi}
      v17 <- Constant(#0) [0, 0] T{_Smi}
      v26 <- Constant(#1) [1, 1] T{_Smi}
}
  2: B1[function entry]:2 {
      v31 <- Parameter(0) T{*?}
      v32 <- Parameter(1) T{*?}
      v33 <- SpecialParameter(ArgDescriptor) T{_ImmutableList}
}
  3:     ParallelMove rax <- S+2
  4:     ParallelMove rax <- C, rbx <- rax, r10 <- r10 goto:64 B12
  6: B13[function entry]:66 {
      v2 <- Parameter(0) T{*?}
      v3 <- Parameter(1) T{int?}
      v4 <- SpecialParameter(ArgDescriptor) T{_ImmutableList}
}
  7:     ParallelMove rax <- S+2
  8:     ParallelMove rax <- C, rbx <- rax, r10 <- r10 goto:68 B12
 10: B12[join]:62 pred(B1, B13) {
      v13 <- phi(v17, v6) alive [0, 2] T{_Smi}
      v9 <- phi(v32, v3) alive T{*?}
      v11 <- phi(v33, v4) alive T{_ImmutableList}
}
 11:     ParallelMove S-1 <- rbx
 12:     v15 <- LoadField(v11 . ArgumentsDescriptor.type_args_len {final}) [0, 4611686018427387903] T{_Smi}
 14:     Branch if StrictCompare:8(===, v15, v17) goto (3, 5)
 16: B3[target]:12
 18:     v18 <- LoadField(v11 . ArgumentsDescriptor.count {final}) [0, 4611686018427387903] T{_Smi}
 20:     Branch if StrictCompare:22(===, v18, v6) goto (7, 8)
 22: B7[target]:26
 24:     v20 <- LoadField(v11 . ArgumentsDescriptor.positional_count {final}) [0, 4611686018427387903] T{_Smi}
 26:     Branch if StrictCompare:30(===, v18, v20) goto (11, 10)
 28: B11[target]:34
 30:     Branch if StrictCompare:70(===, v13, v6) goto (14, 15)
 32: B14[target]:74
 34:     ParallelMove rcx <- rbx goto:82 B16
 36: B15[target]:76
 37:     ParallelMove rax <- rbx, rdx <- C, rcx <- C
 38:     AssertAssignable:52(v9, int, 'x', instantiator_type_args(v0), function_type_args(v0)) T{int?}
 40:     ParallelMove rcx <- S-1 goto:80 B16

Notice that checked and unchecked entries merge first and then we perform argument descriptor checks and AssertAssignable against parameter. Though these are guarded by check against an integer which encodes which entry we arrived from.

There are two issues here:

  • Performance: we should not actually need to check arguments descriptor - nor type check the parameter on unchecked entry. Dart 2 type system guarantees that signature matches on typed calls to a closure. This can be addressed by moving this code around in the graph. Note: we don't actually type check the arguments when entering from unchecked entry, I made a mistake reading the graph.
  • Code size: this prologue adds up to significant code in at the start of each function. For example on X64 the function just returning x + 1 ends up being 268 bytes.

I propose to change to the implementation which penalises dynamic invocations of closures in attempt to save on code size: we already emit direct invocations of function body at all typed call sites. So all dynamic invocations should be going through Closure.call method, which means we can just change the body of that method to perform necessary checks against closure signature (which should be available for type checks anyway).

/cc @alexmarkov @mkustermann @askeksa-google

@mraleph mraleph added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-performance Issue relates to performance or code size vm-aot-code-size Related to improvements in AOT code size labels Feb 28, 2020
@mraleph
Copy link
Member Author

mraleph commented Feb 28, 2020

/cc @sjindel-google

@sjindel-google sjindel-google self-assigned this Mar 2, 2020
@mraleph
Copy link
Member Author

mraleph commented Mar 25, 2020

Note: I misread the graph above - we actually have a SSA definition which tells us which entry point we came from and we skip AssertAssignable-s is we came from unchecked entry point. So I would not expect massive performance improvements from removing checked entry points.

@mkustermann
Copy link
Member

As mentioned offline, maybe a good first step would be to make a (semantically unsound) change to a) not do any checks in the closure prologues and b) have only one entry point and see how big the code size would reduce for gallery.

@sstrickl
Copy link
Contributor

Applied the following patch, which causes argument type checks to be omitted (which has the side effect of also making it a single entry point):

diff --git a/runtime/vm/object.h b/runtime/vm/object.h
index fbeb2c80e1..c2776afa86 100644
--- a/runtime/vm/object.h
+++ b/runtime/vm/object.h
@@ -2774,7 +2774,7 @@ class Function : public Object {
   // Whether this function can receive an invocation where the number and names
   // of arguments have not been checked.
   bool CanReceiveDynamicInvocation() const {
-    return IsClosureFunction() || IsFfiTrampoline();
+    return IsFfiTrampoline();
   }
 
   bool HasThisParameter() const {
@@ -2844,7 +2844,7 @@ class Function : public Object {
     if (!I->should_emit_strong_mode_checks()) {
       return false;
     }
-    return IsClosureFunction() ||
+    return !IsClosureFunction() &&
            !(is_static() || (kind() == RawFunction::kConstructor));
   }

Numbers for Flutter gallery:

CPU Benchmark mode Size comparison Change (percentage)
arm7 release instructions sections -6.20%
uncompressed app.so size -4.10%
brotli app.so size -2.40%
release-sizeopt instructions sections -6.37%
uncompressed size -4.71%
brotli size -2.50%
arm8 release instructions sections -6.40%
uncompressed app.so size -3.93%
brotli app.so size -2.38%
release-sizeopt instructions sections -6.39%
uncompressed size -4.54%
brotli size -3.26%

So this gives us an approximation of the possible savings by the approach suggested by @mraleph .

@sstrickl
Copy link
Contributor

sstrickl commented Apr 24, 2020

Also, on ProductX64, the closure returning x+1 is 107 bytes with the above patch applied versus the now 274 bytes on master.

@sstrickl
Copy link
Contributor

My current plan is to generalize our type-related assert instructions (AssertAssignable, AssertSubtype) to take Values instead of AbstractTypes. This will be done in two steps:

  1. We generalize the instruction, but make the non-constant case an error. Since no code currently generates the non-constant case, no issues should be seen from this change. There should also be no change in code generation in the constant case, so no code size/speed regressions.
  2. We flesh out the case where the types are not constants, and so must be fully checked at runtime.

Once step 1 is in place for each, we will make a single dynamic invocation forwarder for _Closure.call that pulls the type information out of the closure value at runtime and uses the non-constant version of these instructions. (Step 2 will likely be done concurrently with this, so we also introduce uses of the non-constant case at the same time as the code that implements it and existing tests for dynamic closure calling will test the non-constant code generation.)

(Step 1 is already done for AssertAssignable in CL 146801, but I want to describe the plan here so I can point TODOs for step 2 for it and AssertSubtype here until they're completed.)

dart-bot pushed a commit that referenced this issue May 12, 2020
Previously, AssertAssignable instructions could only be created with a
compile-time AbstractType value for dst_type. Relax this so that
AssertAssignable instructions can be created with a possibly
non-constant Value instead.

This is only an IL-level change for the instruction. Code generation for
AssertAssignable still requires the dst_type Value to be bound to a
constant and is otherwise unchanged from before, thus this should have
no impact on size or speed of generated code. (Generating the
non-constant case will be done in follow-up work.)

Bug: #40813

Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-release-x64-try,vm-kernel-precomp-linux-product-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-linux-release-simarm-try,vm-kernel-precomp-linux-release-simarm_x64-try,vm-kernel-precomp-linux-release-simarm64-try,vm-dartkb-linux-release-simarm64-try,vm-dartkb-linux-release-x64-try
Bug: #40813
Change-Id: I1f984039d8c2695c66161a69a3d80cfbe7110f19
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/146801
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Vyacheslav Egorov <[email protected]>
dart-bot pushed a commit that referenced this issue May 15, 2020
Previously, AssertSubtype instructions could only be created with a
compile-time AbstractType value for sub_type and super_type. Relax this
so that AssertSubtype instructions can be created with possibly
non-constant Values instead.

Currently, all AssertSubtype uses still have constant sub- and
supertypes, so for now we check at code generation that we indeed
have constants for these values and only handle the constant case. Thus,
there should be no impact on size or speed of the generated code from
these changes. Follow-up work will add code generation for non-constant
sub- and supertypes and add uses the generalized form.

Bug: #40813

Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-release-x64-try,vm-kernel-precomp-linux-product-x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-linux-release-simarm-try,vm-kernel-precomp-linux-release-simarm_x64-try,vm-kernel-precomp-linux-release-simarm64-try
Change-Id: Ib1512b3e07a016d68a8e0c670ce857b8ac1b2777
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/147520
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
dart-bot pushed a commit that referenced this issue Jul 2, 2020
Adds TODO comments in appropriate places for future work that will move
non-covariant type checks out of the closure body. Instead, the VM will
perform them in the invoke field dispatcher (or NoSuchMethodFromCallStub
if --no-lazy-dispatchers is used) when a dynamic call is detected.

This change has minimal negative effects on the code size. Here are the
code size change percentages for the Flutter Gallery in release mode:

* ARM7
  * Instructions: +0.0391%
  * ROData: -0.0040%
  * Total: +0.0239%
* ARM8:
  * Instructions: No change
  * ROData: +0.0015%
  * Total: +0.0004%

All other code size benchmarks are also <0.01% increase.

Bug: #40813
Change-Id: I4bf145803bb9e2d4ba5c22c12b6fd3bb5368441d
Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-release-x64-try,vm-kernel-precomp-nnbd-linux-release-x64-try,vm-dartkb-linux-release-x64-try
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/151826
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Alexander Markov <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
dart-bot pushed a commit that referenced this issue Jul 16, 2020
This change only affects compilation when running in non-precompiled
mode with --no-lazy-dispatchers enabled.

Instead of always compiling in non-covariant checks, even for closures
not called dynamically, remove the non-covariant checks from the closure
and instead do the non-covariant checks for dynamic calls during the
NoSuchMethodForCallStub fallback by calling
Function::DoArgumentTypesMatch.

Adds two overloads for Function::DoArgumentTypesMatch, one which takes a
function type argument vector and one which takes neither an
instantiator type argument vector or a function type argument vector.
For the versions that are not explicitly passed a type argument vector,
an appropriate one is constructed using the arguments. If there is not
enough information in the arguments, then we fall back to assuming the
empty type argument vector for the instantiator case and instantiating
to bounds in the function type argument case.

Fixes Function::DoArgumentTypesMatch to handle generic functions and to
check arguments appropriately according to the active null safety mode.
For generic functions, the provided or resulting function type vector
has non-covariant checks performed against the type parameter bounds.

This change uncovered one test that was incorrectly passing in strong
mode, see #42688 for details.

Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-release-x64-try,vm-dartkb-linux-release-x64-try
Change-Id: I5030a46b356778448b51a243f16406eacf1c04bf
Bug: #40813
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/153202
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Régis Crelier <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
dart-bot pushed a commit that referenced this issue Jul 16, 2020
The VM only does this when the callable function does not expect dynamic
invocations. Otherwise, performing the checks would be redundant, as the
function body already contains the appropriate non-covariant checks.

Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-release-x64-try,vm-dartkb-linux-release-x64-try
Bug: #40813
Change-Id: I27b5f4067247c6b1b76ba57951a1d91455c2eeb5
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/154463
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
dart-bot pushed a commit that referenced this issue Jul 16, 2020
Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-debug-x64-try
Bug: #40813
Change-Id: I36e7150a4450be52e563df459eb47abb69d59281
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/154686
Reviewed-by: Alexander Aprelev <[email protected]>
Commit-Queue: Tess Strickland <[email protected]>
dart-bot pushed a commit that referenced this issue Jul 21, 2020
Also relands the followup CLs:
"Perform non-covariant checks when dynamically invoking callables."
"Use AreValidArguments so that names are checked as well."

Original description of first CL:

This change only affects compilation when running in non-precompiled
mode with --no-lazy-dispatchers enabled.

Instead of always compiling in non-covariant checks, even for closures
not called dynamically, remove the non-covariant checks from the closure
and instead do the non-covariant checks for dynamic calls during the
NoSuchMethodForCallStub fallback by calling
Function::DoArgumentTypesMatch.

Adds two overloads for Function::DoArgumentTypesMatch, one which takes a
function type argument vector and one which takes neither an
instantiator type argument vector or a function type argument vector.
For the versions that are not explicitly passed a type argument vector,
an appropriate one is constructed using the arguments. If there is not
enough information in the arguments, then we fall back to assuming the
empty type argument vector for the instantiator case and instantiating
to bounds in the function type argument case.

Fixes Function::DoArgumentTypesMatch to handle generic functions and to
check arguments appropriately according to the active null safety mode.
For generic functions, the provided or resulting function type vector
has non-covariant checks performed against the type parameter bounds.

This change uncovered one test that was incorrectly passing in strong
mode, see #42688 for details.

Original description of second CL:

The VM only does this when the callable function does not expect dynamic
invocations. Otherwise, performing the checks would be redundant, as the
function body already contains the appropriate non-covariant checks.

Third CL had no additional description.

Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-debug-x64-try,vm-dartkb-linux-release-x64-try,vm-kernel-reload-linux-release-x64-try, vm-kernel-reload-rollback-linux-debug-x64-try
Bug: #40813
Change-Id: I1a3e9c1865103a8d716e1cad814267caffaaadf2
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/154688
Reviewed-by: Martin Kustermann <[email protected]>
dart-bot pushed a commit that referenced this issue Aug 17, 2020
This CL performs the following checks in the invoke field dispatcher for
dynamic closure calls when lazy dispatchers are enabled:

* The provided function type arguments vector (if any) has the correct
  length.

* No function type arguments should be provided if the closure has
  delayed type arguments.

* All required arguments (positional in all modes, named in appropriate
  null safety modes) have been provided by the caller.

* If there are optional positional arguments, an appropriate number
  has been provided.

* If there are optional named arguments, their names are valid.

Since the runtime already handles checking the argument shapes when lazy
dispatchers are disabled, these checks are now completely removed from
closure bodies in all cases. Thus, the only remaining checks in closure
bodies are the type checks performed by AssertSubtype and
AssertAssignable when lazy dispatchers are enabled.

Changes in the Flutter Gallery:

* ARM7, release: -3.61% instructions, -2.19% total
* ARM7, sizeopt: -3.62% instructions, -2.55% total
* ARM8, release: -3.66% instructions, -1.98% total
* ARM8, sizeopt: -3.65% instructions, -2.37% total

Most of these changes are already exercised by existing tests such as
(but not limited to):

* corelib{,_2}/dynamic_nosuchmethod_test
* language{,_2}/call/call_test
* language{,_2}/closure/tearoff_dynamic_test
* language{,_2}/generic/function_bounds_test
* language{,_2}/parameter/named_with_conversions_test
* language{,_2}/vm/no_such_args_error_message_vm_test

I've added one test to specifically check the interaction between
dynamic calls and required named parameters. There is some coverage in
other NNBD tests, but those are not directly focused on testing this
specifically.

Other changes:

* Adds initial cached ranges for certain BinarySmiOp and ShiftIntegerOp
  instructions when the RHS is a constant, to avoid false negatives for
  deoptimization and throw checks prior to range analysis.

* Adds new slots for various Function fields.

* Adds the ability to define unboxed native slots, which are always
  unboxed after retrieval even in unoptimized code. In the first
  iteration, the backend only handles loads from Uint32 unboxed native
  slots. Part of #42793.

* Removed the special handling for loading from non-nullable int fields
  in AOT compilation. Instead, their treatment is unified with the
  treatment of the new unboxed native fields, since the source field is
  always unboxed and the result of the load is also always unboxed, as
  code involving them is always optimized.

Bug: #40813
Change-Id: Ia02aa3e872c1fefd906fd67b55021ea1797556e4
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/155604
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Alexander Markov <[email protected]>
@sstrickl
Copy link
Contributor

As expected, we saw regressions after 7874462 in benchmarks that are using dynamic closure calls (and improvements in ones that don't). Given that, I plan to check into the regressions as follows:

  • Verify all negatively-affected benchmarks are only due to explicit uses of dynamic closure calls.
    • If any unexpected regressions are found, investigate what is causing the regression.
  • For regressed benchmarks that are not meant to explicitly benchmark dynamic calls, create typed or non-dynamic versions of the benchmarks.
  • Investigate the generated code for dynamic closure call dispatchers for improvements.

dart-bot pushed a commit that referenced this issue Aug 28, 2020
Bug: #40813
Change-Id: Ifdb7dfef4fc581c14cb570a534a64d73f5be9449
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160726
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Jonas Termansen <[email protected]>
dart-bot pushed a commit that referenced this issue Sep 7, 2020
Arguments descriptors for invocation dispatchers are loaded as
constants.  However, we only optimize type argument length field loads
from constant argument descriptors, so the prologue code is less
efficient than it could be in these cases.

Add the appropriate cases for pulling out the count, positional count,
and size as constants from constant argument descriptors in
LoadFieldInstr::TryEvaluateLoad, which is enough to let the optimizing
compiler tune this code appropriately.

Also remove the code for handling dynamic closure call dispatchers
specially within the prologue builder and instead explicitly generate
any checks that use the runtime closure value in the dynamic closure
call fragment builders after the prologue.

Bug: #40813
Change-Id: Ie9bd4c01c3b604f282e9b2a72327b7fc81dc4eaf
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/159905
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Alexander Markov <[email protected]>
dart-bot pushed a commit that referenced this issue Sep 8, 2020
Also replace any null flag entries left in the parameter names array to
zero Smi values, so that after truncation, no null checks are needed for
retrieved flag entries.

Now that all appropriate places truncate the parameter names array after
setting any required bits, dynamic closure call dispatchers can simply
check for any post-name flag entries to detect missing required named
parameters if no named arguments were passed.

Without the added truncate calls, then
vm/dart/minimal_kernel_bytecode_test fails when dynamic closure call
dispatchers use that shortcut as seen on patchset 1 of CL 160725.

Bug: #40813
Change-Id: I25abd23ad3ce42b03ec5ca29b4e067c162826066
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160860
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Alexander Markov <[email protected]>
dart-bot pushed a commit that referenced this issue Sep 9, 2020
As a preparation for being able to dynamically call TTS on any type object we ensure that
also in the non-dynamic AssertAssignables we always use TTS.

Issue #40813

Change-Id: Ia7a2aa98eff940e4ead28f6158f13c084ecb3818
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/162008
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Aske Simon Christensen <[email protected]>
Reviewed-by: Tess Strickland <[email protected]>
dart-bot pushed a commit that referenced this issue Sep 9, 2020
…be called indirectly

Right now AssertAssignable will not call the TTS of a TypeParameter but
instead load the value for the type parameter inline and then call it's
TTS.

To prepare for future changes where we want to call TTS of an arbitrary
type object, we need to be able to install a functioning TTS on
TypeParameter, which this CL does.

Issue #40813

Change-Id: I0792cc1a796fe8afb2167b7665580b3f1385005b
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/162007
Commit-Queue: Martin Kustermann <[email protected]>
Reviewed-by: Tess Strickland <[email protected]>
dart-bot pushed a commit that referenced this issue Sep 10, 2020
Instead of possibly looping over the closure function parameter names
array multiple times, just do it once. Check each entry against the
provided argument names. If none match, check the required bit (if in
null-safe mode). Also keep a count of matched names to check that all
provided names matched after the iteration.

Now that optional name checking is part of the fragment built by
BuildClosureCallArgumentsValidCheck, that method builds checks for
exactly the same things as its namesake in Function.

Also refactor the variables used by the checker into read/write ones
that need to be allocated in the parsed function and read-only ones
that can just be temporaries since no Phi nodes are needed.

Cq-Include-Trybots: luci.dart.try:vm-kernel-linux-release-x64-try,vm-kernel-precomp-linux-product-x64-try,vm-kernel-linux-debug-x64-try,vm-kernel-precomp-linux-release-simarm_x64-try,vm-kernel-linux-release-simarm-try,vm-kernel-linux-release-simarm64-try
Bug: #40813
Change-Id: I3cb421dd538629d7f5499f3bbf0653d34b850dce
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/160725
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Daco Harkes <[email protected]>
dart-bot pushed a commit that referenced this issue Sep 14, 2020
Current users of InstantiateTypeArguments have the uninstantiated type
arguments available at compile time. After upcoming work on #40813,
there are cases where the generated code needs to instantiate type
arguments that are only available at runtime.

This work changes the uninstantiated type arguments for the
InstantiateTypeArguments instruction to a runtime Value pointer instead
of a compile time TypeArguments reference. The changes beyond that are
limited to only what is needed to get current uses working, with ASSERTs
checking that current invariants still hold. The rest of the changes
needed for dynamic uninstantiated type arguments will come later.

Bug: #40813
Cq-Include-Trybots: luci.dart.try:vm-kernel-linux-release-x64-try,vm-kernel-precomp-linux-product-x64-try,vm-kernel-linux-debug-x64-try,vm-kernel-precomp-linux-release-simarm_x64-try,vm-kernel-linux-release-simarm-try,vm-kernel-linux-release-simarm64-try
Change-Id: Iafe836cfd8744d3e835e60112a4f099e631f4782
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/162183
Reviewed-by: Martin Kustermann <[email protected]>
Commit-Queue: Tess Strickland <[email protected]>
dart-bot pushed a commit that referenced this issue Oct 26, 2020
These checks are now performed in the dynamic closure call dispatcher.
To avoid having to create type argument vectors (TAVs) for default type
arguments at runtime, we cache a compile-time created TAV in the
ClosureData for the closure function which is retrieved by the
dispatcher when needed.

We also keep an associated packed field of information that can also be
determined at compile time:

* Whether the cached TAV needs instantiation or can share its
  instantiator or function type arguments.

* The number of parent type parameters.

The former allows the generated IL to keep the invariant that the
InstantiateTypeArguments instruction (and the runtime entry it calls) is
only used for uninstantiated TAVs.

Also changes the destination name to an Value input for the
AssertSubtype instruction and adds handling for non-constant types and
names in that instruction's backend.

Additional changes:

* Adds new slots for ClosureData, Function, and TypeArguments.

* Adds a new kUnboxedUint8 representation (needed for
  TypeParameterLayout::flags_, which is of type uint8_t).

  * Extends LoadField to handle uint8_t unboxed native fields.

  * Adds BoxUint8 for boxing unboxed uint8_t values.

Code size impact on Flutter gallery in release mode:

* arm7: total +0.08%, vmisolate: +0.56%, isolate: +0.42%,
        readonly: -0.04%, instructions: -0.03%

* arm8: total +0.12%, vmisolate: +0.56%, isolate: +0.42%,
        readonly: -0.002%, instructions: +0.03%

Cq-Include-Trybots: luci.dart.try:vm-kernel-linux-debug-ia32-try,vm-kernel-linux-debug-x64-try,vm-kernel-nnbd-linux-debug-ia32-try,vm-kernel-nnbd-linux-debug-x64-try,vm-kernel-precomp-linux-debug-simarm_x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-nnbd-linux-debug-simarm_x64-try,vm-kernel-precomp-nnbd-linux-debug-x64-try,vm-kernel-linux-release-simarm-try,vm-kernel-linux-release-simarm64-try,vm-kernel-nnbd-linux-release-simarm-try,vm-kernel-nnbd-linux-release-simarm64-try,vm-kernel-precomp-linux-release-simarm-try,vm-kernel-precomp-linux-release-simarm64-try,vm-kernel-precomp-nnbd-linux-release-simarm64-try
Bug: #40813
Change-Id: I5a7de27a17e3119e27752bd0d10e1c6bc1b52a16
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/158844
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Régis Crelier <[email protected]>
tekknolagi pushed a commit to tekknolagi/dart-assembler that referenced this issue Nov 3, 2020
These checks are now performed in the dynamic closure call dispatcher.
To avoid having to create type argument vectors (TAVs) for default type
arguments at runtime, we cache a compile-time created TAV in the
ClosureData for the closure function which is retrieved by the
dispatcher when needed.

We also keep an associated packed field of information that can also be
determined at compile time:

* Whether the cached TAV needs instantiation or can share its
  instantiator or function type arguments.

* The number of parent type parameters.

The former allows the generated IL to keep the invariant that the
InstantiateTypeArguments instruction (and the runtime entry it calls) is
only used for uninstantiated TAVs.

Also changes the destination name to an Value input for the
AssertSubtype instruction and adds handling for non-constant types and
names in that instruction's backend.

Additional changes:

* Adds new slots for ClosureData, Function, and TypeArguments.

* Adds a new kUnboxedUint8 representation (needed for
  TypeParameterLayout::flags_, which is of type uint8_t).

  * Extends LoadField to handle uint8_t unboxed native fields.

  * Adds BoxUint8 for boxing unboxed uint8_t values.

Code size impact on Flutter gallery in release mode:

* arm7: total +0.08%, vmisolate: +0.56%, isolate: +0.42%,
        readonly: -0.04%, instructions: -0.03%

* arm8: total +0.12%, vmisolate: +0.56%, isolate: +0.42%,
        readonly: -0.002%, instructions: +0.03%

Cq-Include-Trybots: luci.dart.try:vm-kernel-linux-debug-ia32-try,vm-kernel-linux-debug-x64-try,vm-kernel-nnbd-linux-debug-ia32-try,vm-kernel-nnbd-linux-debug-x64-try,vm-kernel-precomp-linux-debug-simarm_x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-nnbd-linux-debug-simarm_x64-try,vm-kernel-precomp-nnbd-linux-debug-x64-try,vm-kernel-linux-release-simarm-try,vm-kernel-linux-release-simarm64-try,vm-kernel-nnbd-linux-release-simarm-try,vm-kernel-nnbd-linux-release-simarm64-try,vm-kernel-precomp-linux-release-simarm-try,vm-kernel-precomp-linux-release-simarm64-try,vm-kernel-precomp-nnbd-linux-release-simarm64-try
Bug: dart-lang#40813
Change-Id: I5a7de27a17e3119e27752bd0d10e1c6bc1b52a16
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/158844
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Martin Kustermann <[email protected]>
Reviewed-by: Régis Crelier <[email protected]>
dart-bot pushed a commit that referenced this issue Nov 18, 2020
This reverts commit 42c76fd.

Reason for revert: Failures in flutter/google3. Initial hypothesis is related to product mode and instruction deduplication.

Original change's description:
> [vm/compiler] Move AssertAssignables out of closure bodies.
>
> This CL moves the final set of checks out of closure bodies and into
> dynamic closure call dispatchers.  It also adds stubs for checking top
> types and null assignability for types only known at runtime.
>
> Fixes #40813 .
>
> Changes in Flutter gallery in release mode:
>
> * arm7: -3.05% total, +0.99% vmisolate, -0.89% isolate,
>         -1.20% readonly, -4.43% instructions
> * arm8: -3.20% total, +0.99% vmisolate, -0.88% isolate,
>         -1.18% readonly, -5.05% instructions
>
> TEST=Run on trybots of all architectures, includes test adjustments where needed.
>
> Cq-Include-Trybots: luci.dart.try:vm-kernel-linux-debug-x64-try,vm-kernel-nnbd-linux-debug-x64-try,vm-kernel-linux-debug-ia32-try,vm-kernel-nnbd-linux-debug-ia32-try,vm-kernel-precomp-linux-debug-simarm_x64-try,vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-nnbd-linux-debug-simarm_x64-try,vm-kernel-precomp-nnbd-linux-debug-x64-try,vm-kernel-linux-release-simarm-try,vm-kernel-linux-release-simarm64-try,vm-kernel-nnbd-linux-release-simarm-try,vm-kernel-nnbd-linux-release-simarm64-try,vm-kernel-precomp-linux-release-simarm-try,vm-kernel-precomp-linux-release-simarm64-try,vm-kernel-precomp-nnbd-linux-release-simarm64-try
> Change-Id: Ifb136c64339be76a642ecbb4fda26b6ce8f871f9
> Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/166622
> Commit-Queue: Tess Strickland <[email protected]>
> Reviewed-by: Martin Kustermann <[email protected]>
> Reviewed-by: Régis Crelier <[email protected]>

[email protected],[email protected],[email protected]

Change-Id: Iaf79acddcf18fb3699a894950b31515d6f759349
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/172643
Commit-Queue: Tess Strickland <[email protected]>
Reviewed-by: Tess Strickland <[email protected]>
@sstrickl sstrickl reopened this Nov 18, 2020
@sstrickl
Copy link
Contributor

sstrickl commented Nov 18, 2020

Reverted due to errors in downstreams that need investigation. New type check failures that seem to suggest that arguments are being mischecked as adjacent arguments, which suggests either not taking into account type args vs. no type args (but tests were added specifically for this) or implicit receiver vs. no implicit receiver confusion.

Initial hypothesis: Running precompiled product mode (that is, bare instructions mode + instructions deduplication) is more likely to deduplicate closures now. Perhaps this led to an issue where static and instance (closure) functions that share the same instructions got deduplicated... especially for implicit closure functions, where the target might have gotten inlined into the ICF and thus removed one of the cases (entries in the static calls table) that would have avoided deduplication.

(The reason this might be much more likely for ICFs is that they are now literally just forwarding functions, since they have no checks to perform. It'd have to be a case where the ICF didn't have to reach into the context to get the receiver, which means a static function...)

@sstrickl
Copy link
Contributor

So if you change the check for the STC symbol in the TypeCheck runtime entry to true (so any Instance::IsAssignableTo check failure always goes through Function::DoArgumentTypesMatch), we hit the UNREACHABLE() afterwards, which means that Function::DoArgumentTypesMatch succeeded. At least that'll help pin down what's going on here.

@sstrickl
Copy link
Contributor

sstrickl commented Nov 18, 2020

Turns out my initial hypothesis was wrong, but I didn't follow it far anyway because I quickly made a test case that showed it was due to named argument permutations. In particular, the code generated for a dynamic closure call dispatcher assumed that the named argument positions returned by the saved arguments descriptor accurately described the order of the parameters of the invoke field dispatcher... but in fact, they didn't. Instead, named parameters for invoke field dispatchers were always created in sorted order.

While that's a perfectly good order if well documented, I think it makes more sense for them to be in the order suggested by the arguments descriptor that they were created from, so I've made that change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-performance Issue relates to performance or code size vm-aot-code-size Related to improvements in AOT code size
Projects
None yet
Development

No branches or pull requests

4 participants