Skip to content

Torn off methods with covariant parameters need to have parameter types replaced with Object #31305

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
stereotype441 opened this issue Nov 7, 2017 · 27 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. front-end-kernel legacy-area-front-end Legacy: Use area-dart-model instead.

Comments

@stereotype441
Copy link
Member

Consider the following code:

abstract class I<T> {
  void h(T t);
}
class C implements I<num> {
  void f(num i) {}
  void g(covariant num i) {}
  void h(num i) {}
}
main() {
  C c = new C();
  print(c.f is void Function(Object)); // Should print `false` in strong mode
  print(c.g is void Function(Object)); // Should print `true` in strong mode
  print(c.h is void Function(Object)); // Should print `true` in strong mode
}

The reason c.g is void Function(Object) should be true is because g's parameter is declared with the covariant keyword, and tear-offs of methods with covariant parameters need to have their parameter types replaced with Object.

The reason c.h is void Function(Object) should be true is because h's parameter is covariant due to its use of a generic parameter in the interface class; again the torn off parameter type should be replaced with Object.

When this program is run using kernel, it prints:

false
false
false

The VM should look at the kernel flags VariableDeclaration.isCovariant and VariableDeclaration.isGenericCovariantImpl to see which parameters should have their types changed to Object.

There should be a similar behavior associated with TypeParameter.isGenericCovariantImpl. I'll file a separate issue for that because there are further problems with it.

I'll submit language_2 tests illustrating this problem as soon as I can.

@stereotype441 stereotype441 added area-kernel area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. labels Nov 7, 2017
@a-siva a-siva added this to the 2.0-alpha milestone Nov 7, 2017
whesse pushed a commit that referenced this issue Nov 14, 2017
The tear-off of _NativeSocket.multiplex needs to have runtime type
`(Object) -> void` in order to be passed to the RawReceivePort
constructor.  Once issue #31305 is fixed, we should be able to fix
this by marking _NativeSocket.multiplex's argument as "covariant".
Until then, we have to type the argument as `Object` and then assign
it.

Change-Id: I1c9b7fb77dd3b0a71037459206f8de30ad77f73e
Reviewed-on: https://dart-review.googlesource.com/20822
Reviewed-by: Sigmund Cherem <[email protected]>
Commit-Queue: Paul Berry <[email protected]>
@mraleph mraleph assigned crelier and unassigned mraleph Nov 15, 2017
@mraleph
Copy link
Member

mraleph commented Nov 15, 2017

Regis, would you have chance to work on this?

@crelier
Copy link
Contributor

crelier commented Nov 15, 2017

Slava, sure, I can work on this, but probably not this week. I am still editing kernel status files (a nightmare) after inserting argument type checks in strong mode and I hope to be done today or Friday (I am OOO tomorrow). I should be able to get to it early next week.

@crelier
Copy link
Contributor

crelier commented Nov 15, 2017

I am just looking briefly at this issue.

I do not understand why c.h should have a covariant parameter, since T of I can only be num, seen from C. But I have not followed all discussions about covariant.

In strong mode (which does not make much sense in the VM without type inference), the VM parser is already changing the parameter type to Object, since we do not keep the covariant information in the ast. So the VM in strong mode prints:
false
true
false

I guess this is beside the point and this issue is really about kernel providing a modified function signature to the runtime, so that the runtime test uses Object instead of num. Am I correct?

I do not mind having a look, but so far, I have not worked on that code, except for the portion passing type arguments to generic functions. I'll try next week.

@stereotype441
Copy link
Member Author

@crelier, to answer your question about why c.h should have a covariant parameter, the reason is because it may be called via I in a way that doesn't guarantee that the parameter is num, e.g.:

main() {
  I<Object> i = new C(); // Ok because C implements I<num>, which is a subtype of I<Object>
  i.h('oops'); // Legal at compile time because I<Object>.h has type (Object) -> void
}

The front end already sets VariableDeclaration.isGenericCovariantImpl to indicate to the VM that the type check is needed, and @sjindel-google is working on adding the logic to the VM to insert the type check when necessary.

As to your other question, "this issue is really about kernel providing a modified function signature to the runtime, so that the runtime test uses Object instead of num", unfortunately we can't fix the bug that way, because the kernel representation needs to retain the type that was explicitly provided by the user so that it can generate the appropriate compile-time errors, e.g.:

main() {
  C c = new C();
  c.h('oops'); // Compile-time error; C.h requires a num
}

If the kernel representation were changed so that C.h had type (Object) -> void, then we would fail to issue the correct compile-time error in modular compilation scenarios (because modular compilation doesn't necessarily have access to the source file for class C when compiling main; it has to be able to rely on the types in the kernel representation).

To put it another way, C.h needs to have type (num) -> void in the kernel representation, for compile time checks, but the runtime needs to reify tear-offs of C.h with type (Object) -> void.

This means that the VM has to be responsible for replacing the types of covariant parameters with Object when creating the type for the tear-off.

@crelier
Copy link
Contributor

crelier commented Nov 16, 2017

@stereotype441, thanks for the explanations. However, the VM has never been in the business of static checks (analysis, inference, ....), so I still do not understand why print(c.h is void Function(Object)) should print true. This is a runtime check, not a compile time one. It should behave in the same way as invoking c.h('oops') or i.h('oops'). Both should result in a runtime error.

As to my other question, I should have been clearer. In the VM, a tear-off gets a signature type, which may be distinct from the original function signature type. In this case, we would not modify the type of the original function, but only that of the tear-off.

@stereotype441
Copy link
Member Author

stereotype441 commented Nov 16, 2017

thanks for the explanations. However, the VM has never been in the business of static checks (analysis, inference, ....), so I still do not understand why print(c.h is void Function(Object)) should print true. This is a runtime check, not a compile time one. It should behave in the same way as invoking c.h('oops') or i.h('oops'). Both should result in a runtime error.

Agreed, the VM should not be in the business of static checks. And I mostly agree that c.h('oops') and i.h('oops') should both result in a runtime error (though I might quibble and say that since c.h('oops') is a compile time error, technically the runtime behavior of c.h('oops') is unspecified). However, my understanding from talking to @leafpetersen is that both c.h as void Function(Object) and i.h as void Function(Object) should succeed. Tear-offs are special in this regard--the type check that's needed to ensure soundness does not happen until the tear-off is invoked.

The informal spec at https://github.com/dart-lang/sdk/blob/master/docs/language/informal/covariant-from-class.md seems to back me up on this. It says "In the remainder of this section, a parameter which is covariant according to the definition given in covariant overrides is treated the same as a parameter which is covariant due to class covariance as defined in this document; in both cases we just refer to the parameter as a covariant parameter.", and then later "For each covariant parameter q, the part in the dynamic type of f which corresponds to q is Object." Which says to me that the runtime type of the tear-off of C.h should be void Function(Object), so print(c.h is void Function(Object)) should print true.

However, the informal spec is not entirely consistent. It also has this example:

// Here is the small part of the core List class that we need here.
abstract class List<E> ... {
  // The reified type is `(E) -> void` in all modes, as declared.
  void add(E value);
  // The reified type is `(Iterable<E>) -> void` in all modes, as declared.
  void addAll(Iterable<E> iterable);
  ...
}

typedef void F(num n);
typedef void G(Iterable<num> n);

main() {
  List<num> xs = <int>[1, 2];
  F myF = xs.add;    // Statically safe, yet fails at run time
                     // in strong mode and Dart 2.
  G myG = xs.addAll; // Same as above.
}

I believe this is incorrect; since xs.add implements the generic interface List<E>, its value parameter is covariant due to class covariance, so the runtime type of the tear-off xs.add should be (Object) -> void, therefore the assignment to F myF should succeed. Similarly, the runtime type of the tear-off xs.addAll should be (Object) -> void, so the assignment to G myG should succeed. IMHO the example should be changed to say:

main() {
  List<num> xs = <int>[1, 2];
  F myF = xs.add;  // Safe
  G myG = xs.addAll;  // Safe
  myF(1.5); // Fails at runtime in strong mode and Dart 2 (1.5 is not an int)
  myG(<num>[]); // Fails at runtime in strong mode and Dart 2 (<num>[] is not an Iterable<int>)
}

@leafpetersen can you confirm that this is correct?

As to my other question, I should have been clearer. In the VM, a tear-off gets a signature type, which may be distinct from the original function signature type. In this case, we would not modify the type of the original function, but only that of the tear-off.

Sounds good, thanks!

@lrhn
Copy link
Member

lrhn commented Nov 16, 2017

I can confirm that.

Basically, a covariant instance method argument is treated as the written type statically and as Object dynamically. That is:

class C {
  void foo(covariant num x) {}
}
main() {
  var c = new C();
  var f = c.foo;  // Static type of c.foo is void Function(num), that is the type inferred for f.
  void Function(Object) actual = f;  // This succeeds, because at runtime it's actually `Object` + cast.
}

Or in other words, trust the source statically, but at runtime, the foo method is rewritten into:

  void foo(Object x) { x as num; }  // which should also promote x to num in the body if there was one.

Being covariant is inherited by overriding methods in subclasses.
Method parameters in generic classes that use a type parameter covariantly (that is, contravariant wrt. the class), are also automatically covariant.
(In fact, the covariant feature is really just giving users access to the unsafe overriding that generics already have, without having to introduce an otherwise unnecessary type variable to get it).

So, if the VM wants to implement covariant correctly, without the static typing part, it should just rewrite the method foo(covariant X x) { body; } into foo(Object x) { x as X; body; }, and maybe do the same for methods like List.add where the argument is implicitly covariant.

@eernstg
Copy link
Member

eernstg commented Nov 16, 2017

However, the informal spec is not entirely consistent. It also has this example: ...

The motivation section of that informal spec contains four snippets of code, and the one you mention is the fourth one.

The 2nd one (right after 'Here is an example why it would not work to reify the declared parameter type directly') illustrates why we would not have expression soundness for tear-offs with the existing (reify-as-declared) rules.

The 3rd one (after 'this is how it works') shows how the reified type of tear-offs are determined using the new (reify-to-Object) rules when we have an explicitly covariant parameter, and shows by example that the soundness problem is now gone.

The 4th one corresponds to the 2nd one, that is, it illustrates why the old rules must be changed also for the situation where the parameter is covariant due to class covariance.

So it's not inconsistent, but for clarity I'll add a 5th snippet corresponding to the 3rd one, and add a loud comment in the 2nd and 4th snippet saying "this goes by the old rules and illustrates why they won't work". ;-)

CL is here.

@stereotype441
Copy link
Member Author

Thanks for the clarification @eernstg!

@leafpetersen
Copy link
Member

Did all the open questions get resolved here?

@crelier
Copy link
Contributor

crelier commented Nov 21, 2017

I was just reading all the comments on this issue, as well as the referred informal spec. Yes, it makes more sense to me now. Thanks for the explanations.

What is still unclear is the exact definition of a "covariant position", when considering the type of a formal parameter x of a method declared or inherited by a generic class. Only a couple of examples are given in the informal spec without much explanation. I understand the example of a formal declared as List<X> x, where X is in a covariant position. I also understand the given counter example int x(X arg), where X is in a contravariant position.

I guess that in X x(int arg), X would be in a covariant position, because the result type is covariant, correct?

What if X shows up in both a covariant position and in a contravariant position as in X x(X arg) ?
I suppose the contravariant position wins and you should not substitute with Object, correct?

My understanding is the following: when you consider whether a formal type F that depends on the class type parameter X, let's note it F(X), is in a covariant position or not, you should consider the type F(Y), where Y is a subtype of X. If F(Y) is a subtype of F(X), then F is in a covariant position. Do I understand this correctly?

Now what if you add function type parameters to the mix? Map<X, E> Function<E>(E e) x
I would say the function type parameter E can be ignored, and x is in a covariant position. Correct?

@stereotype441
Copy link
Member Author

What if X shows up in both a covariant position and in a contravariant position as in X x(X arg) ?
I suppose the contravariant position wins and you should not substitute with Object, correct?

No, I think it's the opposite. Consider:

class C<X> {
  void foo(X x(X arg)) {
    assert(x is X Function(X)); // soundness should guarantee this
  }
}
Object f(Object value) => ...;
main() {
  C<Object> c = new C<int>(); // permitted because C<int> <: C<Object>
  c.foo(f); // permitted because C<Object>.foo has type ((Object) -> Object) -> void
}

The above code has no static errors, yet at runtime it causes f (which has type (Object) -> Object) to be passed to C<int>.foo. But C<int>.foo expects its argument to have type (int) -> int, which (Object) -> Object does not satisfy. So a runtime check is required. When we say that a parameter is "covariant due to class covariance" we mean that the parameter requires a runtime check due to a situation like the one above. So to borrow your terminology, covariance wins--if the type variable X appears in any covariant position within the type of the parameter, then the parameter is considered covariant due to class covariance, regardless of whether or not the type variable also appears in a contravariant position somewhere else within the type of the parameter.

Note that it is not necessary to put any special logic in the VM to figure out whether the type variable appears in a covariant position. The front end does this for you. Simply look at the booleans VariableDeclaration.isCovariant and VariableDeclaration.isGenericCovariantImpl.

@crelier
Copy link
Contributor

crelier commented Nov 21, 2017

Since X appears in both a covariant and contravariant position, the only valid type value for X when assigned to foo of C<Y> will be Y, i.e. in your example int, and not a subtype of Y or a supertype of Y. That is the reason I was claiming that contravariance should win and x should not be considered as covariant. Changing the type to Object does not give you anything. On the contrary, it only delays compile-time errors to run time. But maybe I miss something...

In any case, this confirms that the meaning of "covariant position" should be better specified.

@leafpetersen
Copy link
Member

I guess that in X x(int arg), X would be in a covariant position, because the result type is covariant, correct?

correct.

What if X shows up in both a covariant position and in a contravariant position as in X x(X arg) ?
I suppose the contravariant position wins and you should not substitute with Object, correct?

No, unfortunately not. If a type variable appears covariantly in a parameter type anywhere, then the reified type of that parameter is Object. The reason is that we really want to maintain the property that the runtime type of an object is always a subtype of its static type. This is kind of a basic sanity property. So using the example from above:

class C<X> {
  void foo(X x(X arg)) {}
}

typedef TakesInt2Int = void Function(int Function(int));
typedef TakesObject2Object = void Function(Object Function(Object));
typedef TakesObject = void Function(Object);

main() {
  C<Object> c = new C<int>(); // permitted because C<int> <: C<Object>
  TakesObject2Object f = c.foo;  // Ok, statically and dynamically
  print(f is TakesInt2Int); // prints true 
  print(f is TakesObject2Object); // Should this print true or false?
}

The actual tearoff line is fine statically, no matter what, and there is no implied runtime check there so it's fine dynamically.

The first statement will print true, whether we make the reified type be TakesObject or TakesInt2Int.

But what should the second print statement print out? If we make the reified type of c.foo be TakesInt2Int then it will print false, because TakesInt2Int is not a subtype of TakesObjectToObject. This is not really problematic for the type system (I don't think) because the function is safe to call with anything (in the sense that the generated code will always check the argument), but it's really quite surprising. I have an object of some static type T, but if I ask if it is T, then it says false. This just feels wrong to me.

We could put a read check on tearoff sites and fail the check there, but that doesn't seem useful.

We could try to be more granular and use a type like TakesInt2Object for the reified type (that is, only rewrite the covariant uses of the type variable to Object, instead of the whole type). There's some discussion of this option in the informal proposal. It's not as simple as it sounds, though perhaps now that we're allowing super-bounded types it would get a little easier? But I think you may still have to deal with least upper bounds?

Agreed that we should specify covariant positions if we haven't anywhere (I think @eernstg did write it up in one of the informal proposals?). I believe that to be precise, we actually care about "not contravariantly", which means "covariantly" or "invariantly".

Inductively, a type variable T occurs covariantly in a type S if:

  • S is T
  • S is a parameterized interface type P<S0, ..., Sn>, and T occurs covariantly in any of the Si
  • S is a function type and T occurs covariantly in the return type or contravariantly in any of the parameter types.

Inductively, a type variable T occurs contravariantly in a type S if:

  • S is a parameterized interface type P<S0, ..., Sn>, and T occurs contravariantly in any of the Si
  • S is a function type and T occurs contravariantly in the return type or covariantly in any of the parameter types.

Inductively, a type variable T occurs invariantly in a type S if:

  • S is a parameterized interface type P<S0, ..., Sn>, and T occurs invariantly in any of the Si
  • S is a function type and T occurs invariantly in the return type or invariantly in any of the parameter types, or T occurs anywhere in the bounds of the type parameters of S.

@eernstg
Copy link
Member

eernstg commented Nov 22, 2017

@leafpetersen, I actually haven't defined variance positions anywhere in our informal specs (and it isn't in dartLangSpec.tex because it wasn't needed before now). I agree on your definition, except that I feel a need to think a little bit more about invariance.

It seems likely to me that the lack of a subtyping relation between generic function types with non-identical type parameter bounds ensures that we won't have to mention invariance arising from these bounds. Basically, that's just a non-issue here.

An aside: We may need to reconsider declarations like

class C<X> {
  Function<Y extends X>(Y) f;
}

because instances typable as C<T> for some given T would have dynamic type C<S> for some S which is a subtype of T, and the field f of such an instance would have type Function<Y extends S>(Y) which is simply unrelated to the naive statically known type Function<Y extends T>(Y). So we would need to give that field the much less informative type Function in order to remain sound, forgetting both that this is a generic function and that it takes one positional argument of a certain type.

As always, we can also just decide that it must have a specific type and throw if it doesn't (say, we could require the type Function<Y extends T>(Y)); but this essentially means that it is a soundness violation to let a variable of type C<T> refer to an instance of type C<S> whenever T is not equal to S, and it would surely be surprising if we were to enforce that, because we have a declaration somewhere (maybe in a subclass of the statically known class) where a formal type parameter is used as a bound in the signature of a generic function. An easy way out is of course to say "you cannot use a type variable from an enclosing class as or in a bound for a type variable of a generic function type!"

Back on track:

We could define that a type variable occurs invariantly in a type if it occurs both covariantly and contravariantly, just like many other languages do.

However, I suspect that this is not necessary: If a type parameter occurs covariantly in an instance method parameter type and it also occurs contravariantly in the same type, we can make the choice to protect that parameter by generating checks for the declared type in the body of the method. This will trivially ensure soundness (no matter what is required it will be enforced). Conversely, if a type parameter occurs only contravariantly then the top level type (that is, the actual type annotation for the given parameter) will be a supertype of the type that we see in the statically known type of the receiver, and in this case an invocation will be just as safe as it would have been with no variance in the type at all.

Hence, I believe that we can rely on the covariance check alone.

Let's reconsider the example that you mentioned, @stereotype441:

class C<X> {
  void foo(X x(X arg)) {
    assert(x is X Function(X)); // soundness should guarantee this
  }
}
Object f(Object value) => ...;
main() {
  C<Object> c = new C<int>(); // permitted because C<int> <: C<Object>
  c.foo(f); // permitted because C<Object>.foo has type ((Object) -> Object) -> void
}

I agree on 'soundness should guarantee this', I would consider that property to be part of the wider property known as heap soundness (even though x is on the stack rather than in the heap ;-).

The ability to initialize c as shown is also a given.

For c.foo(f), we are dealing with one of those "contravariant types" that I've pointed out many times (or at least, types that threaten expression soundness). We could soundly consider the statically known type of the parameter to be exists X <: Object. X Function(X). That type could be soundly approximated by Null Function(Object), in the sense that "as long as we pass a Null Function(Object), the argument is guaranteed to satisfy the actual type constraint".

That's not a practical type. However, the response could be "just stop mixing covariance and contravariance in your type declarations!" The limited usefulness of these types might nudge developers in the direction of keeping structural and nominal types more separate.

Another possible development could be to support invariance (preferably use-site, I'd say: List<exactly num> nums = ...;), such that developers can make the choice to take away covariance in a certain subset of their software, in order to avoid the loss of information that arises when contravariance is added to the mix.

We could return to the "claim a type and enforce it" approach (that I don't like so much, but which may be tempting because it is simple, presumably because it relies so much on the "naive" static analysis where we just compute derived types as if there were no variance). With such an approach, the parameter type of invocations of the form c.foo(...) will be Object Function(Object), and we are (statically) allowed to pass f as the actual argument. This will fail at run time, of course, because the actual requirement is int Function(int), which will be enforced by the dynamic check at the beginning of the body of foo.

With this approach, the type checker would tell us that an int Function(int) won't do (even though that's exactly what is needed at run time), and it will tell us that an Object Function(Object) is perfectly fine (which will fail at run time). With the int Function(int) we could have a failure at run time, too, if we insist on checking according to the statically known type (where I always argue that we should check according to the actual type requirement, not the overly-aggressive statically known requirement, but if we rely on generating a type check at the beginning of the body of foo then we will surely check the actual requirement). All in all, typing this parameter as Object Function(Object) isn't hugely helpful...

(... the relevant comparison would be Null Function(Object) with a dynamic check for the actual requirement, which would inform developers that both a return type greater than Null and a parameter type less than Object would give rise to a dynamic check, so an int Function(int) would be flagged as needing the dynamic check for two separate reasons, but it would succeed at run time).

Finally, if we tear off c.foo as in var g = c.foo, the argument type part of the reified type of the closure will be Object, so we can even assign g like Function(Object) h = g;. But that doesn't bring up any new topics, because it eliminates the "useless type"-ish property of the parameter x.

@crelier
Copy link
Contributor

crelier commented Nov 22, 2017

But what should the second print statement print out? If we make the reified type of c.foo be TakesInt2Int then it will print false, because TakesInt2Int is not a subtype of TakesObjectToObject. This is not really problematic for the type system (I don't think) because the function is safe to call with anything (in the sense that the generated code will always check the argument), but it's really quite surprising. I have an object of some static type T, but if I ask if it is T, then it says false. This just feels wrong to me.

Agreed. However, you should get a dynamic error even before getting to that second print statement. The assignment TakesObject2Object f = c.foo; would need to be checked and should fail dynamically, if we make the reified type of c.foo be TakesInt2Int. Maybe that is a check that you do not want to insert, therefore we have to reify c.foo to TakesObject.

@leafpetersen
Copy link
Member

An aside: We may need to reconsider declarations like

 class C<X> {
   Function<Y extends X>(Y) f;
 }

I think this is handled by the existing covariance checks. This code throws on the read from c.f in DDC:

class C<X> {
  Function<Y extends X>(Y) f;
 }

void main() {
  C<int> ci = new C<int>();
  ci.f = <Y extends int>(Y y) => y.isEven;
  Function<Y extends int>(Y) f = ci.f;
  C<Object> c = ci;
  Function<Y extends Object>(Y) g = c.f;
}

Am I missing your concern? It's certainly true that if you use covariant type variables in the bounds of generic functions in the signatures of methods, then those methods are going to be very rigid: you can't tear them off at a super type, etc. because there is no subtyping on bounds. But I think (hope?) there's no soundness issue that I'm missing.

We could define that a type variable occurs invariantly in a type if it occurs both covariantly and contravariantly, just like many other languages do.

Sorry, this isn't really the sense of invariant that what I was trying to get at. Bounds on generic functions are invariant. That is, for Function<T extends S>(...) to be a subtype of Function<T extends U>(...), S == U must be true. So if S is a type variable, then it is in an invariant position. So all I was trying to say is that in the following code:

class C<X> {
  void f(Function<Y extends X>(Y) x) {}
 }

there is an implied covariance check on x, and the reified type of f should be Object -> void. To see why this is the case, consider:

void test() {
  C<Object> c = new C<int>();
  void Function(Function<Y>(Y)) f = c.f;
}

Again, if we want the property that the runtime type of an object is a subtype of its static type, then we want the runtime type of c.f to be Object -> void, since the only supertypes of Function<Y>(Y) are Function and Object.

Does that make sense, or am I missing something again?

@leafpetersen
Copy link
Member

Agreed. However, you should get a dynamic error even before getting to that second print statement. The assignment TakesObject2Object f = c.foo; would need to be checked and should fail dynamically, if we make the reified type of c.foo be TakesInt2Int. Maybe that is a check that you do not want to insert, therefore we have to reify c.foo to TakesObject.

This has been a difficult line to walk, and I'm not entirely sure what the right answer is. In our existing approach, I don't think it's reasonable to do what you propose, because it makes it pretty much impossible to work with these sorts of functions, even in a safe way. If we relax our approach slightly as I suggest here: #31391 , then it would get a bit better, since you could at least tear things off safely without casting to dynamic first. In that case it might be possible to take the approach you propose?

@crelier
Copy link
Contributor

crelier commented Nov 22, 2017

I have not been involved in these discussions and I do not know what the "existing approach" is. I am not proposing any change, but just trying to understand what I am asked to implement in this issue.

@leafpetersen
Copy link
Member

leafpetersen commented Nov 22, 2017

I have not been involved in these discussions and I do not know what the "existing approach" is. I am not proposing any change, but just trying to understand what I am asked to implement in this issue.

Yes, fair enough. Just to be sure, I think that all of the information that you need to do the actual implementation is in kernel? If not, we should definitely fix that. I think though that your question though is more around understanding the overall structure of the covariance checks: what they are, and why. Is that right? Assuming so... here's some comments that can maybe help?

Roughly speaking, the covariance checks fall into two categories: parameter checks (or write checks), which are checks on the actual arguments provided to methods for which covariant subtyping of generics allows the static type discipline to be subverted; and read checks which are checks on the result of reading a property out of an object where again covariant subtyping allows the typing discipline to be subverted.

Focusing on the last one, what I was calling "the existing approach" is to require a caller side type check on the result of reading a field (calling a getter) when the return type is "unsafe", and specifically, to do that check relative to the interface type through which the field is read. That is, the check is induced simply by reading the field, regardless of the type at which the result is intended to be used. Example of existing approach:

class C<X> {
   X Function(X) f;
}
void test(C<Object> c) {
  c.f;  // Implied cast to Object -> Object, will fail
  c.f("hello");   // Implied cast to Object -> Object, will fail
  c.f == null;   // Implied cast to Object -> Object, will fail
  (c.f as dynamic) == null;   // Implied cast to Object -> Object, will fail
  c.f(3); // Implied cast to Object -> Object, will fail
  (c.f as int Function(int))(3); // Implied cast to Object -> Object, will fail
  (c as dynamic).f(3); // ok
}

void main() => test(new C<int>()..f = (int x) => x);

This is quite restrictive, as you can see. With tear-offs, we can be less restrictive because we adjust the reified type, and I think this is valuable (again, given the current strategy).

The proposal in #31391 would ease this somewhat by allowing many of the reads of c.f in test above to succeed. This might make treating tearoffs as simple properties reads more palatable.

I'd still be somewhat concerned though. It's hard to say what Dart 2.0 code will look like going forward, but certainly in porting Dart 1.0 code to strong mode there were patterns where people covariantly specialize class hierarchies in ways that are safe, but seem like they might break if we were more eager with these checks.

@crelier
Copy link
Contributor

crelier commented Nov 22, 2017

I think that all of the information that you need to do the actual implementation is in kernel?

Yes, flags are provided by kernel.

I think though that your question though is more around understanding the overall structure of the covariance checks: what they are, and why. Is that right?

Exactly right. It is easy to read those kernel flags, but understanding their exact meaning is what I am after.

I appreciate your lengthy replies to my short questions (thanks to @eernstg and @stereotype441 as well).
Collecting them in a different document than this bug report may benefit others who may have similar questions.

@leafpetersen
Copy link
Member

I appreciate your lengthy replies to my short questions

No problem, feedback and questions are always welcome and useful. It's valuable to walk through the reasoning and examples again, since assumptions change and other eyes see things we didn't.

Collecting them in a different document than this bug report may benefit others who may have similar questions.

Agreed.

@eernstg
Copy link
Member

eernstg commented Nov 23, 2017

An aside: We may need to reconsider declarations like

 class C<X> {
   Function<Y extends X>(Y) f;
 }

I think this is handled by the existing covariance checks. This code throws on the read from
c.f in DDC:

class C<X> {
  Function<Y extends X>(Y) f;
}

void main() {
  C<int> ci = new C<int>();
  ci.f = <Y extends int>(Y y) => y.isEven;
  Function<Y extends int>(Y) f = ci.f;
  C<Object> c = ci;
  Function<Y extends Object>(Y) g = c.f;
}

Am I missing your concern? It's certainly true that if you use covariant type variables in the bounds of > generic functions in the signatures of methods, then those methods are going to be very rigid:
you can't tear them off at a super type, etc. because there is no subtyping on bounds. But I think
(hope?) there's no soundness issue that I'm missing.

The point I'm making is that we have a special typing situation when a type variable X from a class is used as or in a bound for a function type F in the body of that class, because the set of types yielded by the possible values of X form a set of types with no subtyping relationship to each other, they're just a bunch of unrelated type whose least upper bound (for once, it exists ;-) is a plain Function.

For comparison, we often have a situation where a type variable X from a class is used as or in a type argument in a generic class type. As long as we stay in the universe of generic class types we have covariance everywhere, and this means that the computed types that I usually call "naive" will work nicely:

class C<X> {
  List<X> xs;
}

For an expression e of static type C<T> for some T, we know (by expression soundness) that the result of evaluating e in some state will be an instance of type C<S> for some S such that S <: T. Statically, that instance is correctly and optimally typed as C<T>, because of class covariance: That is always a correct typing, and there is no typing which is correct and at the same time more informative.

If we evaluate e.xs then we know that it is an instance of type List<S> for some S such that S <: T, and that is also correctly and optimally statically typed as List<T>. The motto could be "just ignore variance, and the right type comes out".

If we switch variance and consider the kind of expression that I've called "contravariant expressions" we get a different situation:

class D<X> {
  void Function(X) f;
}

Assume that e has static type D<T> and dynamic type E<S> for some S <: T, and consider evaluation of e.f. That evaluation will yield a value of type void Function(S) for some S such that S <: T, which is not a subtype of void Function(T). So this is a direct violation of expression safety, unless we "do something".

The current approach would be to say "You must have type void Function(T) and we'll throw if that isn't true!". What I'm advocating is to assign a sound type to this expression, rather than just requiring that the outcome must have a certain type for which there is no dynamic justification: We might get a value whose type is not void Function(T) even though the entity providing that value satisfies its own constraints perfectly well --- in this case, the function obtained from e.f would actually have the type void Function(S), and it's simply unjustified to require that it must have type void Function(T) because the only relevant constraint is that f in an instance of exactly C<S> must have type void Function(S).

As I've mentioned earlier, the most relevant static typings for f from an instance statically typed as C<T> are void Function(Null), no matter which T we have, or \exists S <: T. void Function(S).

So we can express sound types for "contravariant expressions". The motto, however, would now be "detect contravariant expressions, and maintain the variance using an existential type, or use the approximation where contravariantly placed type variables are replaced by bottom".

The situation where the type variable is used as a bound for a type parameter of a function type is one more step crazy:

class E<X> {}
class F<X> implements E<X> {
   Function<Y extends X>(Y) f;
 }

Assume that e has static type E<T> and dynamic type E<S> for some S <: T, and consider e.f. The obtained value has dynamic type Function<Y extends S>(Y), and this is a new situation because it is unrelated to the naive static type Function<Y extends T>(Y), rather than related in one or the other direction by subtyping. This means that a sound typing will be very information-lossy, we can't choose anything better than Function.

The motto is now "detect expressions whose type does not satisfy any variance constraints, and express the effects of variance on type variables therein using an existential type; alternatively, choose a sound approximation, i.e., for covariantly placed function types where a bound varies at all (any direction) use Function, for contravariantly placed ones, use Null; finally, also apply the rules for contravariant expressions".

(PS: "does not satisfy any variance constraints" means that for some type operator F it does not hold that X1 <: X2 => F<X1> <: F<X2> and it does not hold that X1 <: X2 => F<X2> <: F<X1>, so F isn't covariant and it also isn't contravariant.)

Also, the strategy where we just require the value of e.f to have the naive type (and throw if it doesn't have that type) is lossy in another sense: It is guaranteed to fail unless we have that very special situation where S == T. In other words, we've allowed developers to write code that will require invariance for some class types (that is, the class where f is declared, plus all subtypes, plus probably supertypes where that type argument appears, like E, because you would otherwise be able to obtain a reference to an F<S> typed as F<T> by constructs like if (myE is F<T>) .. or by an implicit downcast, etc).

The point is that it is really, really tricky for developers to maintain a discipline of invariance for any given set of types, in a world where covariance is pervasive. So we shouldn't have a type system that essentially requires developers to maintain such a discipline.

We could define that a type variable occurs invariantly in a type if it occurs both covariantly and
contravariantly, just like many other languages do.

Sorry, this isn't really the sense of invariant that what I was trying to get at. Bounds on
generic functions are invariant. That is, for Function<T extends S>(...) to be a subtype of
Function<T extends U>(...), S == U must be true. So if S is a type variable, then it is in an
invariant position.

Sure, the main point of my 'aside' paragraph was that variance doesn't exist for a type expression where a class type variable X is used as a bound in a function type: We cannot soundly consider such an expression to have a type which covaries with X (in which case the naive type is fine), nor that it contravaries with X (in which case a sound type would be explicitly existential, or we could replace X in contravariant locations by bottom). We could still use an existential type, but there is no reasonable approximation which isn't existential (those function types would always degenerate to Function).

So all I was trying to say is that in the following code:

class C<X> {
  void f(Function<Y extends X>(Y) x) {}
 }

there is an implied covariance check on x, and the reified type of f should be Object -> void.

About static checks on invocations of f:

The actual type requirement on the argument expression a in c.f(a) where c has static type C<T> is \exists S <: T. Function<Y extends S>(Y), which may be safely (but not so informatively) approximated as Function, so we could check statically that the given argument is a Function and leave the rest to the dynamic check (in the body of f, which will of course check the actual requirement Function<Y extends S>(Y) for an instance with dynamic type C<S>). I suppose that dynamic check is what you call the 'covariance check on x'.

I don't think we have any disagreement at all on these dynamic checks in the bodies of methods: They will always check for the actual type required for the given parameter, never any static approximations thereof. So I'm fine with that.

About tearing off f (I guess you're referring to a torn off function when you mention the reified type):

Assume that c has statically known type C<T> and dynamic type C<S> for some S <: T.
The best static type of c.f would be \exists S <: T. void Function(Function<Y extends S>(Y)),
and without existentials that could soundly be approximated by void Function(Function).

We have the notion of covariant parameters that comes to mind, but the parameter x isn't
covariant! It is simply untrue that a more special value for X yields a more special type annotation
for x (that is, we do not have
X2 <: X1 => Function<Y extends X2>(Y) <: Function<Y extends X1>(Y)).
It is true, however, that the type annotation is not a supertype, so we don't have the traditional
subtype requirement on parameter type annotations, so we need a dynamic check. This means
that "covariant parameters" might as well have been called "anything-not-contravariant parameters",
so this case seems to fit perfectly for the already given treatment of covariant parameters.

So we tear off the function, and we note that the type annotation on x isn't contravariant, so we must reify the argument type as Object. This yields the complete type void Function(Object) for the tear-off, as you mentioned. Agreed on that, too! ;-)

Sticking with the already-specified treatment of covariant parameters we could naively assign the
static type void Function(Function<Y extends T>(Y)) to the tear-off expression c.f, but that static type would of course be subject to the same arguments as the ones I gave for the instance method: The not-so-naive type would be \exists S <: T. void Function(Function<Y extends S>(Y)), which could
be approximated by a non-existential type void Function(Function).

To see why this is the case, consider:

void test() {
  C<Object> c = new C<int>();
  void Function(Function<Y>(Y)) f = c.f;
}

Again, if we want the property that the runtime type of an object is a subtype of its static type,
then we want the runtime type of c.f to be Object -> void, since the only supertypes of
Function<Y>(Y) are Function and Object.

Right, but we can do that by assigning justified types to the expressions in the first place, or we can assign types that embody unjustified requirements on the run-time semantics and then simply throw when those requirements aren't satisfied. I'm arguing that we should do the former.

@eernstg
Copy link
Member

eernstg commented Nov 23, 2017

I've added a definition of variance positions in covariant-from-class.md, using Leaf's definitions including invariance (and mentioning that it differs from invariance as it is traditionally defined), and adjusted the notion of being a covariant parameter such that it takes the function-type-formal-type-parameter-bound case into account. Here's the CL.

Note that these changes are orthogonal to the discussion about whether we should choose a sound static type for expressions that are contravariant or invariant or both, or we should insist on giving them the naive type statically, and throw when they don't have that.

(And I still think that we should find a better term than 'invariant' to describe the property that a type operator F is such that F<X1> and F<X2> are unrelated types, even when we know that X1 <: X2 or X2 <: X1. It's more like "varying wildly" than "invariant" ;-).

@eernstg
Copy link
Member

eernstg commented Nov 27, 2017

Above-mentioned CL was landed as 062e5d6.

@whesse whesse closed this as completed in 08b2294 Dec 1, 2017
@leafpetersen
Copy link
Member

@eernstg I'm not sure if I addressed all of your concerns/points from above or not - if not maybe we can revisit when I'm in the office there next week?

@eernstg
Copy link
Member

eernstg commented Dec 5, 2017

Sure!

@kmillikin kmillikin added legacy-area-front-end Legacy: Use area-dart-model instead. front-end-kernel and removed legacy-area-front-end Legacy: Use area-dart-model instead. labels Sep 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. front-end-kernel legacy-area-front-end Legacy: Use area-dart-model instead.
Projects
None yet
Development

No branches or pull requests

8 participants