Skip to content

Fix GADT approximation #9322

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 14, 2020
Merged

Conversation

abgruszecki
Copy link
Contributor

@abgruszecki abgruszecki commented Jul 8, 2020

Should fix #9044, fix #9158, fix #9274.

@abgruszecki
Copy link
Contributor Author

There's one remaining problem - I needed to bring back the alternate definition of recover in adaptToSubType, the one that tries at the absolute end to adapt using GADT approximation (https://github.com/lampepfl/dotty/pull/9322/files#diff-7c63b7bfffa9340fb48ac9b61fa56beeR3461). This turns out to be necessary in only one hyper-concrete example, in the following code:

object i7044 {
  case class Seg[T](pat:Pat[T], body:T)

  trait Pat[T]
  object Pat {
    case class Expr()            extends Pat[Int]
    case class Opt[S](el:Pat[S]) extends Pat[Option[S]]
  }

  def test[T](s:Seg[T]):Int = s match {
    case Seg(Pat.Expr(),body)          => body + 1
    case Seg(Pat.Opt(Pat.Expr()),body) => (body: Option[S$1]).get
  }
}

The type of (body: Option[S$1]).get is (Option[S$1]#get : => S$1) (S$1 is a pattern-bound type). We have S$1 <: Int, but for some reason we don't take into account this knowledge during type comparison. The traces are:


==> adapting (body:Option[Int]).get to Int 
?
adapt to subtype (Option[Int]#get : => Int) <:< Int
Subtype trace:
  ==> (Option[Int]#get : => Int) <:< Int  
    ==> (Option[Int]#get : => Int) <:< Int recur 
      ==> Int <:< Int LoApprox 
        ==> Int <:< Int recur 
        <== Int <:< Int recur  = true
      <== Int <:< Int LoApprox  = true
    <== (Option[Int]#get : => Int) <:< Int recur  = true
  <== (Option[Int]#get : => Int) <:< Int   = true
 
==> adapting (body:Option[S$1]).get to Int 
?
adapt to subtype (Option[S$1]#get : => S$1) !<:< Int
Subtype trace:
  ==> (Option[S$1]#get : => S$1) <:< Int  
    ==> (Option[S$1]#get : => S$1) <:< Int recur 
    <== (Option[S$1]#get : => S$1) <:< Int recur  = false
  <== (Option[S$1]#get : => S$1) <:< Int   = false

And after I wrote all the above, it occurs to me that I should just clench my teeth and take another dive into TypeComparer. @smarter if the above tells you anything that'd save me that dive, I'd be happy to hear it.

@smarter
Copy link
Member

smarter commented Jul 8, 2020

No clue sorry, but I assume it has something to do with the fact that the result type is => S$1 and not S$1, so maybe GADT lookup is needed in an extra case related to ExprType.

@abgruszecki abgruszecki force-pushed the fix-gadt-approximation branch from 89b74b9 to e09e489 Compare July 9, 2020 12:12
@abgruszecki abgruszecki marked this pull request as ready for review July 9, 2020 12:14
@abgruszecki
Copy link
Contributor Author

@smarter can you review? The primary things to take a look at are the explanation comment @ 027d7dd#diff-45a98dd466c5e61cb95404238782f22dR179-R205 and e09e489, which fixes the issue with subtyping we discussed.

I'm not certain what to do with 027d7dd#diff-e5c61d67affb377795aa901c89a20cd3. It previously accidentally typechecked because of GADT approximation, but now it doesn't. At the same time, it should typecheck in principle. It started as being a neg test. Should I maybe open an issue for it?

@abgruszecki abgruszecki requested a review from smarter July 9, 2020 12:20
Comment on lines +193 to +194
* - If we have GADT cstr A <: Int, then for all A <: Int, Option[A] <: Option[Int].
* Therefore, we can approximate Option[A] ~~ Option[Int].
Copy link
Member

@smarter smarter Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still concerned about the stuff I mentioned in #9274 (comment), it seems that if I have a method def foo[T <: X] = ..., then I immediately get a GADT constraint T <: X, and based on this documentation I would guess this means that ApproximateGadtAccumulator will replace Option[T] by Option[X], even though there's no need to do any approximation and nothing GADT related is actually going on?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I now realize that approximateGADT is only called when the type of the tree is not a subtype of the expected type, so it's probably OK technically, but I think it's confusing and not great for performance that we run this stuff in situations which have nothing to do with GADTs. A few suggestions:

  • Add some documentation above the call to approximateGADT in adaptToSubType, describing in what situation this might help (e.g., with a simple example)
  • As far as I can tell, the result of the approximation is ignored unless pt is a SelectjonProto and ctx.gadt.nonEmpty, so we should only run the type map when these conditions are true, this is better for performance and for understanding what's going on.
  • In Incorrect code ends up passing the typechecker due to GADT logic misfiring #9274 you said:

    it's not possible to tell whether a symbol is a type parameter of an enclosing function after descending into its body; therefore, we need to stick those symbols somewhere when we see them and we stick them into GadtConstraint
    Type parameters have the TypeParam flag set, and I think method type parameters should always be TypeRefs where the prefix is NoPrefix, is that enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. documentation: I'll add the doc to code in adaptToSubType. We need to approximate GADTs in adapt in order to look up members that should be there b/c of GADT constraints. We cannot take GADT constraints into account when doing normal member lookup b/c of cache. Sample code is https://github.com/lampepfl/dotty/pull/9322/files#diff-70e7287c36d5e6a572c97c2b51613049R1-R9.

Re. lazily performing the approximation: will do, doing it eagerly is a remainder from the old recover, which also used the result of approx.

Re. type parameters: key issue is telling whether the type parameters are from an enclosing method, or from somewhere else (i.e. a class?) Now that I think about it, there may be some way of doing so. Question is, do we actually get any performance out of it. I'd prefer to know for sure that there's a performance issue with this code before trying to change it.

Final concern: can you take a look @ https://github.com/lampepfl/dotty/pull/9322/files#diff-1b5c8ce1eaca91d793eef217487d5a59R754 and see whether that change makes sense to you? isNewSubType would at most return false early before doing the type comparison, so I cannot really see any way that this change is wrong, but at the same time all the justification I have for it is that no test broke and it fixes my issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. type parameters: key issue is telling whether the type parameters are from an enclosing method, or from somewhere else (i.e. a class?)

Check if the owner is a method, type parameters from methods which are not enclosing should not be in scope at all.

Final concern: can you take a look @ #9322 (files) and see whether that change makes sense to you?

I guess that usage of isNewSubType was incorrect before? Maybe something worth asking Martin about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. type parameters: key issue is telling whether the type parameters are from an enclosing method, or from somewhere else (i.e. a class?)

Check if the owner is a method, type parameters from methods which are not enclosing should not be in scope at all.

Yeah, that's what I was thinking about when I said "there may be some way of doing so". I'd still like an indication that we have a performance problem here.

Final concern: can you take a look @ #9322 (files) and see whether that change makes sense to you?

I guess that usage of isNewSubType was incorrect before? Maybe something worth asking Martin about.

@odersky can you take a look at the single line modified in TypeComparer in https://github.com/lampepfl/dotty/pull/9322/files#diff-1b5c8ce1eaca91d793eef217487d5a59R754 and check if the change looks OK to you? The problem that this change fixes is #9322 (comment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like an indication that we have a performance problem here.

Even if it's not a performance problem, I think it's nicer if GADT code doesn't run when nothing GADT related is going on: it makes it easier to know what code can be safely ignored when trying to understand or debug something, and it makes the output of debug printers easier to read (especially the subtyping printer which is hard enough to follow in the best of situations), but this isn't a blocker for this PR, and if you think it would make the resulting GADT logic too complicated then we can keep the current logic of course.

Comment on lines +193 to +194
* - If we have GADT cstr A <: Int, then for all A <: Int, Option[A] <: Option[Int].
* Therefore, we can approximate Option[A] ~~ Option[Int].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I now realize that approximateGADT is only called when the type of the tree is not a subtype of the expected type, so it's probably OK technically, but I think it's confusing and not great for performance that we run this stuff in situations which have nothing to do with GADTs. A few suggestions:

  • Add some documentation above the call to approximateGADT in adaptToSubType, describing in what situation this might help (e.g., with a simple example)
  • As far as I can tell, the result of the approximation is ignored unless pt is a SelectjonProto and ctx.gadt.nonEmpty, so we should only run the type map when these conditions are true, this is better for performance and for understanding what's going on.
  • In Incorrect code ends up passing the typechecker due to GADT logic misfiring #9274 you said:

    it's not possible to tell whether a symbol is a type parameter of an enclosing function after descending into its body; therefore, we need to stick those symbols somewhere when we see them and we stick them into GadtConstraint
    Type parameters have the TypeParam flag set, and I think method type parameters should always be TypeRefs where the prefix is NoPrefix, is that enough?

Explanations are in the relevant doc.

Provisionally restored GADT-approximating version of `recover` in
`adaptToSubType`, as it turns out to be necessary to typecheck i7044.scala.

Moved boundspropagation.scala to pending.
Originally, ApproximateGadtAccumulator allowed it to typecheck, so it
was added as a test. Since `recover` was provisionally restored, the
test succeeds again. However, as GADT approximation is only meant to fix
GADT member selection issues and the example does not involve those,
we probably don't want it as a pos test. At the same time, it should
typecheck, so we shouldn't delete it either.
Add an explanatory comment and calculate GADT approx lazily.
@abgruszecki abgruszecki force-pushed the fix-gadt-approximation branch from a2e2076 to 72916ea Compare July 13, 2020 08:33
// Member lookup cannot take GADTs into account b/c of cache, so we
// approximate types based on GADT constraints instead. For an example,
// see MemberHealing in gadt-approximation-interaction.scala.
lazy val gadtApprox = Inferencing.approximateGADT(wtp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lazy val is overkill here, instead I would do something like:

pt match {
  case pt: SelectionProto if ctx.gadt.nonEmpty =>
    val gadtApprox = Inferencing.approximateGADT(wtp)
    gadts.println(...)
    if (pt.isMatchedBy(gadtApprox))
      return tpd.Typed(tree, TypeTree(gadtApprox))
  case _ =>
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't mesh well with the debug statement, which would be best printed both when pt is and isn't a SelectionProto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the concern is about performance, I seriously would not care about it unless it shows up in the performance test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm more concerned by clarity than peformance here, also I'm not convinced that we should print anything when pt is not a selectionproto: if I turn on the GADT printer, I want to see what the GADT logic has been doing, but I don't want my output full of messages about situation where it didn't do anything

@abgruszecki
Copy link
Contributor Author

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/9322/ to see the changes.

Benchmarks is based on merging with master (608baf4)

@abgruszecki
Copy link
Contributor Author

@smarter I adjusted the logging logic as you suggested, and removed the tree attachment. If you don't have further feedback, I'd merge this PR - since Martin is on vacation I don't think we want to bother him with reviewing a single line + I'm reasonably confident that that change will at most get us into an infinite loop in type comparison, which is somewhat easy to detect.

@smarter smarter merged commit 001817c into scala:master Jul 14, 2020
@smarter smarter deleted the fix-gadt-approximation branch July 14, 2020 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants