-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Change case class desugaring and decouple Products and name-based-pattern-matching #1938
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b056cb2
to
b5742d9
Compare
While I like the marker-trait approach and I'd be in favor of it if we were designing a new language, I'd prefer if we don't go for it now. The reason is that scalac actually supports name-based pattern matching, though it wasn't documented, and there are some code-bases that use it(e.g. parser combinators). The proposed scheme is incompatible between the two approaches. The intermediate ground could be to temporarily support both under -language:Scala2 mode, but it would mean implementing both schemes, which would only increase complexity. I think there's a possibility to remain compatible with scalac, by having a scheme that is very close to it, like the current one used by dotty. While I agree that it would be nice to simplify it, I believe, we should remain compatible. |
val caseClassMeths = { | ||
def syntheticProperty(name: TermName, rhs: Tree) = | ||
DefDef(name, Nil, Nil, TypeTree(), rhs).withMods(synthetic) | ||
// The override here is less than ideal: user defined productArity / productElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why? I think it's used a lot in userland and I don't (yet?) see a value in removing it.
// The override here is less than ideal: user defined productArity / productElement | ||
// methods would be silently ignored. This is necessary to compile `scala.TupleN`. | ||
// The long term solution is to remove `ProductN` entirely from stdlib. | ||
def productArity = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's now a LOT more code synthesized per case-class. I'm worried about both the bytecode footprint and the runtime code size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to keep all those methods in super-classes, as it makes it a lot easier for JIT to omptimize a callsite that sees products of the same arity. In your case, it would be non optimizable by contemporary VMs, to the best of my knowledge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @sjrd I think this would also affect Scala.js size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is what scalac does, but I guess we could keep ProductN
when N <= 22
and synthesize only after that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can someone summarize what the effective changes to generated code are, e.g., as snippets of Scala or Java code of what's generated per case class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is case class A[T1, T2](e1: T1, e2: T2)
compiled by 2.12 vs compiled by dotty/master.
The PR replaces the replace the ProductN
superclasses by Product
which require synthesizing productElements
, so the discussion is about reintroducing these lines in dotty.
@DarkDimius I think I'm confusing names, if name based pattern matching refers to Let me summarize my understanding to be sure we are on the same page. The following works as expected on class A(e1: Int, e2: Int, e3: Int) {
def isEmpty = false
def get = this
def _1 = e1
def _2 = e2
def _3 = e3
}
object A {
def unapply(a: A): A = a
} Let's now consider
class A(e1: Int, e2: Int, e3: Int) extends Product // + case flag
object A {
def unapply(a: A): Option[(Int, Int, Int)] = Some((a.e1, a.e2, a.e3))
}
class A(e1: Int, e2: Int, e3: Int) extends Product3[Int, Int, Int]
object A {
def unapply(a: A): A = a
}
class A(e1: Int, e2: Int, e3: Int) extends Product with NameBasedPattern
object A {
def unapply(a: A): A = a
} Pattern matching on 1. with dotty compiles but goes through the unapply, thus allocating a Tuple3 and an Option (can be observed with So using a |
else -1 | ||
/** Is this type eligible for name based pattern matching? | ||
* | ||
* That means either extending `scala.ProductN` or `NameBasedPattern`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd capitalize OR.
It would be also nice to "one day" make scala.ProductN
inherit NameBasedPattern
.
That would simplify the rule even more.
@OlivierBlanvillain yes, we didn't speak about the same thing indeed. Thanks for pointing it out.
Yes, name-based pattern-matching refers to unapply that doesn't return a I'd propose to rename this pr to along the lines "change case class desugaring and decouple Products and name-based-pattern-matching".
This change would also remove most of the new code introduced in this PR as well as the workaround(I guess?). |
@DarkDimius We would still need it them for arity 23 and up. It makes sense to keep ProductN below that, I just need to add a test case to make sure that the synthesized code is covered :) |
c53bdae
to
6c56acd
Compare
@OlivierBlanvillain Why the change? Is about the 22 restriction, or are there other reasons? |
@odersky Yes, it's about the 22 restriction. I need something like this for my hlist branch where I synthesize TupleN for N > 22 and erase them to arrays. An alternative would be to also synthesize & erase ProductN as needed. |
So that means with the current PR case classes stop implementing ProductN traits once they have more than 22 parameters? in that case I'd prefer to generalize Product instead. |
I like the idea of disentangling ProductN and name-based pattern matching, since the future status of ProductN is unclear. But I am reluctant to introduce yet another marker trait. Can we try instead with |
Things mostly work using This is non trivial because of types such as In the last commit I swapped the priority of |
In presence of both modes I would prefer the name-based patternatching to
be preferred, as it is more efficient.
I also think that we should not break source compatibility like the last
change does. It makes it impossible to compile into efficient bytecode
without source-incompatible changes to public api of the library. Its as if
we are propagating the problem we didn't solve to library authors.
…On 22 February 2017 10:09:29 am Olivier Blanvillain ***@***.***> wrote:
Things mostly work using `scala.Product` instead of a new marker trait for
"product pattern matching" (the one tuples and other case classes that
calls `_i` directly).
This is non trivial because of types such as `Option` and `List` that are
already valid return type for unapplys but also extend `Product`, thus
becoming candidate for two different patterns.
In the last commit I swapped the priority of `isProductMatch` and
`isGetMatch` tests which is enough to make things work for `Option` and
`List`. However it breaks the "Scala-parser-combinators use case" where a
single extractor is used with two different type of patterns. I could not
find a way to make this one work without breaking something else... Could
we deprecate this corner case?
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#1938 (comment)
|
f3b7cdd
to
0a3fb0f
Compare
I agree with @DarkDimius that I'd prefer product matches to take precedence over get matches.
I thought the test would be "extends Product and has the right number of |
721cc47
to
b19b134
Compare
I've found a way to make Actually |
Ouch. I had not thought of that. I guess that swings the balance in favor of using a new subtrait of |
If we ignore this zero arity case we could even avoid using a marker trait altogether by saying that the Also, it looks like scalac desugars |
Sent from my iPhone
On 24 Feb 2017, at 10:50, Olivier Blanvillain ***@***.***> wrote:
If we ignore this zero arity case we could even avoid using a marker trait altogether by saying that the def _1 method serves as a marker for product pattern matching. It would be more consistent with the rest of the language which seams is mostly name based.
Also, it looks like scalac desugars case class C() with a def unapply(c: C): Boolean, could we do the same?
Yes let's do that. that means we can leave the other criterion to be "extends Product and has exactly the right number of constructors". I'd tend to still require Product for safety.
… —
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
{"api_version":"1.0","publisher":{"api_key":"05dde50f1d1a384dd78767c55493e4bb","name":"GitHub"},"entity":{"external_key":"github/lampepfl/dotty","title":"lampepfl/dotty","subtitle":"GitHub repository","main_image_url":"https://cloud.githubusercontent.com/assets/143418/17495839/a5054eac-5d88-11e6-95fc-7290892c7bb5.png","avatar_image_url":"https://cloud.githubusercontent.com/assets/143418/15842166/7c72db34-2c0b-11e6-9aed-b52498112777.png","action":{"name":"Open in ***@***.*** in #1938: If we ignore this zero arity case we could even avoid using a marker trait altogether by saying that the `def _1` method serves as a marker for product pattern matching. It would be more consistent with the rest of the language which seams is mostly name based.\r\n\r\nAlso, it looks like scalac desugars `case class C()` with a `def unapply(c: C): Boolean`, could we do the same?"}],"action":{"name":"View Pull Request","url":"#1938 (comment)"}}}
|
What's the status here? |
70c4083
to
46c680c
Compare
The PR is in sync with the latest comments:
The changes are mergeable from my side! |
Here another attempt to formalize Dotty's pattern-matching (base on #1805). This covers the Extractor Patterns section of the spec. Dotty support 4 different extractor patterns: Boolean Pattern, Product Pattern, Seq Pattern and Name Based Pattern. Boolean Pattern
Product Pattern
Seq Pattern
Name Based Pattern
In case of ambiguities, Product Pattern is preferred over Name Based Pattern. |
@DarkDimius Could you give your opinion on this last comment? |
99a0913
to
b4fbadf
Compare
The spec looks accurate to me. One question concerns the almost equivalence of name-based and product patterns. Should we enforce that the result type of a |
Indeed that would be nice, either by adding a
|
OK, let's drop Product from the name based matching rules. For the moment we should still generate a Product for case classes, because existing code uses productArity and productElement. But dropping Product from the pattern matching rules makes it easier to change that later. |
Product pattern use to: - have a `<: Product` requirement - compute the arity of a pattern by looking at `N` in a `ProductN` superclass. This commit changes `<: Product`, instead we look for a `_1` member. The arity is determined by inspecting `_1` to `_N` members instead. --- Here another attempt to formalize Dotty's pattern-matching (base on scala#1805). This covers the *Extractor Patterns* [section of the spec](https://www.scala-lang.org/files/archive/spec/2.12/08-pattern-matching.html#extractor-patterns). Dotty support 4 different extractor patterns: Boolean Pattern, Product Pattern, Seq Pattern and Name Based Pattern. Boolean Pattern - Extractor defines `def unapply(x: T): Boolean` - Pattern-matching on exactly `0` patterns Product Pattern - Extractor defines `def unapply(x: T): U` - `N > 0` is the maximum number of consecutive (parameterless `def` or `val`) `_1: P1` ... `_N: PN` members in `U` - Pattern-matching on exactly `N` patterns with types `P1, P2, ..., PN` Seq Pattern - Extractor defines `def unapplySeq(x: T): U` - `U` has (parameterless `def` or `val`) members `isEmpty: Boolean` and `get: S` - `S <: Seq[V]` - Pattern-matching on any number of pattern with types `V, V, ..., V` Name Based Pattern - Extractor defines `def unapply(x: T): U` - `U` has (parameterless `def` or `val`) members `isEmpty: Boolean` and `get: S` - If there is exactly `1` pattern, pattern-matching on `1` pattern with type `S` - Otherwise fallback to Product Pattern on type `U` In case of ambiguities, *Product Pattern* is preferred over *Name Based Pattern*.
b4fbadf
to
5bf9d2b
Compare
Updated to decouple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
@@ -846,17 +848,9 @@ class Definitions { | |||
} | |||
|
|||
def isProductSubType(tp: Type)(implicit ctx: Context) = | |||
(tp derivesFrom ProductType.symbol) && tp.baseClasses.exists(isProductClass) | |||
Applications.extractorMemberType(tp, nme._1).exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The isProductSubType
method should be renamed and moved to Applications
.
0 <= numArgs && numArgs <= Definitions.MaxTupleArity && | ||
tp.derivesFrom(defn.ProductNType(numArgs).typeSymbol) | ||
numArgs > 0 && defn.isProductSubType(tp) && | ||
productSelectorTypes(tp).size == numArgs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use productArity
instead?
4699adb
to
198b5ce
Compare
The change in question breaks the following pattern, commonly used in name based pattern matching: ```scala object ProdEmpty { def _1: Int = ??? def _2: String = ??? def isEmpty = true def unapply(s: String): this.type = this def get = this } ``` This type define both `_1` and `get` + `isEmpty` (but is not <: Product). After the changes in scala#1938 it becomes eligibles for both product and name based pattern. Because "in case of ambiguities, *Product Pattern* is preferred over *Name Based Pattern*", isEmpty wouldn't be used, breaking the scalac sementics.
The change in question broke the following pattern, commonly used in name based pattern matching: ```scala object ProdEmpty { def _1: Int = ??? def _2: String = ??? def isEmpty = true def get = this } ``` This type define both `_1` and `get` + `isEmpty` (but is not <: Product). After scala#1938, `ProdEmpty` became eligibles for both product and name based pattern. Because "in case of ambiguities, *Product Pattern* is preferred over *Name Based Pattern*", isEmpty wouldn't be used, breaking the scalac semantics.
This PR changes the name based pattern matching the following way:
Eligibility condition is to extend a
NameBasedPattern
trait (instead of extendingProductN
).The pattern arity and types are determined by looking at implemented
_1
to_N
methods, whereN
is the arity of the last (subsequently) implemented method with such shape (N
was before obtained from theProductN
superclass).As a side effect, this PR lifts the 22 limitation on case classes.
@DarkDimius I remember you mentioning a change in this vein, what are you thoughts?