Skip to content

Use unitialized for wildcard initializers #11231

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 5, 2021

Conversation

odersky
Copy link
Contributor

@odersky odersky commented Jan 27, 2021

Fixes #11225

@sjrd
Copy link
Member

sjrd commented Jan 27, 2021

Whatever we do, if it touches the language and is not a critical issue, it should now wait for 3.1. We set that as a constraint to ourselves.

@odersky odersky force-pushed the fix-11225 branch 2 times, most recently from 864be7c to d719901 Compare January 27, 2021 14:33
@odersky
Copy link
Contributor Author

odersky commented Jan 27, 2021

I think it's such a small thing and such an obvious improvement that we should not hesitate to wait with it. We want to argue that Scala 3 is a lot simpler than Scala 2. This change is a clear win.

@odersky odersky changed the title Use notInitialized for wildcard initializers Use unitialized for wildcard initializers Jan 27, 2021
@odersky
Copy link
Contributor Author

odersky commented Jan 27, 2021

Here's a reason for doing it now. The ProgFun2 MOOC contains two occurrences of the = _ syntax. Now is the last practical time to change the course.

@sjrd
Copy link
Member

sjrd commented Jan 27, 2021

I think it's such a small thing and such an obvious improvement that we should not hesitate to wait with it. We want to argue that Scala 3 is a lot simpler than Scala 2. This change is a clear win.

A lot of things are "clear wins". With clear wins like this we can still be there at Easter. It's a language change, and it is not done as feedback about Scala 3 features (in fact it's not even a Scala 3 feature in the first place). According to the decisions we took so far, we should not do this, no matter how small of clear win it is.

Here's a reason for doing it now. The ProgFun2 MOOC contains two occurrences of the = _ syntax. Now is the last practical time to change the course.

You said yourself that it's an obscure feature. How come ProgFun2 has two occurrences of this stuff? It seems to be in Signal.scala. There are ways to restructure this code not to need = _ in the first place. I can send a PR if you want.

@odersky
Copy link
Contributor Author

odersky commented Jan 27, 2021

@sjrd If you want to give it a go, I'd be interested to have a look at another Signal implementation that does not rely on uninitialized values. But if it should be a candidate for replacement, it has to be as short and as clear as the one given. Even then I am not sure I'll have the stomach to re-record.

@odersky
Copy link
Contributor Author

odersky commented Jan 27, 2021

I also don't want to wait until Easter for 3.0. But adding this will not delay the release at all. We can merge now and be done with it. Correction: I still have to fix the slides for the MOOC and replace 40 seconds of the recording where I explain the syntax. Also we have to communicate the change to book authors. But it's infinitely simpler to do this now than post 3.0.

@sjrd
Copy link
Member

sjrd commented Jan 27, 2021

@sjrd If you want to give it a go, I'd be interested to have a look at another Signal implementation that does not rely on uninitialized values. But if it should be a candidate for replacement, it has to be as short and as clear as the one given. Even then I am not sure I'll have the stomach to re-record.

https://github.com/lampepfl/moocs/pull/592 (this is a link to a private repo)

@sjrd
Copy link
Member

sjrd commented Jan 28, 2021

Other relevant PR, this time public: #11238

@odersky
Copy link
Contributor Author

odersky commented Feb 2, 2021

@nicolasstucki Can you do a technical review for this?

@odersky odersky requested a review from nicolasstucki February 2, 2021 18:07
@odersky odersky assigned nicolasstucki and unassigned odersky Feb 2, 2021
Copy link
Contributor

@nicolasstucki nicolasstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seen how many things break with this approach it seems more reasonable to use well-defined syntax instead.

cpy.ValDef(tree)(rhs = trivialErasedTree(tree))
else tree
else tree.rhs match
case rhs: TypeApply
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also have a case for Block(Nil, rhs: TypeApply) or possibly more nesting.

  var a1: Int = uninitialized
  var a2: Int = { uninitialized } // has an error
  var b1: Unit = uninitialized
  var b2: Unit = { uninitialized } // has an error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If uninitialized then we should also expect aliasing to work

inline def uninitialised[A] = uninitialized[A]

var x: Int = uninitialised

Therefore we would also need to look into Inline(_, _, _).

```scala
import scala.compiletime.uninitialized

var x: A = uninitialized
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we also support

  var a = uninitialized[A]

This seems to be leaking implementation details.

Copy link
Contributor Author

@odersky odersky Feb 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. We it's better to define it like this:

@compileTimeOnly("`uninitialized` can only be used as the right hand side of a mutable field definition")
def uninitialized: Nothing = ???

Copy link
Contributor

@nicolasstucki nicolasstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seen how many things break with this approach it seems more reasonable to use well-defined syntax instead.

@odersky
Copy link
Contributor Author

odersky commented Feb 3, 2021

@nicolasstucki Yes, let's use @compileTimeOnly. We should also go over the books with erased to see whether we want to keep the current behavior. I found it surprising, even though I might have been advocating for it at some point. [EDIT: Actually, I don't think I ever advocated for it. ]

Intuitively, the way I thought erased should work was as follows:

  • Erased definitions (including inline definitions and erased parameters) are dropped.
  • Arguments that get passed to erased parameters are also dropped
  • No reference to an erased symbol is supposed to survive after that

@odersky
Copy link
Contributor Author

odersky commented Feb 3, 2021

@nicolasstucki I addressed the review comments. There's one remaining corner case: if uninitialized gets removed by inlining we do not flag it. That's just what compileTimeOnly and erased imply, so we should not doctor with it to
change the behavior.

Copy link
Contributor

@nicolasstucki nicolasstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a couple of bugs and technical inconsistencies.

Given that the following case slipped through the test cases.

class Foo:
  var y0: Int = uninitialized // error

I would suggest improving test coverage by porting a few existing tests with = _ to use = uninitialized.

* Finally, the phase replaces `compiletime.uninitialized` on the right hand side
* of a mutable field definition by `_`. This avoids a "is declared erased, but is
* in fact used" error in Erasure and communicates to Constructors that the
* variable does not have an initializer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uninitialized is not erased anymore and the transformation has nothing to do with pruning erased definitions. It is also unrelated to erased in general as this operation does produce a value at runtime (even though it is elided). This should be done somewhere else. FirstTransform might be a good candidate as we are transforming var x: T = uninitialized to the canonical form that erasure understands.

inline def g(inline x: Int): Unit = ()
def f2 = g(uninitialized) // this one is ok since `uninitialized` is inlined away

var x7: Int = uni // error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be ok as the code we end up with is

  var x7: Int = uni  // ok
// after inlining
  var x7: Int = uninitialized // still ok

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, we would need to make g(uninitialized) to be consistent with inlining/erased rules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an error since we end up with Inlined(uninitialized). I think that's OK. Ideally we would not allow the inline def of uni in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it would be inconsistent with var x4: Int = if false then uninitialized else 1 // ok, since inlined away and def f2 = g(uninitialized) // this one is ok since uninitialized is inlined away. We would have some special inlining rules that only apply uninitialized which will confuse everyone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need not overthink this. The intention is that you should not use uninitialized except as the RHS of a field definition. For all practical purposes that's what this PR achieves. We don't need to split hairs beyond that,

Fix recognition of uniniialized if type of var is a context function.
This required a special case in typedValDef for `var x: A ?=> B = _`.
Now it requires a special case in `PruneErasedDefs` in `var x: A ?=> B = undefined`.
@odersky odersky merged commit 36159bb into scala:master Feb 5, 2021
@odersky odersky deleted the fix-11225 branch February 5, 2021 12:01
@djspiewak
Copy link

@odersky I realized today while pondering this (apparently too late) that this is more disruptive than I realized. The problem stems from the fact that this cannot be shimmed by a uniform inherited conditional compilation strategy. What I mean by that is abusing src/main/scala-2/src/main/scala-3 directories and having some common parent type which encapsulates the differences between the languages.

In this case, the difference which must be encapsulated is the initialization semantic in the constructor. A concrete example is instructive: https://github.com/typelevel/cats-effect/blob/bde1abdb8aa80b1d0d765db33f7ba1825d529978/core/shared/src/main/scala/cats/effect/IOFiber.scala#L91-L92

The problem here is that these definitions cannot be extracted into the parent type without materially impacting their runtime semantics. Extraction into the parent is the standard trick here because it allows the conditional compilation to be isolated to just the parts of the implementation which must be fundamentally distinct between language versions. You can see a related example of this in that same type:

Unfortunately, in this case it is impossible to isolate the changes to just a parent type, exposing them via inheritance, without sacrificing some measurable elements of the current system. This is for two reasons. First, moving variables to a supertype changes their location in the object header, which has an impact on page faults and, in the worst case, can create added indirection (I'd have to actually test it, but I believe this particular case would indeed result in added indirection due to the need to retain some variables in the child class). This is a JVM limitation. Second, and in a sense, worse: the encoding of Scala's protected access modifier and uniform vars in the language means that it is impossible to obtain the same performance characteristics as a private[this] var from any other access modifier. All other access modifiers are encoded as methods which wrap instance fields, which actually (perhaps surprisingly!) has a measurable performance cost even after HotSpot has inlined the method body (which it will in most, but not all, cases).

And here we have the real crux of the problem: no one is using this syntax except in exceptionally performance-sensitive contexts, such as the file I linked. By definition, this feature is only meaningful for people who are sensitive to these kinds of performance degradations. But also by definition, it is impossible to avoid these kinds of degradations without doing one of two things:

  • Giving up on cross-publishing for both Scala 2 and Scala 3 (the nuclear option that no one wants)
  • Duplicating the entire file between Scala 2 and Scala 3 just for the sake of the new uninitialized syntax

The latter is the only option which will allow Cats Effect (and similar projects) to continue cross-publishing for Dotty, but unfortunately it also means we will be forced to needlessly duplicate the entire file in two source directories and manually keep them in sync. I invite you to peruse the file's contents further and judge for yourself how sustainable this is likely to be.

All in all, I come back to my original premise: this is simply not worth it. This change imposes a perhaps-surprisingly significant burden on an already-overstretched ecosystem. It would be one thing if this were an enormously beneficial language refinement (such as removing unbounded type projection) which carried manifest benefits across multiple domains, but it's just... not. This is a tiny bit of syntax that most people never see, and arguably isn't even that terrible when you do see it. The marginal gains are small, and the marginal losses are substantial and self-compounding.

Perhaps this is too late, but honestly and sincerely, I urge you to reconsider.

@dwijnand
Copy link
Member

dwijnand commented Feb 6, 2021

A third option is back-porting this to 2.13 and 2.12.

@LPTK
Copy link
Contributor

LPTK commented Feb 6, 2021

As a first step, why not just add a shim in src/main/scala-2/?

def uninitialized[A]: A = null.asInstanceOf[A]

Then we can see if this actually results in any visible performance regression. My guess would be that it doesn't, though one can never be sure with the JVM.

@djspiewak
Copy link

djspiewak commented Feb 6, 2021

@LPTK I've actually tested the performance regression in this case previously. It is quite measurable. Particularly since at least one of those fields is volatile, so you're adding a memory barrier where one did not previously exist.

Backporting is an option here. Also the fact that it's only deprecated and not removed (I missed that in my PR review) helps quite a bit (and avoids the worst case scenarios I described), though it does mean that fatal warnings are off the table for a while, and we will just have this same discussion again when 3.1 or 3.2 rolls around.

@smarter
Copy link
Member

smarter commented Feb 6, 2021

and we will just have this same discussion again when 3.1 or 3.2 rolls around.

There's a bunch of things which are currently deprecated under -source 3.1 but since we'd like to be free to bump the minor version at any time, it's very likely we're going to move that to -source future and put off the deprecation until we can drop scala 2 support or scala 2 -Xsource:3 supports the new form of things we want to deprecate, cf #11270

@dwijnand
Copy link
Member

dwijnand commented Feb 8, 2021

Would a fourth option be making both compilers elide the bytecode emitted by null.asInstanceOf[A] or does that remove legitimate bytecode?

@djspiewak
Copy link

Would a fourth option be making both compilers elide the bytecode emitted by null.asInstanceOf[A] or does that remove legitimate bytecode?

It would mostly remove legitimate bytecode. Specifically, it would need to remove x = null.asInstanceOf[A], but only in the initial declaration of the field (so, in the constructor).

@sjrd
Copy link
Member

sjrd commented Feb 9, 2021

That could alter the meaning of some (admittedly poorly designed) programs. It could be that something assigns a non-zero value to the field before we get to its initialization code (for example, from the super constructor, through a virtual method, or more simply from previous statements in the template). The explicitly assignment to null is supposed to erase that value. If you remove its bytecode, you remove that effect and change the meaning of the program.

@anatoliykmetyuk anatoliykmetyuk added the release-notes Should be mentioned in the release notes label Feb 10, 2021
@Kordyjan Kordyjan added this to the 3.0.0 milestone Aug 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes Should be mentioned in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Get rid of = _ in variable definitions
9 participants