Skip to content

Consider removing while from the language #5440

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
propensive opened this issue Nov 13, 2018 · 20 comments
Closed

Consider removing while from the language #5440

propensive opened this issue Nov 13, 2018 · 20 comments

Comments

@propensive
Copy link
Contributor

It seems that while does not need to be a part of the core language. A simple definition such as,

@tailrec
def while(pred: => Boolean)(fn: => Unit): Unit = if(pred) { fn; while(pred)(fn) }

in Predef would suffice to provide the same functionality, and would enable a slight reduction in the size of the core language. I don't know what difference (if any) it would make to performance characteristics. There would be a potential disadvantage that it would become possible to shadow the while method, but not by existing code, of course.

Something similar would probably be possible with do/while syntax.

@nairbv
Copy link

nairbv commented Nov 13, 2018

It looks like it was added as a primitive for performance reasons. In "Scala By Example," Odersky includes essentially the same as your above code with this description:

For efficiency and better error diagnostics the while loop is a primitive construct in
Scala. But in principle, it could have just as well been a predefined function. Here is
a possible implementation of it:

def While (p: => Boolean) (s: => Unit) {
  if (p) { s ; While(p)(s) }
}

@propensive
Copy link
Contributor Author

It would be interesting to revisit the performance argument in case more recent optimisations render the difference irrelevant, in particular now that we have inline.

@adamgfraser
Copy link
Contributor

I tried playing with this a bit. It looks like the tail recursive implementation is about three times slower in my admittedly simplistic test. I imagine that given how common iteration is in lower level code this wouldn't make sense. Note that I wasn't able to use inline here, receiving the following error message: Maximal number of successive inlines (32) exceeded, Maybe this is caused by a recursive inline method?.

import scala.annotation.tailrec

object Comparison extends App {

  val N = 100000
  val as = (1 to N).toIndexedSeq

  println("Native while loop:")
  timeit(10000)(nativeSum(as))
  println("Tail recursive implementation:")
  timeit(10000)(tailRecSum(as))

  @tailrec
  def While(condition: => Boolean)(action: => Unit): Unit =
    if (condition) { action; While(condition)(action) }

  def nativeSum(as: IndexedSeq[Int]): Int = {
    var i = 0
    var sum = 0
    while (i < as.length) {
      sum += as(i)
      i += 1
    }
    sum
  }

  def tailRecSum(as: IndexedSeq[Int]): Int = {
    var i = 0
    var sum = 0
    While (i < as.length) {
      sum += as(i)
      i += 1
    }
    sum
  }

  def timeit(n: Int)(task: => Unit): Unit = {
    val start = System.currentTimeMillis
    (0 to n).foreach { _ => task }
    val stop = System.currentTimeMillis
    println((stop - start) / 1000.0 + " seconds")
  }
}
Native while loop:
4.089 seconds
Tail recursive implementation:
11.669 seconds

@nairbv
Copy link

nairbv commented Nov 14, 2018

The "Scala by example" book also mentioned "better error diagnostics." Maybe the extra lines in each stack trace is what he means?

scala> While(true) { throw new Exception("thrown from function while block") }
java.lang.Exception: thrown from function while block
  at .$anonfun$res4$2(<console>:14)
  at scala.Function0.apply$mcV$sp(Function0.scala:34)
  at .While(<console>:14)
  ... 28 elided

scala> While(throw new Exception("thrown from functional condition check")) { println("in block") }
java.lang.Exception: thrown from functional condition check
  at .$anonfun$res5$1(<console>:14)
  at scala.Function0.apply$mcZ$sp(Function0.scala:34)
  at .While(<console>:14)
  ... 28 elided

scala> while(true) { throw new Exception("thrown from regular while block") }
java.lang.Exception: thrown from regular while block
  ... 28 elided

scala> while(throw new Exception("thrown from regular condition check")) { println("in block") }
java.lang.Exception: thrown from regular condition check
  ... 28 elided

@pnf
Copy link

pnf commented Nov 14, 2018

@adamgfraser Re timing: what I would worry about most is short loops where the JIT doesn't get enough reps to deem it worth inlining. jmh would be useful here. It would also be interesting to compare the timings and the generated assembler using graal, which can be shockingly better on higher-order functions.

@pnf
Copy link

pnf commented Nov 14, 2018

Also: I mentioned on a twitter thread already, but it's interesting that the Dotty documentation contains an example with both inline and while keywords:

inline def foreach(op: => Int => Unit): Unit = {
  var i = from
  while (i < end) {
    op(i)
    i += 1
  }
}

The following might be my personal quirks, but I tend to think

  1. Having constructs like while available to extract maximum performance in core infrastructure is totally consistent with Scala philosophy.
  2. Higher-order functions taking impure closures are weird; if I were looking for controversial deprecations, I'd axe foreach.

@He-Pin
Copy link
Contributor

He-Pin commented Nov 15, 2018

Remove it and let the compiler do optimization ?

@joroKr21
Copy link
Member

Adding inline essentially unrolls the loop which is not what you want.

@joroKr21
Copy link
Member

What you need to do instead of inline is a macro (I'm not sure what the syntax is these days so consider this pseudo code):

macro def while(p: Expr[Boolean])(b: Expr[Unit]): Unit = '{
  def w(): Unit = if (~p) { ~b; w() }
  w()
} 

But it' s probably not worth doing. It's more interesting to automatically transform a mutating while loop into a pure tailrec method which takes all variables as arguments, but that's a lot more involved.

@LPTK
Copy link
Contributor

LPTK commented Nov 15, 2018

@joroKr21 why would you need a macro ?

I think you could do it as:

inline def doWhile(p: => Bool): Unit = {
  @tailrec def rec: Unit = if (p) rec
  rec
}

AFAIK the by-name parameter to the inline function will be copied to its use sites (here, the condition of the if) in the inlined body.

@adamgfraser
Copy link
Contributor

I was able to use inline with the method like this:

  inline def While(condition: => Boolean)(action: => Unit): Unit = {
    @tailrec
    def go: Unit = if (condition) { action; go }
    go
  }

With that implementation and testing with JMH, the inline version is slightly faster than a native while loop. The inline version is dramatically faster than the tail recursive version without inlining. So it looks like we could potentially replace while with a library method without compromising performance.

[info] Benchmark              Mode  Cnt  Score   Error   Units
[info] TestWhile.inlineSum   thrpt   20  2.150 ± 0.044  ops/ms
[info] TestWhile.nativeSum   thrpt   20  1.967 ± 0.040  ops/ms
[info] TestWhile.tailRecSum  thrpt   20  0.885 ± 0.050  ops/ms

I also checked the error reporting and we still get nice error messages. I'm not sure how much this all buys us because we're still using highly impure actions that have the type => Unit and with @tailrec we're still compiling down to a while loop on the JVM. But it is definitely interesting that we can implement this without any performance penalty, and potentially with a slight speedup, by using inline. Does it make sense to pursue this further?

package bench

import java.util.concurrent.TimeUnit

import scala.annotation.tailrec

import org.openjdk.jmh.annotations._

@OutputTimeUnit(TimeUnit.MILLISECONDS)
@BenchmarkMode(Array(Mode.Throughput))
@State(Scope.Benchmark)
class TestWhile {

  val as = (1 to 100000).toIndexedSeq

  inline def While(condition: => Boolean)(action: => Unit): Unit = {
    @tailrec
    def go: Unit = if (condition) { action; go }
    go
  }

  @tailrec
  final def While0(condition: => Boolean)(action: => Unit): Unit =
    if (condition) { action; While0(condition)(action) }

  @Benchmark
  def nativeSum: Int = {
    var i = 0
    var sum = 0
    while (i < as.length) {
      sum += as(i)
      i += 1
    }
    sum
  }

  @Benchmark
  def inlineSum: Int = {
    var i = 0
    var sum = 0
    While (i < as.length) {
      sum += as(i)
      i += 1
    }
    sum
  }

  @Benchmark
  def tailRecSum: Int = {
    var i = 0
    var sum = 0
    While0 (i < as.length) {
      sum += as(i)
      i += 1
    }
    sum
  }
}
import scala.annotation.tailrec

object Test {

  def main(args: Array[String]): Unit =
    failingSum(as)

  def as = (1 to 100000).toIndexedSeq

  def failingSum(as: IndexedSeq[Int]): Int = {
    var i = 0
    var sum = 0
    While (i < as.length) {
      if (as(i) > 50000) throw new Exception("Boom!")
      sum += as(i)
      i += 1
    }
    sum
  }

  inline def While(condition: => Boolean)(action: => Unit): Unit = {
    @tailrec
    def go: Unit = if (condition) { action; go }
    go
  }
}
[error] (run-main-3) java.lang.Exception: Boom!
[error] java.lang.Exception: Boom!
[error]         at Test$.go$1(Test.scala:14)
[error]         at Test$.failingSum(Test.scala:24)
[error]         at Test$.main(Test.scala:6)
[error]         at Test.main(Test.scala)
[error]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error]         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error]         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error]         at java.lang.reflect.Method.invoke(Method.java:498)

@propensive
Copy link
Contributor Author

Thanks for investing the time into benchmarking this, @adamgfraser - the results are certainly interesting!

I also mentioned this to @odersky recently, and he suggested that the do/while variant could be removed completely. It looks like we might be able to avoid it, but as a general rule, any reduction in performance of while would be a blocker, though.

@adamgfraser
Copy link
Contributor

@propensive That totally makes sense. Is there anything else I can do to help? The results are encouraging but it is just one benchmark. Is there a broader set of benchmarks that we could use to determine whether there would be any reduction in performance?

@LPTK
Copy link
Contributor

LPTK commented Nov 18, 2018

It seems nobody mentioned performance of the compiler. Compiling a while statement is probably much faster than inlining a def and optimizing it away.

If this change made while-heavy projects slower to compile (if there are any?), then it'd be a bad idea.

@pnf
Copy link

pnf commented Nov 18, 2018 via email

@pnf
Copy link

pnf commented Nov 19, 2018

The graal numbers are spectacularly better. tailrecSum now matches inlineSum, which is reassuring. From the byte code, it seems that the inlining in inlineSum just goes as far as inlining a call to go$1, which is tail-recursive, so it's really the same as tailrecSum except with one extra layer. I'm guessing that C2 is does something amazingly awful with tailrecSum, but I haven't gotten around to inspecting the assembly yet, or digging into why LMF is used in some places but not others.

nativeSum is still slightly lower, which surprises me, because it's doing less work (specifically, not repeatedly fetching and then updating the i and sum values). I'm still find the use of two non-RT closures referring by necessity to closed-over mutables to be sort of perverse.

This is a macbook i5-8259U 2.3GHz with turbo shut off, fwiw.

graal:
[info] TestWhile.inlineSum   thrpt   10  11.308 ± 0.050  ops/ms                                   
[info] TestWhile.nativeSum   thrpt   10   9.644 ± 0.014  ops/ms
[info] TestWhile.tailRecSum  thrpt   10  11.309 ± 0.037  ops/ms
1.8:
[info] TestWhile.inlineSum   thrpt   10  2.272 ± 0.061  ops/ms
[info] TestWhile.nativeSum   thrpt   10  2.043 ± 0.070  ops/ms
[info] TestWhile.tailRecSum  thrpt   10  1.004 ± 0.001  ops/ms

Compiled from "Bench1.scala"
public class test.TestWhile {
private final scala.collection.immutable.IndexedSeq as;

public test.TestWhile();
Code:
0: aload_0
1: invokespecial #26 // Method java/lang/Object."":()V
4: aload_0
5: getstatic #32 // Field scala/runtime/RichInt$.MODULE$:Lscala/runtime/RichInt$;
8: getstatic #37 // Field scala/Predef$.MODULE$:Lscala/Predef$;
11: iconst_1
12: invokevirtual #43 // Method scala/LowPriorityImplicits.intWrapper:(I)I
15: ldc #44 // int 100000
17: invokevirtual #48 // Method scala/runtime/RichInt$.to$extension0:(II)Lscala/collection/immutable/Range$Inclusive;
20: invokeinterface #54, 1 // InterfaceMethod scala/collection/immutable/IndexedSeq.toIndexedSeq:()Lscala/collection/immutable/IndexedSeq;
25: putfield #56 // Field as:Lscala/collection/immutable/IndexedSeq;
28: return
LineNumberTable:
line 17: 0
line 19: 4
LocalVariableTable:
Start Length Slot Name Signature
0 29 0 this Ltest/TestWhile;

public scala.collection.immutable.IndexedSeq<java.lang.Object> as();
Code:
0: aload_0
1: getfield #56 // Field as:Lscala/collection/immutable/IndexedSeq;
4: areturn
LineNumberTable:
line 19: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Ltest/TestWhile;

public final void While0(scala.Function0, scala.Function0);
Code:
0: goto 30
3: aload 4
5: invokeinterface #66, 1 // InterfaceMethod scala/Function0.apply:()Ljava/lang/Object;
10: invokestatic #72 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z
13: ifeq 27
16: aload 5
18: invokeinterface #66, 1 // InterfaceMethod scala/Function0.apply:()Ljava/lang/Object;
23: pop
24: goto 3
27: goto 41
30: aload_0
31: aload_1
32: aload_2
33: astore 5
35: astore 4
37: astore_3
38: goto 3
41: return
LineNumberTable:
line 29: 0
LocalVariableTable:
Start Length Slot Name Signature
0 42 0 this Ltest/TestWhile;
0 42 1 condition Lscala/Function0;
0 42 2 action Lscala/Function0;

public int nativeSum();
Code:
0: iconst_0
1: istore_1
2: iconst_0
3: istore_2
4: iload_1
5: aload_0
6: invokevirtual #80 // Method as:()Lscala/collection/immutable/IndexedSeq;
9: invokeinterface #85, 1 // InterfaceMethod scala/collection/SeqLike.length:()I
14: if_icmpge 40
17: iload_2
18: aload_0
19: invokevirtual #80 // Method as:()Lscala/collection/immutable/IndexedSeq;
22: iload_1
23: invokeinterface #88, 2 // InterfaceMethod scala/collection/SeqLike.apply:(I)Ljava/lang/Object;
28: invokestatic #92 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
31: iadd
32: istore_2
33: iload_1
34: iconst_1
35: iadd
36: istore_1
37: goto 4
40: iload_2
41: ireturn
LineNumberTable:
line 32: 0
line 33: 0
line 34: 2
line 35: 4
line 36: 17
line 37: 33
line 39: 40
LocalVariableTable:
Start Length Slot Name Signature
1 40 1 i I
3 38 2 sum I
0 42 0 this Ltest/TestWhile;

public int inlineSum();
Code:
0: iconst_0
1: invokestatic #102 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
4: astore_1
5: iconst_0
6: invokestatic #102 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
9: astore_2
10: aload_0
11: aload_1
12: aload_2
13: invokespecial #106 // Method go$1:(Lscala/runtime/IntRef;Lscala/runtime/IntRef;)V
16: getstatic #112 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
19: pop
20: aload_2
21: getfield #115 // Field scala/runtime/IntRef.elem:I
24: ireturn
LineNumberTable:
line 43: 0
line 44: 0
line 45: 5
line 24: 10
line 50: 20
LocalVariableTable:
Start Length Slot Name Signature
4 20 1 i Lscala/runtime/IntRef;
9 15 2 sum Lscala/runtime/IntRef;
0 25 0 this Ltest/TestWhile;

public int tailRecSum();
Code:
0: iconst_0
1: invokestatic #102 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
4: astore_1
5: iconst_0
6: invokestatic #102 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
9: astore_2
10: aload_0
11: aload_0
12: aload_1
13: invokedynamic #135, 0 // InvokeDynamic #0:apply$mcZ$sp:(Ltest/TestWhile;Lscala/runtime/IntRef;)Lscala/compat/java8/JFunction0$mcZ$sp;
18: aload_0
19: aload_1
20: aload_2
21: invokedynamic #144, 0 // InvokeDynamic #1:apply$mcV$sp:(Ltest/TestWhile;Lscala/runtime/IntRef;Lscala/runtime/IntRef;)Lscala/compat/java8/JFunction0$mcV$sp;
26: invokevirtual #146 // Method While0:(Lscala/Function0;Lscala/Function0;)V
29: aload_2
30: getfield #115 // Field scala/runtime/IntRef.elem:I
33: ireturn
LineNumberTable:
line 54: 0
line 55: 0
line 56: 5
line 57: 10
line 60: 18
line 61: 29
LocalVariableTable:
Start Length Slot Name Signature
4 29 1 i Lscala/runtime/IntRef;
9 24 2 sum Lscala/runtime/IntRef;
0 34 0 this Ltest/TestWhile;

private final void go$1(scala.runtime.IntRef, scala.runtime.IntRef);
Code:
0: goto 66
3: aload_1
4: getfield #115 // Field scala/runtime/IntRef.elem:I
7: aload_0
8: invokevirtual #80 // Method as:()Lscala/collection/immutable/IndexedSeq;
11: invokeinterface #85, 1 // InterfaceMethod scala/collection/SeqLike.length:()I
16: if_icmpge 63
19: aload_2
20: getfield #115 // Field scala/runtime/IntRef.elem:I
23: aload_0
24: invokevirtual #80 // Method as:()Lscala/collection/immutable/IndexedSeq;
27: aload_1
28: getfield #115 // Field scala/runtime/IntRef.elem:I
31: invokeinterface #88, 2 // InterfaceMethod scala/collection/SeqLike.apply:(I)Ljava/lang/Object;
36: invokestatic #92 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
39: iadd
40: istore_3
41: aload_2
42: iload_3
43: putfield #115 // Field scala/runtime/IntRef.elem:I
46: aload_1
47: getfield #115 // Field scala/runtime/IntRef.elem:I
50: iconst_1
51: iadd
52: istore 4
54: aload_1
55: iload 4
57: putfield #115 // Field scala/runtime/IntRef.elem:I
60: goto 3
63: goto 69
66: goto 3
69: return
LineNumberTable:
line 23: 0
line 46: 3
line 47: 19
line 48: 46
line 23: 60
LocalVariableTable:
Start Length Slot Name Signature
0 70 0 this Ltest/TestWhile;
0 70 1 i$1 Lscala/runtime/IntRef;
0 70 2 sum$1 Lscala/runtime/IntRef;

private final boolean tailRecSum$$anonfun$1(scala.runtime.IntRef);
Code:
0: aload_1
1: getfield #115 // Field scala/runtime/IntRef.elem:I
4: aload_0
5: invokevirtual #80 // Method as:()Lscala/collection/immutable/IndexedSeq;
8: invokeinterface #85, 1 // InterfaceMethod scala/collection/SeqLike.length:()I
13: if_icmpge 20
16: iconst_1
17: goto 21
20: iconst_0
21: ireturn
LineNumberTable:
line 57: 0
LocalVariableTable:
Start Length Slot Name Signature
0 22 0 this Ltest/TestWhile;
0 22 1 i$2 Lscala/runtime/IntRef;

private final void tailRecSum$$anonfun$2(scala.runtime.IntRef, scala.runtime.IntRef);
Code:
0: aload_2
1: getfield #115 // Field scala/runtime/IntRef.elem:I
4: aload_0
5: invokevirtual #80 // Method as:()Lscala/collection/immutable/IndexedSeq;
8: aload_1
9: getfield #115 // Field scala/runtime/IntRef.elem:I
12: invokeinterface #88, 2 // InterfaceMethod scala/collection/SeqLike.apply:(I)Ljava/lang/Object;
17: invokestatic #92 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
20: iadd
21: istore_3
22: aload_2
23: iload_3
24: putfield #115 // Field scala/runtime/IntRef.elem:I
27: aload_1
28: getfield #115 // Field scala/runtime/IntRef.elem:I
31: iconst_1
32: iadd
33: istore 4
35: aload_1
36: iload 4
38: putfield #115 // Field scala/runtime/IntRef.elem:I
41: return
LineNumberTable:
line 57: 0
line 58: 0
line 59: 27
LocalVariableTable:
Start Length Slot Name Signature
0 42 0 this Ltest/TestWhile;
0 42 1 i$3 Lscala/runtime/IntRef;
0 42 2 sum$2 Lscala/runtime/IntRef;
}

@odersky
Copy link
Contributor

odersky commented Nov 19, 2018

It's actually the other way round. The body of @tailrec methods gets translated to while loops. while is also one of the standard AST kinds in Tasty. So in that sense, while is fundamental whereas @tailrec is derived.

@pnf
Copy link

pnf commented Nov 21, 2018

There's still a bit of a mystery about why nativeSum runs slightly slower than the two @tailRec methods, when the generated bytecode is essentially the same, except that nativeSum doesn't have to call putField and getField repeatedly, which ought to make it faster. The difference persists under graal, and when using Array, and the Intel compilation output is also practically identical. (The more outrageous difference between tailRecSum and inlineSum on jdk8 is just an artifact of C2 insanity.)

One thing I did notice is that in all cases we seem to invoke .length within the tight loop, even though as is immutable.

@propensive
Copy link
Contributor Author

I just wanted to say "thanks" to everyone who's getting involved with this issue: for a somewhat frivolous issue, it's exactly the sort of discussion I hoped to see around it, and the sort of contributions I'm not knowledgeable enough to make myself!

@odersky
Copy link
Contributor

odersky commented Dec 17, 2018

Closing for now since no immediate action is planned. I agree it was a good discussion to have!

@odersky odersky closed this as completed Dec 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants