This repository was archived by the owner on Sep 1, 2020. It is now read-only.

Allow primes at the end of identifiers. #29

Merged
merged 1 commit into from
Sep 6, 2014

Conversation

milessabin
Member

No description provided.

@puffnfresh

👍

@milessabin
Member Author

@non you had a comment on this in a previous incarnation which I didn't fully grok ... do you want to elaborate on it here?

@milessabin milessabin self-assigned this Sep 6, 2014
@non

non commented Sep 6, 2014

So, it was a somewhat weird thing. I noticed that unicode primes (1-3) were already allowed:

val xʹ = 999
val xʺ = false
val x‴ = "this is a cat"

These currently work. Since Scala considers → and -> to be the same, I was wondering if we wanted to unify identifiers like foo'' and fooʺ?

This is definitely not necessary, just something I was thinking about.

@propensive

I'd give that a largely indifferent +1.

Binary compatibility says we should encode ''' as if the user had typed ‴, though, which I imagine is the more difficult way round to implement.

@non

non commented Sep 6, 2014

Yeah honestly I'm not convinced my point is even worth taking, since (AFAIK) no one uses unicode primes.

@non

non commented Sep 6, 2014

We could transform fooʺ to foo'' during compilation under a flag. It's definitely not worth the trouble on its own, but could be part of a larger "smooth out some rough edges" flag. For now let's ignore my comments and merge this though. 👍

@milessabin
Member Author

WfM ... who wants to do the honours?

non added a commit that referenced this pull request Sep 6, 2014
Allow primes at the end of identifiers.
@non non merged commit e89ac28 into typelevel:2.11.x Sep 6, 2014
object a1 {
def b(c: Char) = ???

a1 b 'c'

scalac would parse a1 b'c' as a1.b('c'), while the Typelevel compiler would parse it as a1.b'(c'). Not sure if this is something permitted under your compatibility guidelines, but you should probably warn about this (or require a language import).
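
For anyone following along, here is a small self-contained sketch of the stock behaviour described above. The body of `b` and the `Demo` object are made up for illustration; only `a1`, `b` and the two call forms come from the test file. Under stock scalac both calls should tokenise as `a1`, `b`, `'c'`.

```scala
object a1 {
  // Hypothetical body for illustration; the test file only uses ???
  def b(c: Char): String = s"got $c"
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Infix call with a space before the char literal.
    val spaced = a1 b 'c'
    // Stock scalac ends the identifier at the quote, so this is also a1.b('c').
    val unspaced = a1 b'c'
    println(spaced == unspaced)
  }
}
```

With primed identifiers enabled, the second call would instead be read as the identifier `b'` followed by `c'`, which is the difference in question.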

a1 b'c' parsing correctly sort of strikes me as a bug, though? Shouldn't it require a space?

Good catch, Iulian! It's also an issue with symbol literals, unsurprisingly.

Regardless of whether it's considered a bug (in Scalac) or not, it's a
difference, so we'll need to put it under a -Z flag.

Given there are likely to be several very subtle parsing differences like
this -- which in all likelihood don't affect anyone that much -- would it
make sense to group them under a single -Z flag?


Jon Pretty | @propensive

@Nami-Doc it's not a bug, according to the grammar in the SLS. Generally, space is not required between tokens (that's why you can write foo(x) without any space between the 4 tokens).

@propensive a -Z flag would work, but why not an import language.typelevel.syntax, or something like that?

@dragos language flags work by using implicits, they're not in scope during the parser (i.e. it hasn't even gotten to typing, yet) - we need to figure out a solution for this.

We've only got my/Martin's suggestion of passing both possible parsings
through to the typer, somehow. But I can't imagine how this could be any
less terrible than it sounds.

Well, I'm pretty certain that ain't gonna happen.

Notice that the name is MutableSettings, though. What if the Parser were to... mutate them? 👿

For the record, the name MutableSettings was intended as documentation, not as guidance.

@puffnfresh spot on about imports... too bad.

@non

non commented Sep 19, 2014

So, I will try to fix this up so that primes only parse as primes in cases where there is no ambiguity with char literals, etc. If we can't find a way to avoid these situations then a -Z flag is probably called for.

@puffnfresh

@non will only allowing quotes at the end clear up the ambiguity?

@non

non commented Sep 19, 2014

@puffnfresh Well I don't think so. @dragos's example is a b'c'. In order to maintain source compatibility we must parse that as a.b('c').

I need to read the spec and look at the parsing rules to figure out the best rationale for preferring that interpretation over a.b'(c'), but an easy rule would be that identifiers-with-primes must be followed by a non-word character or a newline.

@propensive

Primes are only allowed at the end...

(or am I misinterpreting?)

@non

non commented Sep 19, 2014

Basically we need to ensure that anything that might have involved single quotes previously cannot now be mistaken for a primed-identifier. Here are some possible examples:

  • a b'c' needs to parse as a.b('c') not a.b'(c')
  • a b'c needs to parse as a.b('c) not a.b'(c)
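
A rough sketch of the "must be followed by a non-word character or a newline" rule, checked against the two cases above. This is not the scanner's actual code: `PrimeLookahead`, `isWordChar`, `endOfPrimes` and `identifierAt` are hypothetical names, and the sketch ignores escapes such as `'\''`, operator identifiers and backquoted identifiers.

```scala
object PrimeLookahead {
  private val Quote = '\''

  // Word characters for this sketch: letters, digits, '_' and '$'.
  private def isWordChar(c: Char): Boolean =
    c.isLetterOrDigit || c == '_' || c == '$'

  /** Given the index just past a plain identifier, returns the index just
    * past any trailing quotes that should count as primes. A run of quotes
    * followed directly by a word character is left alone, so it can still
    * start a char or symbol literal.
    */
  def endOfPrimes(src: String, identEnd: Int): Int = {
    var i = identEnd
    while (i < src.length && src.charAt(i) == Quote) i += 1
    val followedByWord = i < src.length && isWordChar(src.charAt(i))
    if (followedByWord) identEnd else i
  }

  /** Reads the identifier (with any primes) starting at `start`. */
  def identifierAt(src: String, start: Int): String = {
    var i = start
    while (i < src.length && isWordChar(src.charAt(i))) i += 1
    src.substring(start, endOfPrimes(src, i))
  }
}
```

Under these assumptions `identifierAt("b'c'", 0)` and `identifierAt("b'c", 0)` both stop at `b`, leaving `'c'` and `'c` for the literal scanner, while `identifierAt("l''.reverse", 0)` keeps both primes.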

@puffnfresh

@propensive @non that sounds right, thanks.

@som-snytt

Is 'abc' a Symbol("abc'")? I guess so, but it might be worth a test.

And I guess if I can't make a symbol literal with backquoted idents, I can't make a primed backquoted ident, only plain ids.

Edit: it's a couple chars' lookahead for a b'c' but a b'cdef' is less friendly.

@non

non commented Sep 19, 2014

I think for any kind of coherence we must forbid symbols from having primes.

Seems like some clean up work is needed on this already-merged primes feature, eh? ;)

@som-snytt

If it were discriminating, would it accept *'?

What about emoji operators? 💩 prime?

@non

non commented Sep 19, 2014

Probably no and yes. I will write up the formal rules and get back to you though.

@milessabin
Member Author

I think requiring a -Z flag is probably a saner option than further complicating the grammar.

@soc

soc commented Sep 19, 2014

Given the rippling effects, what's the actual benefit except "that's what Haskell does" which trumps those issues?

@propensive

The benefit of having primes at all?

The benefit is not having to think up new names for an identifier. My usual solution to this problem is to append a 2, then a 3, etc., to the identifier name, though for some reason primes seem like a neater solution than using numbers.


@soc

soc commented Sep 19, 2014

From my point of view, introducing yet another way to do fundamentally the same thing would be enough of a reason to not go down that road any further. If Typelevel doesn't plan to disallow numbers at the end of identifiers, I can hardly see the point, especially given the accidental complexity.

@som-snytt

There has to be a really excellent use for _'!

puffnfresh pushed a commit to puffnfresh/scala that referenced this pull request Jul 30, 2015
These methods are "signature polymorphic", which means that the compiler
should not:
  1. adapt the arguments to `Object`
  2. wrap the repeated parameters in an array
  3. adapt the result type to `Object`, but instead treat it as if
     it already conforms to the expected type.

Dispiritingly, my initial attempt to implement this touched the type
checker, uncurry, erasure, and the backend.

However, I realized we could centralize handling of this in the typer
if at each application we substituted the signature polymorphic
symbol with a clone that carried its implied signature, which is
derived from the types of the arguments (typechecked without an
expected type) and position within an enclosing cast or block.

The test case requires Java 7+ to compile so is currently embedded
in a conditionally compiled block of code in a run test.

We ought to create a partest category for modern JVMs so we can
write such tests in a more natural style.

Here's how this looks in bytecode. Note the `bipush` / `istore`
before/after the invocation of `invokeExact`, and the descriptor
`(LO$;I)I`.

```
% cat sandbox/poly-sig.scala && qscala Test && echo ':javap Test$#main' | qscala
import java.lang.invoke._

object O {
  def bar(x: Int): Int = -x
}

object Test {
  def main(args: Array[String]): Unit = {
    def lookup(name: String, params: Array[Class[_]], ret: Class[_]) = {
      val lookup = java.lang.invoke.MethodHandles.lookup
      val mt = MethodType.methodType(ret, params)
      lookup.findVirtual(O.getClass, name, mt)
    }
    def lookupBar = lookup("bar", Array(classOf[Int]), classOf[Int])

    val barResult: Int = lookupBar.invokeExact(O, 42)
    ()
  }
}

scala> :javap Test$#main
  public void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC
    Code:
      stack=3, locals=3, args_size=2
         0: aload_0
         1: invokespecial #18                 // Method lookupBar$1:()Ljava/lang/invoke/MethodHandle;
         4: getstatic     #23                 // Field O$.MODULE$:LO$;
         7: bipush        42
         9: invokevirtual #29                 // Method java/lang/invoke/MethodHandle.invokeExact:(LO$;I)I
        12: istore_2
        13: return
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      14     0  this   LTest$;
            0      14     1  args   [Ljava/lang/String;
           13       0     2 barResult   I
      LineNumberTable:
        line 16: 0
}
```

I've run this test across our active JVMs:

```
% for v in 1.6 1.7 1.8; do java_use $v; pt --terse test/files/run/t7965.scala || break; done
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-466.1-11M4716)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-466.1, mixed mode)
Selected 1 tests drawn from specified tests

.

1/1 passed (elapsed time: 00:00:02)
Test Run PASSED
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
Selected 1 tests drawn from specified tests

.

1/1 passed (elapsed time: 00:00:07)
Test Run PASSED
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
Selected 1 tests drawn from specified tests

.

1/1 passed (elapsed time: 00:00:05)
Test Run PASSED
```
milessabin pushed a commit that referenced this pull request Aug 12, 2016
This corrects an error in the change to the trait encoding
in scala#5003: getters in traits should have empty bodies and
be emitted as abstract.

```
% ~/scala/2.12.0-M4/bin/scalac sandbox/test.scala && javap -c T
Compiled from "test.scala"
public interface T {
  public abstract void T$_setter_$x_$eq(int);

  public int x();
    Code:
       0: aload_0
       1: invokeinterface #15,  1           // InterfaceMethod x:()I
       6: ireturn

  public int y();
    Code:
       0: aload_0
       1: invokeinterface #20,  1           // InterfaceMethod y:()I
       6: ireturn

  public void y_$eq(int);
    Code:
       0: aload_0
       1: iload_1
       2: invokeinterface #24,  2           // InterfaceMethod y_$eq:(I)V
       7: return

  public void $init$();
    Code:
       0: aload_0
       1: bipush        42
       3: invokeinterface #29,  2           // InterfaceMethod T$_setter_$x_$eq:(I)V
       8: aload_0
       9: bipush        24
      11: invokeinterface #24,  2           // InterfaceMethod y_$eq:(I)V
      16: return
}

% qscalac sandbox/test.scala && javap -c T
Compiled from "test.scala"
public interface T {
  public abstract void T$_setter_$x_$eq(int);

  public abstract int x();

  public abstract int y();

  public abstract void y_$eq(int);

  public static void $init$(T);
    Code:
       0: aload_0
       1: bipush        42
       3: invokeinterface #21,  2           // InterfaceMethod T$_setter_$x_$eq:(I)V
       8: aload_0
       9: bipush        24
      11: invokeinterface #23,  2           // InterfaceMethod y_$eq:(I)V
      16: return

  public void $init$();
    Code:
       0: aload_0
       1: invokestatic  #27                 // Method $init$:(LT;)V
       4: return
}
```
@Blaisorblade

Since this was discussed again, for the record: I've tested this in 2.12.0

val xʹ = 999
val xʺ = false
val x‴ = "this is a cat"

and the third fails now. In fact, the only primes that work (at least in 2.12.0) are U+02B9 and U+02BA (MODIFIER LETTER PRIME and DOUBLE PRIME), but not the real PRIME characters (U+2032-U+2034, U+2057).

If anybody else cares, my 2 cents: please either use the working primes, or re-add support for the other ones, instead of trying to change what ' does.
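
One possible explanation, sketched below, is the Unicode general category of each character. This assumes the scanner's identifier check is roughly `Character.isUnicodeIdentifierPart`, which is an assumption about scalac internals and not something confirmed in this thread; `PrimeCategories` and its members are hypothetical names.

```scala
object PrimeCategories {
  // Code points mentioned in this thread.
  val candidates: Seq[(String, Char)] = Seq(
    "MODIFIER LETTER PRIME"        -> '\u02B9',
    "MODIFIER LETTER DOUBLE PRIME" -> '\u02BA',
    "PRIME"                        -> '\u2032',
    "DOUBLE PRIME"                 -> '\u2033',
    "TRIPLE PRIME"                 -> '\u2034',
    "QUADRUPLE PRIME"              -> '\u2057'
  )

  // Maps the JDK's numeric category to its two-letter Unicode abbreviation
  // for the two categories that matter here.
  def category(c: Char): String = {
    val t = Character.getType(c)
    if (t == Character.MODIFIER_LETTER.toInt) "Lm"
    else if (t == Character.OTHER_PUNCTUATION.toInt) "Po"
    else s"other ($t)"
  }

  def main(args: Array[String]): Unit =
    for ((name, c) <- candidates) {
      val codePoint = "U+%04X".format(c.toInt)
      val identPart = Character.isUnicodeIdentifierPart(c)
      println(s"$codePoint $name: category=${category(c)}, identifierPart=$identPart")
    }
}
```

The two modifier letters are in category Lm and count as identifier parts, while U+2032 to U+2034 and U+2057 are in Po and do not, which would be consistent with the behaviour reported above.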

10 participants