Skip to content

Override getOrElse in Map1, Map2, Map3 and Map4 to avoid allocations. #6138

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 15, 2018
Merged

Override getOrElse in Map1, Map2, Map3 and Map4 to avoid allocations. #6138

merged 2 commits into from
Jan 15, 2018

Conversation

id-ilych
Copy link
Contributor

I'm not sure if tests are needed and where to put them if they are...

@scala-jenkins scala-jenkins added this to the 2.12.5 milestone Oct 16, 2017
@id-ilych
Copy link
Contributor Author

@adriaanm Hello. I'm not sure if you are the person to address to but readme says that you are associated with 'infrastructure' so here's the issue:
I did very small change, ran sbt test locally and it passed successfully. But then PR fails on validate-test stage and the only information I could get from the log is that fail happened in something called "partest pos neg jvm". How can I find out what exactly is failing and reproduce that locally to see how's that affected by my changes?

@hrhino
Copy link
Contributor

hrhino commented Oct 17, 2017

partests are the integration tests; you can run them locally by saying partest <category> (for category in pos/neg/run/jvm & a few others) or partest test/files/path/to/test to run a specific one.

If you open the full build log (accessible by clicking "Details" on validate-test at the bottom of the page) and search for !! (which is what partest spits out on failed tests) you can see that the failure is run/t8549:

!! 1632 - run/t8549.scala                           [non-zero exit code]
##### Log file '/home/jenkins/workspace/scala-2.12.x-validate-test/test/files/run/t8549-run.log' from failed test #####

warning: there were two deprecation warnings (since 2.11.0); re-run with -deprecation for details
java.io.InvalidClassException: scala.collection.immutable.Map$EmptyMap$; local class incompatible: stream classdesc serialVersionUID = -5626373049574850357, local class serialVersionUID = 7274227957760111274

This test is testing that the small Maps are serializable using java's default serialization, and that the serialization format doesn't change. Your change shouldn't have any impact on actual serializability, but Java's default serialVersionUID algorithm hashes declared non-private methods as well(!), so this change breaks stability of serialization.

I'm pretty sure that you can fix this by putting @serialVersionUID(...) (with the current value) on these classes to pin the SVUID to its current value. Someone else will come by and confirm or deny that that's acceptable.

@id-ilych
Copy link
Contributor Author

@hrhino Wow! Thank you very much for such detailed explanation. Will try that attribute. Though I'm somewhat scared with unforeseeable (for me) consequence of a very small changes like that and it's awesome that there were tests for that. My guess is that I should add serialization tests for MapN and fix their SVUID as well.

@id-ilych
Copy link
Contributor Author

@hrhino I added tests (green on unmodified codebase) and then applied my changes along with @SerialVersionUID annotations. t8549 test is green but now another binary compatibility test fails with
field serialVersionUID in class scala.collection.immutable.Map#Map2 does not have a correspondent in other version (same for other MapX). How should I go about it now?

@hrhino
Copy link
Contributor

hrhino commented Oct 19, 2017

That's the MiMa check failing. In general, Scala provides a strong binary compatibility guarantee for the standard library. This means that any code compiled against version 2.12.x of the library can be run with version 2.12.y of the library on the classpath, and still work. Adding or removing public fields or methods breaks that guarantee, since a user could call/access that field/method and potentially receive a runtime error.

In this case, the problem is that @serialVersionUID adds a magical public static final field called serialVersionUID to the class, which is how Java serialization knows what version to use. Theoretically, this means that a Java caller could say something like

long foo = scala.collection.immutable.Map.Map1.serialVersionUID;

which would compile against 2.12.5 but break if run with 2.12.4 on the classpath (as that field is not there).

If you think that's a bit silly in this case, you're not alone: serialVersionUID was recently deemed insufficiently risky to enforce the binary compatibility policy on, as part of #6101. A maintainer might agree with me, in which case you'd add similar lines to that file:

ProblemFilters.exclude[MissingFieldProblem]("scala.collection.immutable.Map#Map1.serialVersionUID")
ProblemFilters.exclude[MissingFieldProblem]("scala.collection.immutable.Map#Map2.serialVersionUID")
ProblemFilters.exclude[MissingFieldProblem]("scala.collection.immutable.Map#Map3.serialVersionUID")
ProblemFilters.exclude[MissingFieldProblem]("scala.collection.immutable.Map#Map4.serialVersionUID")

and the build should go green again. You can verify this locally by running library/mimaReportBinaryIssues in sbt.

Unrelatedly, the file CONTRIBUTING.md contains useful information about the PR process and other tidbits. It's recommended that PRs usually be one commit long before merging. You can use git rebase -i to squash your commits together. It's cool to see that the new tests pass before the change, though! 👍

@id-ilych
Copy link
Contributor Author

id-ilych commented Oct 19, 2017

@hrhino And again - thank you for detailed explanation :) Will definitely read CONTRIBUTING.md more thoroughly.

Will add these (and EmptyMap) to ProblemFilters and hope mainteners will agree.

I already rebased my branch to have new tests before any changes to the code. The reason is that I thought it would be easier for reviewer to verify that these serialization tests are formed before any changes to serialized classes and therefore changes that pass tests too are really compatible. So I kept multiple commits intentionally to simplify code review in hope that maintainer will use 'squash and merge' button (https://help.github.com/articles/merging-a-pull-request/).

hrhino added a commit to hrhino/scala that referenced this pull request Oct 19, 2017
The JVM serialization code doesn't care either way, and
leaving them as public leads to annoying (but mostly harmless)
MiMa errors if the annotation is added. Since the generated
field is synthetic, it can only be referenced from Java, so
privatizing it is unlikely to break anyone's use of it, or
so I'd hope.

See scala#6101 and scala#6138 for examples of places
where this has caused troubles before.
hrhino added a commit to hrhino/scala that referenced this pull request Oct 19, 2017
The JVM serialization code doesn't care either way, and
leaving them as public leads to annoying (but mostly harmless)
MiMa errors if the annotation is added. Since the generated
field is synthetic, it can only be referenced from Java, so
privatizing it is unlikely to break anyone's use of it, or
so I'd hope.

See scala#6101 and scala#6138 for examples of places
where this has caused troubles before.

Also, because the diff was too small otherwise, have a warning
for putting the annotation where it cannot possibly do any good.
No one asked for it but me, but maybe it'll save someone's skin,
at some time in the future.
@id-ilych
Copy link
Contributor Author

review @Ichoran

@scala-jenkins scala-jenkins requested a review from Ichoran October 19, 2017 21:45
@Ichoran
Copy link
Contributor

Ichoran commented Oct 19, 2017

@id-ilych - I want to run some quick microbenchmarks to make sure that multiple dispatch on these cases doesn't eat up more time on some reasonable use-cases than it saves by providing these options. Ping me in a couple days if I haven't gotten around to it (I've probably just forgotten).

@id-ilych
Copy link
Contributor Author

@Ichoran Hi! How is it going?

@id-ilych
Copy link
Contributor Author

@Ichoran ping

@id-ilych
Copy link
Contributor Author

id-ilych commented Nov 2, 2017

@Ichoran Is there anybody there? 😕

@Ichoran
Copy link
Contributor

Ichoran commented Nov 2, 2017

@id-ilych - Sorry! I will try to get to it tomorrow or over the weekend. (I did forget; I apologize. And then was too busy.)

@id-ilych
Copy link
Contributor Author

id-ilych commented Nov 8, 2017

@Ichoran Hi! Any news?

@smarter
Copy link
Member

smarter commented Nov 8, 2017

I want to run some quick microbenchmarks to make sure that multiple dispatch on these cases doesn't eat up more time on some reasonable use-cases than it saves by providing these options.

On a related note, it might be worth exploring simply getting rid of Map2/Map3/Map4, Guava had to do something similar to avoid the cost of megamorphic dispatch: google/guava#1268

@Ichoran
Copy link
Contributor

Ichoran commented Nov 8, 2017

@id-ilych - I am sorry; I'm far too overcommitted. I have to finish up some work for the strawman collections (that I didn't quite get done over the weekend), and then I'll do this as soon as I can.

@id-ilych
Copy link
Contributor Author

id-ilych commented Nov 8, 2017

@Ichoran No problem. I'm in no hurry. Just wanted to be sure this PR is not forgotten. Will ping you again in two weeks then. No to be too annoying :)

@id-ilych
Copy link
Contributor Author

@Ichoran Hi! Any news?

@Ichoran
Copy link
Contributor

Ichoran commented Dec 18, 2017

Sorry, haven't had time. Various time-critical things came up (both work-related and not) and some are still not resolved. Realistically it'll probably be another week at the earliest.

Copy link
Member

@retronym retronym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, in particular the changes to test that this is serialization compatible with 2.12.4

@retronym retronym merged commit 236b12b into scala:2.12.x Jan 15, 2018
@id-ilych id-ilych deleted the mapx_getorelse branch January 15, 2018 15:58
@SethTisue
Copy link
Member

thanks @id-ilych for the fix, and for your patience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants