Very slow updates when running community build locally #408
Does the problem go away if you downgrade the sbt version in
- sbt-version: "0.13.13"
+ sbt-version: "0.13.12"
I was able to reproduce the problem locally on my laptop at Lightbend SF, and downgrading sbt fixes it, so I assume you'll see the same thing as well.
what's interesting also is that I see the same effect (slow with 0.13.13, fast with 0.13.12) even when running a local Artifactory. so apparently this has nothing to do with anything on scala-ci.typesafe.com
@eed3si9n @dwijnand @cunei can you think of a reason why sbt 0.13.13 would resolve things 100x slower than 0.13.12? |
I can't think or see anything obvious from the release notes. Would require more investigation. |
is there a nightly build of sbt 0.13.14 I can try, on the off chance whatever's going on has already been fixed? |
Looks like our nightly builds are broken.. https://repo.scala-sbt.org/scalasbt/ivy-snapshots/org.scala-sbt/sbt/0.13.14-20161128-062537/jars/ |
because of the performance regression under investigation at scala#408
A mysterious thing about this is that it's fast on Jenkins. So there's one set of conditions under which performance is fine. Until we have a fix or workaround, maybe we can just downgrade the whole community build to sbt 0.13.12. Test run of that: https://scala-ci.typesafe.com/job/scala-2.12.x-integrate-community-build/988/consoleFull |
I've been working on (and this ticket is based on) the 2.12.0 branch which uses 0.13.12. |
Ugh, perhaps I've made a mistake here where I thought something was fast because I'd downgraded sbt, but it was actually fast for a different reason? Dammit. |
Maybe Dbuild caches something under some circumstances and changing the parameter invalidates the cache? |
The problem seems to be specific to running Taking
and so on, 43 lines total. But in a dbuild context, dbuild injects stuff into the build definition, plus for some reason most lines are doubled, so (even with dbuild 0.9.7-RC1, which I tried to see if it helped, and it didn't) we see:
and so on, for a whopping 442 lines. each one takes time.
@cunei any insight into why the number of artifacts that need to be resolved would blow up tenfold like that...? is it just all of the transitive dependencies of the stuff that dbuild injects into every build? (but, what about the doubling aspect?)
Also, I'm still confused about whether there's been a regression here. I don't think I remember seeing 10-15 minute pauses before, but it's hard for me to be sure. Once a two-minute pause becomes normal, it becomes normal for me to turn my attention to something else, so then I'm not sure if I would notice that something took 10 minutes instead of two.
I don't think the tenfold increase in the amount of resolution is the only factor at work: it accounts for one order of magnitude of slowdown, but the slowdown here is on the order of 100x, not 10x. I haven't a clue where the other 10x is going.
Using a local Artifactory does help, but not as much as you'd hope, since those 442 "Resolving..." lines still each take time.
A lot of my wordings here are pretty vague ("does help, but not as much as you'd hope", talk about orders of magnitude) because even just gathering data on this, in the presence of different kinds of caching, isn't easy (as my mistake on the sbt version being responsible shows).
Sorry for all the rambling, but I don't know what to do at this point besides ramble, hope somebody else has some insight, and/or give up.
|
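For anyone trying to gather comparable numbers, here is a rough sketch of how the "Resolving..." lines in a captured log could be counted and checked for doubling. It assumes lines of the form `[info] Resolving <org>#<name>;<version> ...`; the function name `summarize_resolving` and the sample log are made up for illustration and are not taken from a real run.

```python
import re
from collections import Counter

# Assumption: each resolution is logged as "Resolving <org>#<name>;<version> ...".
RESOLVING = re.compile(r"Resolving\s+(\S+#\S+;\S+)")

def summarize_resolving(log_text):
    """Return (total lines, distinct modules, modules seen more than once)."""
    modules = [m.group(1) for m in RESOLVING.finditer(log_text)]
    counts = Counter(modules)
    repeated = {mod: n for mod, n in counts.items() if n > 1}
    return len(modules), len(counts), repeated

# Hypothetical sample log, for illustration only.
sample = """\
[info] Resolving org.example#lib-a;1.0 ...
[info] Resolving org.example#lib-a;1.0 ...
[info] Resolving org.example#lib-b;2.3 ...
"""

total, distinct, repeated = summarize_resolving(sample)
print(total, distinct, repeated)
```

Running the same summary on a stand-alone sbt log and on a dbuild log would show how much of the 43-to-442 jump is doubling and how much is genuinely new modules.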
@cunei in the 442-line "Resolving..." list, a lot of the lines look funny to me:
those don't look like normal dependencies with normal version numbers, is it possible they're a sign of some bug in dbuild...? |
maybe #410 will help |
Unfortunately I've not been able to do a full run overnight. My successful runs have taken more than a day and a half, a ton of which is waiting on network IO. |
I appreciate that you have limited time, but you shouldn't underestimate the benefits of people being able to run community builds without having to come through a central bottleneck ... I'd like to see community builds being used widely for releases of all sorts, not just scala/scala. |
I just found a bug in sbt 0.13.13 that might explain this - sbt/sbt#2851 |
Would that also affect 0.13.12? |
Looking at it a bit more closely, since sbt/sbt#2851 seems to break only the build level keys, it might not be the cause of this slowdown. |
@SethTisue I am looking into it. The versions for org.apache#apache seem correct, those are real version numbers: http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.apache%22%20AND%20a%3A%22apache%22 |
@milessabin is it only the first project built where the update on the build definition takes so long...? or is it every project? (my current suspicion is that it's only actually slow once per dbuild run, in which case this might not be quite as bad a problem as we feared?) |
It's every project, at least at the top level. So, 10+ min hang for each of the thirty odd projects. |
@milessabin I have a few suggestions.
All of the necessary remote repositories should be added as remotes within Artifactory, rather than being listed in the "resolvers" file. When that is done, dbuild can be called using the " |
@SethTisue The duplicate lines during resolution appear during a stand-alone sbt resolution, so it is not something dbuild-specific. Concerning the large number of downloaded dependencies, I am currently checking whether that is due to the additional plugin injected by dbuild, or some other factors. |
@cunei isn't that meant to be
|
@cunei thanks for that, very helpful. I'll try setting up a local artifactory and report back. You say,
Could shared vs. individual .ivy2 repositories be made a configurable option? |
@milessabin Unfortunately that would not work; within dbuild there may be bizarre situations in which you have identically named artifacts which are actually different; they are internally maintained by storing them into a specially built artifacts store, in order to keep them apart. They are then "rematerialized" in various directories during build as needed, in order to keep Ivy happy. So, a single cache won't work very well. |
@dwijnand Right on both counts. For me it's usually "~/.sbt/re<TAB!>", so I got the name wrong. 👍 |
@cunei you would think so, but in my experience, it's still pathologically slow even with a local Artifactory |
I dunno, perhaps both the Artifactory on scala-ci.typesafe.com, and my local Artifactory, are misconfigured in the same way, and that's ruining performance in both scenarios? I checked and my Artifactory has "Unused Artifacts Cleanup Period" set to empty, which should mean it never discards old artifacts. What about "Metadata Retrieval Cache Period" — currently defaulted to 600 seconds. Maybe I should try setting that to a much longer value, like 24 hours or something? |
It might also help to tell Artifactory which artifacts we expect to find on which remote resolvers. (A piece of advice I found online: "I highly recommend adding include/exclude rules on the remote repositories for the list of groupId/artifacts you are expecting. This will tell Artifactory to completely stop sending request outside for irrelevant artifacts") I'm more and more convinced this must be an Artifactory config issue. Ivy is hanging on |
@cunei but is it possible to use a single cache only when resolving dependencies for build definitions? we don't build sbt plugins in the community build anymore, so these are all just normal Scala 2.10 (or plain old Java) binary artifacts. we don't need, or want, any special dbuild magic to happen with them. |
@SethTisue dbuild rewrites version numbers and artifact names as it works, and it starts with a fresh ivy2 cache per project. Once each project has been compiled, you would have to go hunting what the local build created, and in which way it polluted the ivy2 cache, in order to clean it and prepare it for a new project; it would be a bit tricky. |
On my system and my local Artifactory, using dbuild, I see resolutions like:
So, definitely fast. |
I can confirm that a local Artifactory fixes this problem. I think it would be a good idea to guide people in that direction from the outset. |
hmm, not for me. can you attach your config? (Artifactory can dump it out as XML) |
@SethTisue On my instance, the values for "Metadata Retrieval Cache Period" all default to 43200 (12 hours). I suspect that may be related to the "slow-sometimes-then-fast-again" behavior that I sometimes observe, and to the fact that it's mostly slow in your case. Not sure that is the answer, but it seems worth trying. Also, I'm on Artifactory 4.x, in case that makes a difference. |
My local Artifactory configuration is here ... suggestions for improvements most welcome. |
I thiiiiiiiiink that jacking up retrievalCachePeriodSecs to a big number (I picked 86400 = 24 hours) may have solved my performance problems. As in the first project built during first run of the day is still slow (because expired metadata on everything), but then it gets faster from then on. I'll keep observing the behavior to try to be more sure. |
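To illustrate why the cache period could matter so much, here is a toy model of a single metadata entry behind a simple TTL cache. It is a sketch under loud assumptions: it is not how Artifactory is known to behave internally, the function `remote_fetches` is hypothetical, and the schedule of one request every 15 minutes across 30 projects is invented.

```python
def remote_fetches(request_times, ttl_secs):
    """Count upstream fetches for ONE metadata entry under a simple TTL cache.

    request_times: ascending times (in seconds) at which the entry is requested.
    The entry is refetched when it is absent or at least ttl_secs old.
    """
    fetches = 0
    fetched_at = None
    for t in request_times:
        if fetched_at is None or t - fetched_at >= ttl_secs:
            fetches += 1
            fetched_at = t
    return fetches

# Hypothetical schedule: the same metadata is requested once per project,
# for 30 projects, with one project starting every 15 minutes.
times = [i * 15 * 60 for i in range(30)]

short = remote_fetches(times, 600)    # 10-minute cache period
long_ = remote_fetches(times, 86400)  # 24-hour cache period
print(short, long_)
```

In this model the 600-second period expires before every project, so every project pays the upstream round trip, while the 24-hour period pays it once. That would be consistent with "first project of the day is slow, then it gets faster", though it does not prove that is the actual cause.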
My first successful 2.12.x branch community build after setting up a local Artifactory (so, from cold) took about 3.5 hours. Is that comparable to the sort of times you're seeing? |
I haven't done a cold timing recently, but 3.5 hours sounds very good — faster than it runs on Jenkins. I did some work on this over the weekend — starting with a fresh local Artifactory config with only Maven Central, and then:
That, in combination with a nice long "Metadata Retrieval Cache Period" on all resolvers, seems to have fixed my own performance problems to my own satisfaction. This is a little different than what Miles did, where his local Artifactory just retrieves everything from the Artifactory on scala-ci. I took a different approach because my goal is for my new config to eventually replace the one on scala-ci, and hopefully that will result in improved performance for everybody, including our Jenkins runs, and people who are running locally using our Artifactory, and people like Miles who have a local Artifactory that points to ours. But in order to make that happen, I'd have to upgrade the scala-ci Artifactory to a newer version, otherwise the XML config doesn't carry over (and I don't want to keep multiple versions in sync by hand). So, leaving this ticket open, but it is now fixed at least to the extent that:
I plan to return to this in 2017. |
fwiw, locally I'm on Artifactory 4.14.3 now and my current config (which does not use the scala-ci Artifactory for anything) is https://gist.github.com/SethTisue/04dfcd01480882e4cf359d8e60e3f054 |
@SethTisue I'm tempted to go with your config. Is there any reason why I ought to prefer chaining off the scala-ci Artifactory rather than going direct to the upstream repos as you've done? |
@milessabin maintenance: chaining is set-and-forget, whereas the real list of resolvers needs updating roughly once a month. but having your own config gives you more control, which might matter to you if (as I do) you use it for everything you do, not just for the community build |
That sounds reasonable. However I'm not convinced my current set up makes that as smooth as it might be. I ended up chaining several repos from the scala-ci Artifactory and then adding a couple more direct ones. Presumably I could chain them all from scala-ci. But then I need to know which scala-ci repos I ought to be chaining. Would it be possible to have scala-ci advertise a single (virtual?) repository which would do the whole job? |
Probably, since that's what I have in my local config — a single virtual repository, with two entries in my |
Yeah, fumbling around here too :-) |
TIL about the "Eagerly Fetch Jars" option. it isn't on by default, and I didn't have it on in my local config (I do now). it seems plausible that turning it on for all of the remote repos could help. (we already have it enabled in the scala-ci Artifactory) |
I've added my local config to the repo: https://github.com/scala/community-builds/blob/2.12.x/artifactory.xml |
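For anyone adapting an exported config, a quick check along these lines could flag remote repositories whose metadata cache period is still short. This is only a sketch: `retrievalCachePeriodSecs` is the setting named earlier in this thread, but the surrounding element names (`remoteRepository`, `key`), the namespace, and the sample XML are assumptions made for illustration, so compare them against your own export before relying on it.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]

def short_cache_periods(xml_text, minimum_secs):
    """Return (key, period) for repositories whose period is below minimum_secs."""
    root = ET.fromstring(xml_text)
    found = []
    for repo in root.iter():
        # Assumption: each remote repository is a <remoteRepository> element.
        if local_name(repo.tag) != "remoteRepository":
            continue
        key, period = None, None
        for child in repo:
            name = local_name(child.tag)
            if name == "key":
                key = child.text
            elif name == "retrievalCachePeriodSecs":
                period = int(child.text)
        if period is not None and period < minimum_secs:
            found.append((key, period))
    return found

# Hypothetical config fragment, not copied from any real export.
sample = """<config xmlns="http://example.invalid/hypothetical">
  <remoteRepositories>
    <remoteRepository>
      <key>central</key>
      <retrievalCachePeriodSecs>600</retrievalCachePeriodSecs>
    </remoteRepository>
    <remoteRepository>
      <key>other</key>
      <retrievalCachePeriodSecs>86400</retrievalCachePeriodSecs>
    </remoteRepository>
  </remoteRepositories>
</config>"""

flagged = short_cache_periods(sample, 43200)
print(flagged)
```

Matching on local names rather than a fixed namespace is a deliberate choice here, since the namespace may differ between Artifactory versions.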
seems unlikely anyone will pursue this further. having an Artifactory config in the repo for people to use is fine. not that many people want to do local runs; it's fine to use Jenkins most of the time. |
I think scala/scala-dev#720 and 0a06655 may be some help here... don't know how much |
(scala/scala-dev#720 cut the number of virtual repos from 3 to 2, by eliminating dbuild-unchecked, but I didn't try to go all the way to 1 in this round) |
When running the community build locally I see very long (>= 10 mins) pauses coinciding with log output of the form,
which I believe corresponds to an sbt update in the project about to be built. During these pauses I can see that the community build process is blocked on a socket read from,
which corresponds to https://scala-ci.typesafe.com/. Is it possible that this machine is seriously overloaded? Or that it has limited connectivity with Europe?