Skip to content

spawnUri load Dart files asynchronously #12617

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nex3 opened this issue Aug 21, 2013 · 25 comments
Closed

spawnUri load Dart files asynchronously #12617

nex3 opened this issue Aug 21, 2013 · 25 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. closed-duplicate Closed in favor of an existing report library-isolate type-enhancement A request for a change that isn't a bug

Comments

@nex3
Copy link
Member

nex3 commented Aug 21, 2013

Right now, spawnUri blocks the entire process while loading the Dart files necessary to start the isolate. This is problematic not only because spawnUri is performing IO, which is otherwise always done asynchronously in Dart, but because it means that neither the library passed to spawnUri nor any of its transitively imported libraries can be HTTP URLs pointed back at the spawning process.

@nex3
Copy link
Member Author

nex3 commented Aug 26, 2013

It's worth noting that an API that returns Future<SendPort> would be desirable for this, rather than API that returns SendPort and buffers messages internally. If the API returns a future, it's possible for a caller to determine when [spawnUri] is finished doing file IO, so it can e.g. clean up temp files.

@floitschG
Copy link
Contributor

We are discussing making the spawning an asynchronous operation.

I don't see what cleanup the spawner can do. I don't think the VM gives any guarantees that it won't do any file IO anymore once an isolate has started. At the moment the VM loads all files synchronously, but it doesn't need to. Libraries that are prefixed, for example, can be loaded on demand (by the VM). This would speed up the startup.

@nex3
Copy link
Member Author

nex3 commented Aug 28, 2013

In our case, the only file we'd need to delete is the entrypoint file. In general, though, having the file-loading behavior be lazy at the discretion of the VM seems like it would make it difficult to reason about in dangerous ways. If I start running a test script, for example, I should be able to safely start editing files as soon as I see that the script has emitted output. If some file is being secretly lazy-loaded and I happen to edit it, it could cause my test to behave in confusing and inconsistent ways. This is especially important for barback's transformers, where isolates are being spawned and sources are being changed in quick succession.

In any case, right now [spawnUri] blocks on file IO, which is clearly wrong.

@floitschG
Copy link
Contributor

Assigning to VM.
spawnUri is not supposed to wait for all file-operations.


cc @iposva-google.
Removed Area-Library label.
Added Area-VM label.

@alxhub
Copy link

alxhub commented Dec 20, 2013

I was just bitten by this. In my case, I was attempting to spawnUri() on a URI that's served by an HttpServer in the same VM, and it deadlocks.

@justinfagnani
Copy link
Contributor

It seems that spawnUri doesn't block the whole VM, but just the spawning isolate, which means a workaround is to call spawnUri from another isolate. You won't be able to send the Isolate instance back as a return value, but that's not an issue until Isolate does something.

@nex3
Copy link
Member Author

nex3 commented Jun 5, 2014

Currently our workaround for this issue in pub is adding about 600ms to the time to spawn each isolate. On todomvc, this leads to a total of 2.4s being spent loading buffer isolates when running "pub serve" or "pub build".

@a-siva
Copy link
Contributor

a-siva commented Jun 6, 2014

I am not sure I understand this workaround of adding 600ms to the time is needed to spawn each isolate. Could you elaborate a bit more.

@nex3
Copy link
Member Author

nex3 commented Jun 6, 2014

Pub loads barback transformer plugins by spawning an isolate that loads its code from an HTTP server also run by pub. Because [Isolate.spawnUri] blocks synchronously until all the sources for the isolate are loaded, this creates a deadlock if it's called directly from the main pub isolate, since the spawning isolate will make an HTTP that the (blocked) main isolate will be unable to respond to.

To work around this, the main isolate spawns a buffer isolate that in turn spawns the desired child isolate. The buffer isolate is then the one that blocks on the sources being loaded, leaving the main isolate free to serve the files being requested. Spawning this buffer isolate is what takes 600ms.

@iposva-google
Copy link
Contributor

In comment #­9 you claim that spawning a small helper isolate takes 600ms. Based on my measurements this claim is more than an order of magnitude off. Here is a sample program and its output:

import "dart:isolate";

isolated(msg) {
  var sp = msg[0];
  var str = msg[1];
  sp.send("$str from isolate");
}

main() {
  var rp = new RawReceivePort();
  var msg = new List(2);
  msg[0] = rp.sendPort;
  msg[1] = "Hello";
  var sw = new Stopwatch();

  rp.handler = (msg) {
    print(msg);
    rp.close();
    sw.stop();
    print("Isolate roundtrip: ${sw.elapsedMicroseconds}us");
  };
  
  sw.start();
  Isolate.spawn(isolated, msg);
}

dart ~/dart/Bug12617.dart
Hello from isolate
Isolate roundtrip: 43.387ms

How come you report such a discrepancy?

@lrhn
Copy link
Member

lrhn commented Jun 6, 2014

Does the time to do an Isolate.spawn depend on the size of the isolate (e.g., the number of libraries used)?

@nex3
Copy link
Member Author

nex3 commented Jun 6, 2014

I suspect the difference is caused by issue #6610. Pub has a lot more source code than your sample script.

@munificent
Copy link
Member

How come you report such a discrepancy?

Your program is tiny. When you call spawn, it only reloads 25 lines of code.

Pub is about 34,000 lines. If I stuff a bunch of dead code in your example program, the time increases proportionally. Splitting into separate files would probably likewise increase the time.

@dgrove
Copy link
Contributor

dgrove commented Jun 6, 2014

@nweiz/rnystrom - are you guys running from a snapshot or from .dart files for these measurements?

@nex3
Copy link
Member Author

nex3 commented Jun 6, 2014

From the source files. Running from a snapshot, this is much closer to the numbers Ivan is reporting.

@iposva-google
Copy link
Contributor

Thanks for the confirmation. As we keep restructuring the loading code to be more asynchronous to support deferred loading we will get closer to be able to properly solve this. Currently, even when loading the sources of the new isolate asynchronously we cannot let the originating isolate run free.

@munificent
Copy link
Member

In our meeting last week, Ivan, you agreed this would happen. What's the time frame?

@nex3 nex3 added Type-Defect area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate labels Sep 9, 2014
nex3 added a commit to dart-lang/test that referenced this issue Jun 8, 2015
Since we don't load isolates from code being generated by the main
isolate, there's no need to work around dart-lang/sdk#12617. This
dramatically decreases load times for VM tests.
@kevmoo kevmoo added type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) and removed triaged labels Feb 29, 2016
@iposva-google iposva-google added the closed-obsolete Closed as the reported issue is no longer relevant label May 19, 2016
@nex3
Copy link
Member Author

nex3 commented Jun 8, 2016

This still reproduces.

@nex3 nex3 reopened this Jun 8, 2016
@nex3 nex3 removed the closed-obsolete Closed as the reported issue is no longer relevant label Jun 8, 2016
@nex3
Copy link
Member Author

nex3 commented Dec 27, 2016

It looks like this no longer reproduces, at least with Dart 1.21. I'm leaving it open for the time being since AFAIK it was fixed accidentally, so there probably still need to be regression tests written.

nex3 added a commit to dart-lang/pub that referenced this issue Dec 27, 2016
It seems like dart-lang/sdk#12617 got fixed at some point.
@zanderso zanderso added type-enhancement A request for a change that isn't a bug and removed type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) labels Jan 5, 2017
@zanderso
Copy link
Member

zanderso commented Jan 5, 2017

@a-siva Did you fix this? I am going to close it with "Cannot reproduce". We have so many bugs, we probably shouldn't leave open things we can't reproduce. If it happens again, please file a new bug with updated information and repro steps.

@zanderso zanderso closed this as completed Jan 5, 2017
@zanderso zanderso added the closed-cannot-reproduce Closed as we were unable to reproduce the reported issue label Jan 5, 2017
@nex3
Copy link
Member Author

nex3 commented Jan 5, 2017

@zanderso Can we get some tests for this if we're going to close it? It was a long-standing issue, and if it starts cropping up again code is likely to break.

@zanderso
Copy link
Member

zanderso commented Jan 5, 2017

I'm just cleaning up the issue tracker. I assume that @a-siva has used good judgement as far as testing goes, and has added whatever tests are appropriate. If you find otherwise, then we should open a new issue.

@nex3
Copy link
Member Author

nex3 commented Jan 5, 2017

@zanderso Added tests where? There are no SDK CLs that refer to this issue.

@zanderso
Copy link
Member

zanderso commented Jan 5, 2017

@nex3 I don't think your request for information about tests is germane to this issue. You're probably going to have to schedule a meeting with someone to get your questions answered.

@a-siva
Copy link
Contributor

a-siva commented Jan 5, 2017

I did not fix this, I would have closed the bug if I fixed it. I believe this is a duplicate of the first part of #24849 which was fixed by @turnidge

Regarding tests we have a number of isolate tests in the regression suite and writing a test to measure the time it takes to spawn an isolate is not the right test. For this case the right test would be the one described in one of the comments above "I was attempting to spawnUri() on a URI that's served by an HttpServer in the same VM, and it deadlocks".

Let us leave this issue closed as a duplicate of 24849 and use that issue to track addition of a test.

@zanderso zanderso added closed-duplicate Closed in favor of an existing report and removed closed-cannot-reproduce Closed as we were unable to reproduce the reported issue labels Jan 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. closed-duplicate Closed in favor of an existing report library-isolate type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests