-
Couldn't load subscription status.
- Fork 31
Task: avoid double handler installation #101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Task: avoid double handler installation #101
Conversation
Explanation of the suspected issue: (This suspected issue was noticed by @clef-men; he convinced me that this looks credible and we wrote the proposed fix together.) The Task implementation seems to be slightly wasteful in that, if we understand correctly, it may install its effect handler twice. This occurs in the following scenario: - call Task.run, under which you - create a promise with `async` - `await` this promise - when `await` gives control back, two `step` handlers are on the stack Two handlers are on the stack because: - when the current task is paused by `await`, the continuation `k` contains the deep handler of the surrounding `step` call: ```ocaml let cont v (k, c) = Multi_channel.send c (Work (fun _ -> continue k v)) ``` invoking this Work function will thus reinstate the surrounding deep handler. - when this continuation is received by another worker, `step` is called again, installing a second handler ```ocaml let rec worker task_chan = match Multi_channel.recv task_chan with | Quit -> Multi_channel.clear_local_state task_chan | Work f -> step f (); worker task_chan ``` At this point, if we understand correctly, there are two `step` handlers on the call stack. Note that this does not grow to an unbounded number of nested handlers: on the next Wait effect, the inner handler either continues immediately (still 2 handlers) or pushes the current continuation to a Pending queue and returns, popping the two handlers. Once this continuation is invoked again under `step`, we are back to 2 handlers. Note that in some case the current implementation does need the `step` call in `worker`, because the function argument of `Work f` does not systematically include a handler: ```ocaml let async pool f = let pd = get_pool_data pool in let p = Atomic.make (Pending []) in Multi_channel.send pd.task_chan (Work (fun _ -> do_task f p)); p ``` Explanation of our proposed fix: This PR proposes to fix the issue by enforcing the invariant that the Work function always includes its own handler for Task effects: for the Work functions that use `continue` and `discontinue` there is nothing to do, for the Work function of `async` we call `step` at this point. Another approach would be to use a shallow handler, and use `step` in the same way as currently (around each Work function), but this may be slightly slower -- we would be exactly encoding a deep handler using a shallow handler. (The current code reads like the authors had the shallow-handler semantics in mind.) Co-authored-by: Clement Allain <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @gasche @clef-men. The explanation makes sense to me.
I don't have a good mental model of how the handlers are implemented under the hood, I could confirm this by walking through it in gdb. (@kayceesrk or others would know about this)
CHANGES: * Add parallel_find (ocaml-multicore/domainslib#90, @gasche) * Update CI (ocaml-multicore/domainslib#93, @Sudha247) * Optimisation to work-stealing (ocaml-multicore/domainslib#96, @art-w) * Improve docs presentation (ocaml-multicore/domainslib#99, @metanivek) * Property based tests (ocaml-multicore/domainslib#100, jmid) * Task: avoid double handler installation (ocaml-multicore/domainslib#101, @gasche & @clef-men) * Fix a benign data-race in Chan reported by ocaml-tsan (ocaml-multicore/domainslib#103, @art-w) * Dune, opam, and GitHub Actions fixes (ocaml-multicore/domainslib#105, @MisterDA) * domain local await support (ocaml-multicore/domainslib#107, @polytypic) * Windows run on GitHub Actions (ocaml-multicore/domainslib#110, @Sudha247) * Adjust PBTs based on recommended_domain_count (ocaml-multicore/domainslib#112, @jmid) * Test condition tweaks (ocaml-multicore/domainslib#113, @jmid)
CHANGES: * Add parallel_find (ocaml-multicore/domainslib#90, @gasche) * Update CI (ocaml-multicore/domainslib#93, @Sudha247) * Optimisation to work-stealing (ocaml-multicore/domainslib#96, @art-w) * Improve docs presentation (ocaml-multicore/domainslib#99, @metanivek) * Property based tests (ocaml-multicore/domainslib#100, jmid) * Task: avoid double handler installation (ocaml-multicore/domainslib#101, @gasche & @clef-men) * Fix a benign data-race in Chan reported by ocaml-tsan (ocaml-multicore/domainslib#103, @art-w) * Dune, opam, and GitHub Actions fixes (ocaml-multicore/domainslib#105, @MisterDA) * domain local await support (ocaml-multicore/domainslib#107, @polytypic) * Windows run on GitHub Actions (ocaml-multicore/domainslib#110, @Sudha247) * Adjust PBTs based on recommended_domain_count (ocaml-multicore/domainslib#112, @jmid) * Test condition tweaks (ocaml-multicore/domainslib#113, @jmid)
Explanation of the suspected issue:
(This suspected issue was noticed by @clef-men; he convinced me that this looks credible and we wrote the proposed fix together.)
The Task implementation seems to be slightly wasteful in that, if we understand correctly, it may install its effect handler twice. This occurs in the following scenario:
asyncawaitthis promiseawaitgives control back, twostephandlers are on the stackTwo handlers are on the stack because:
when the current task is paused by
await, the continuationkcontains the deep handler of the surroundingstepcall:invoking this Work function will thus reinstate the surrounding deep handler.
when this continuation is received by another worker,
stepis called again, installing a second handlerAt this point, if we understand correctly, there are two
stephandlers on the call stack. Note that this does not grow to an unbounded number of nested handlers: on the next Wait effect, the inner handler either continues immediately (still 2 handlers) or pushes the current continuation to a Pending queue and returns, popping the two handlers. Once this continuation is invoked again understep, we are back to 2 handlers.Note that in some case the current implementation does need the
stepcall inworker, because the function argument ofWork fdoes not systematically include a handler:Explanation of our proposed fix:
This PR proposes to fix the issue by enforcing the invariant that the Work function always includes its own handler for Task effects: for the Work functions that use
continueanddiscontinuethere is nothing to do, for the Work function ofasyncwe callstepat this point.Another approach would be to use a shallow handler, and use
stepin the same way as currently (around each Work function), but this may be slightly slower -- we would be exactly encoding a deep handler using a shallow handler. (The current code reads like the authors had the shallow-handler semantics in mind.)Co-authored-by: Clement Allain [email protected]