Member

@tlively tlively commented May 13, 2024

The stack.new instruction binds some arguments to the function that will be
called on the new stack, leaving the rest to be supplied by stack.switch.
Previously the arguments bound by stack.new were a prefix of the function
arguments; this PR changes them to be a suffix of the function arguments
(excluding the return stack reference) instead.

This allows us to relax the typing of stack.new_switch so that it can use stack
types that only send a prefix of the function arguments, while maintaining the
property that (stack.new_switch x y) can be rewritten as
(stack.new x y) (stack.switch x). It also gives us the new property that
(stack.new x y) (stack.switch x) can be rewritten as (stack.new_switch x y);
previously that was only true if the parameters of the stack type at x matched
the parameters of the function y, or if the stack type had no parameters besides
the return stack reference.
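
The two rewrite properties can be checked against a small executable model. This is only a sketch and not part of the proposal: it assumes a toy value stack (a Python list with the top at the end) that holds exactly the operands of the instructions involved, it ignores the return stack reference, and the helpers `stack_new`, `switch`, `stack_new_switch`, and `new_then_switch` are hypothetical names chosen for illustration.

```python
def stack_new(values, n_bound, binding):
    """Pop n_bound operands and push a toy stack reference that remembers them."""
    split = len(values) - n_bound
    bound = values[split:]
    return values[:split] + [("stackref", bound, binding)]


def switch(values):
    """Pop the stack reference and the sent operands; return the callee's arguments."""
    *sent, (_, bound, binding) = values
    # Suffix binding: bound values follow the sent ones.
    # Prefix binding: bound values precede the sent ones.
    return sent + bound if binding == "suffix" else bound + sent


def stack_new_switch(values):
    """Call the function immediately: the operands are its arguments, in order."""
    return list(values)


def new_then_switch(values, n_bound, binding):
    """Model of (stack.new x y) (stack.switch x) on the same operands."""
    return switch(stack_new(values, n_bound, binding))


operands = [1, 2, 3]
# Suffix binding: both forms deliver the same arguments for any split.
print(new_then_switch(operands, 1, "suffix"), stack_new_switch(operands))
# Prefix binding: the arguments are permuted unless the split is trivial.
print(new_then_switch(operands, 1, "prefix"))
```

In this model, prefix binding agrees with `stack_new_switch` only when nothing is bound or everything is bound, which corresponds to the two special cases named in the description.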

@tlively tlively requested a review from fgmccabe May 13, 2024 21:43
@tlively tlively mentioned this pull request May 13, 2024
Member Author

tlively commented May 13, 2024

@fgmccabe
Collaborator

I don't see how this change "allows us to relax the typing of stack.new_switch to be able to use stack types that only send a prefix of the function arguments". In particular, the remaining arguments must come from 'somewhere' and there does not seem to be any further opportunity to complete the list of arguments.

In addition, did you mean:
(stack.new_switch x y) can be rewritten as (stack.new y) (stack.switch x)

?


`stack.new_switch x_1 y` both allocates a new stack and switches to it. It is equivalent to `(stack.new x y) (switch x)`, but engines should be able to implement it more efficiently because it calls the function immediately without having to stage the arguments anywhere.
Collaborator

Do you mean
(stack.new y)(switch x)
?
(Otherwise, there is an extraneous x floating around)

Member Author

No, because stack.new takes both the stack type index x and the function index y.

Collaborator

got it.

@tlively
Member Author

tlively commented May 13, 2024

> I don't see how this change "allows us to relax the typing of stack.new_switch to be able to use stack types that only send a prefix of the function arguments". In particular, the remaining arguments must come from 'somewhere' and there does not seem to be any further opportunity to complete the list of arguments.

The full list of required arguments comes from the function type, and the types given in the stack type just need to match a prefix of those.

@fgmccabe
Collaborator

The arguments to the coroutine function must be provided in total. They can come from a combination of the arguments provided at stack.new and the remainder provided at switch.
However, validation of stack.new requires that the vector y* x* matches the function type f*, and that the stack type matches y_1* .. for the appropriate definition of "matches". (I.e., f : f* -> (), y* x* = f*, and (stack y* rt) matches the stack type mentioned in stack.new and in switch.)

Is this also your understanding?
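
The condition y* x* = f* can be written as a small check. This is a sketch under simplifying assumptions: types are compared by plain equality rather than by the subtyping-aware notion of "matches", the return stack reference is omitted, and `bound_types` is a hypothetical helper name.

```python
def bound_types(func_params, stack_params):
    """Return the types x* that stack.new must bind so that y* x* = f*.

    func_params:  f*, the parameter types of the function
    stack_params: y*, the parameter types sent through the stack type
    Raises ValueError if y* is not a prefix of f*.
    """
    func_params, stack_params = list(func_params), list(stack_params)
    if func_params[:len(stack_params)] != stack_params:
        raise ValueError("stack type parameters are not a prefix of the function parameters")
    return func_params[len(stack_params):]


# The stack type sends the leading i32; stack.new must bind the rest.
print(bound_types(["i32", "i64", "f32"], ["i32"]))
```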

@fgmccabe
Collaborator

There is something else that is worth mentioning:

The normal (but not universal) ABI for a function call is for the arguments to be evaluated in a left-to-right order. This is currently strongly encouraged in wasm.

However, the order of evaluation of arguments in a stack.new/switch is not expected to (does not) follow this pattern.

@tlively
Member Author

tlively commented May 13, 2024

> Is this also your understanding?

Yes, I think that sounds right.

> There is something else that is worth mentioning:
>
> The normal (but not universal) ABI for a function call is for the arguments to be evaluated in a left-to-right order. This is currently strongly encouraged in wasm.

You're talking about ABIs on physical machines, right? How is this encouraged in Wasm?

> However, the order of evaluation of arguments in a stack.new/switch is not expected to (does not) follow this pattern.

Can you explain this in more detail? Is this something you think should be fixed, documented in the explainer, or just noted here in the comments?

@slindley
Collaborator

> > The normal (but not universal) ABI for a function call is for the arguments to be evaluated in a left-to-right order. This is currently strongly encouraged in wasm.
>
> You're talking about ABIs on physical machines, right? How is this encouraged in Wasm?
>
> > However, the order of evaluation of arguments in a stack.new/switch is not expected to (does not) follow this pattern.
>
> Can you explain this in more detail? Is this something you think should be fixed, documented in the explainer, or just noted here in the comments?

I think what @fgmccabe is saying (correct me if I'm wrong!) is that when we do partial application it is conventional to supply a prefix of the arguments (left-to-right) rather than a suffix (right-to-left) as you are doing with the form of partial application you're building into stack.new.

For instance in OCaml we can do something like this:

# let f (x : int) (y : bool) = (x, y);;
val f : int -> bool -> int * bool = <fun>
# let g = f 42;;
val g : bool -> int * bool = <fun>
# g true;;
- : int * bool = (42, true)

The function g partially applies f by supplying 42 as its first argument.
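
For comparison, the same contrast can be sketched in Python, used here only for illustration. `functools.partial` binds leading positional arguments, like the OCaml example above, while `bind_suffix` is a hypothetical helper (not a standard function) that mimics the suffix binding this PR proposes for stack.new.

```python
from functools import partial


def f(x, y):
    return (x, y)


def bind_suffix(fn, *bound):
    """Hypothetical helper: bind trailing arguments, leaving the leading ones open."""
    return lambda *sent: fn(*sent, *bound)


g = partial(f, 42)         # prefix binding: x is fixed, y is supplied later
h = bind_suffix(f, True)   # suffix binding: y is fixed, x is supplied later

print(g(True))  # (42, True)
print(h(42))    # (42, True)
```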

Also see the typing rule for cont.bind in the typed continuations proposal: https://github.com/WebAssembly/stack-switching/blob/29e5c4314cd71b0d74baaac8e56e099f83abbe44/proposals/continuations/Overview.md
which partially applies a continuation to a prefix of its arguments.

Speaking of which, what's the rationale for hard-wiring partial application into stack.new rather than having a separate stack.bind instruction analogous to cont.bind?

Perhaps your thinking is that the former can be easier to optimise (albeit at the cost of being less expressive)?

I think questions over to what extent we support partial application / closures for stack-switching in Wasm are (like a number of other design questions) largely orthogonal to whether we're talking about BoS or WasmFX.

@tlively
Member Author

tlively commented May 14, 2024

Ah yes, I agree that it seems initially less surprising to bind the arguments from left to right (although I don't believe we have precedent in standardized Wasm). I also agree that whether we allow partial applications is a discussion we can have separately. If we do allow partial application, however, there would be strong benefits to doing it as described in this PR, even though it is counterintuitive.

@conrad-watt
Contributor

conrad-watt commented May 14, 2024

> If we do allow partial application, however, there would be strong benefits to doing it as described in this PR, even though it is counterintuitive.

Can you expand on this (is this point related to conventions in Binaryen or LLVM)? My intuition is also that binding should be left-to-right, and if we consider partial binding of funcrefs in future, we should try to stay consistent with whatever we decide for stacks.

EDIT:

As a knee-jerk argument in favour of left-to-right binding: any source-level functional language with partial application will expect left-to-right binding, so making a different decision in Wasm will make it harder for these languages to map their functions down to Wasm functions+bind.

@tlively
Member Author

tlively commented May 14, 2024

It's not because of any conventions; it's because only right-to-left binding gives the property that (stack.new x y) (switch x) can be rewritten as (stack.new_switch x y) for any valid x and y. If we bind left-to-right, this only works for some valid x and y, not all of them, so binding right-to-left provides more optimization opportunities in Binaryen.

Binding right-to-left might create a mismatch with the source that causes some issues for producers, but I anticipate that root functions will generally be provided by the producer-owned runtime rather than arbitrary source code, so I expect that producers will be able to use right-to-left binding without issue.

@fgmccabe
Collaborator

In support of 'partial application':

Without partial application, an instruction like stack.new (or cont.new) is arguably nearly useless. To see this, consider creating a generator; let's call the generator 'walk'. The walk function walks over a binary tree, yielding each element it finds in response to a 'next' prompt.

The question is, how does walk know which tree to walk over: without partial application, it becomes extremely difficult to parameterize walk with the right tree.

Note that this is not really an issue with stack.new_switch -- since this can be iconified as 'call a function on a new stack'. Using stack.new_switch it is straightforward to pass in to the walk function the correct tree to walk.
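
As a loose analogy only, a Python generator shows the shape of the `walk` example: the tree is bound when the generator is created, and each `next` call plays the role of the 'next' prompt. Python generators are not the stack-switching instructions, and the tuple encoding of the tree is an assumption made for this sketch.

```python
def walk(tree):
    """Yield the elements of a binary tree in order.

    A tree is either None or a tuple (left, value, right).
    """
    if tree is None:
        return
    left, value, right = tree
    yield from walk(left)
    yield value
    yield from walk(right)


tree = ((None, 1, None), 2, (None, 3, None))
gen = walk(tree)   # the tree is supplied once, at creation time
print(next(gen))   # 1
print(next(gen))   # 2
print(next(gen))   # 3
```

Without a way to bind `tree` at creation time, each 'next' prompt would have to carry the tree again, which is the difficulty described above.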

@conrad-watt
Contributor

conrad-watt commented May 14, 2024

@tlively apologies, I've re-read the OP more carefully

> It also gives us the new property that (stack.new x y) (stack.switch x) can be rewritten as (stack.new_switch x y); previously that was only true if the parameters of the stack type at x matched the parameters of the function y...

Am I correct in understanding that this is a special case of a general impedance mismatch involving Wasm's value stack? That is, you effectively want to optimise g arg2 (f arg1) to (flip g)∘f arg1 arg2 since there's an efficient (flip g)∘f available (stack.new_switch), but in Wasm this would require flipping the order of the stack arguments - e.g. arg2 arg1 (call f) (call g) would have to become arg1 arg2 (call (flip g)∘f).

If I'm getting this correct, I'm not sure that supporting an "easy" version of this optimisation for this special case is worth going against the left-to-right binding convention that is pretty ubiquitous across other languages. Side question - can the "argument flipping" version of this optimisation still be performed in the left-to-right binding scenario, or is it not practical?

EDIT: the "abstract setting" above is a little messier than I first thought because g has to become flip g to make the argument order work while retaining the analogy to stack.new, switch, and stack.new_switch. I think the more practical question is - if we kept left-to-right binding, could the stack.new... stack.switch to stack.new_switch optimisation be retained by rearranging the order of the provided arguments during the optimisation?

@titzer
Contributor

titzer commented May 14, 2024

+1 to making sure we align any partial application for stack switching to what we might do for normal function partial application. In particular, we had func.bind but kicked it out of the function references MVP. I also wouldn't want to overly weight a Binaryen convenience at this stage.

@tlively
Member Author

tlively commented May 14, 2024

@conrad-watt, yes, this funkiness is due to the behavior of the value stack.

If we have left-to-right binding, then we could still optimize to stack.new_switch. In the worst case it would require extra locals to shuffle operands into the correct locations and it would require synthesizing a new stack type. In the best case, the operands wouldn't have side effects and could simply be moved past each other into the correct positions. This is all well-supported in Binaryen since performing any kind of peephole optimization may require similar operand shuffling.

v2* v1* (stack.new x y) (switch x) => v1* v2* (stack.new_switch x' y)
-- expand(C.funcs[y]) = func t1* t2* rt -> []
-- C |- v1* : t1*
-- C |- v2* : t2*
-- expand(C.types[x]) = stack t2* rt
-- expand(C.types[x']) = stack t1* t2* rt

We could avoid having to synthesize a new stack type if we allowed the stack type immediate for stack.new_switch to specify a prefix of the parameters required by the function immediate, but then we would lose the property that (stack.new_switch x y) could be rewritten as (stack.new x y) (stack.switch x), which would slightly complicate the reduction rules.

I just figured it would be nicest if we arranged the types so the rewriting was always valid without any extra complication :)
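
The left-to-right rewrite above can be sketched as a function over operand lists. This is an illustrative model, not Binaryen code: `rewrite_left_to_right` is a hypothetical name, types are plain strings, the return stack reference is omitted, and the reordering is assumed to be legal (operands without side effects, or already spilled to locals).

```python
def rewrite_left_to_right(func_params, n_bound, operands):
    """Rewrite `v2* v1* (stack.new x y) (switch x)` under prefix binding.

    func_params: t1* t2*, the function's parameter types
    n_bound:     len(t1*), the prefix bound by stack.new
    operands:    operands in source order, i.e. v2* followed by v1*
    Returns (parameters of the synthesized stack type x',
             operands for stack.new_switch in the order v1* v2*).
    """
    func_params, operands = list(func_params), list(operands)
    if len(operands) != len(func_params):
        raise ValueError("expected one operand per function parameter")
    n_sent = len(func_params) - n_bound
    v2, v1 = operands[:n_sent], operands[n_sent:]
    return func_params, v1 + v2


# t1* = [i32] is bound by stack.new; t2* = [i64, f32] is sent by switch.
print(rewrite_left_to_right(["i32", "i64", "f32"], 1, ["b", "c", "a"]))
```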

@fgmccabe
Collaborator

@conrad-watt: The argument order question can only be understood if you go significantly closer to what is happening on the machine in terms of register manipulation. It may also be that some of the issues are specific to stack switching: because passing arguments when switching stacks can strongly interfere with how arguments are passed to functions.

In a normal use of the switch instruction, there is good reason to expect that the parameters of the switch are prepared in a way that is strongly reminiscent of how arguments are prepared for a function call -- probably in a left-to-right order since that is how wasm prefers it.

So, at the machine level, a switch can be implemented as:

1. prepare arguments in registers (if possible)
2. prepare return stack reference (in a register pair if possible)
3. spill values as required into target stack (a normal function call does this by spilling onto the same stack)
4. flip stack pointers
5. jump to target instruction stream

In a perfect world (i.e., not V8), this involves two additional operations over a function call: prepare return stack reference and flip stack pointers. The first takes the place of 'prepare return address' in a normal function call.

The net of this is that a stack switch should cost the same order of magnitude as a function call (perhaps 1.5x the number of instructions).

Implementing the register shuffle necessary to keep the overall left-to-right order could double this (it's linear in the number of arguments).

(Apologies for the 'wall of text')

@tlively
Member Author

tlively commented May 14, 2024

To clarify, @fgmccabe and I are describing completely separate arguments for why we might prefer right-to-left partial application. I was describing how it makes things nicer in the spec and in producer-side optimizers like Binaryen. @fgmccabe is describing how it can make the engine implementation of switching to a newly allocated stack more efficient when stack.new and switch are executed separately, specifically because the parameters provided at switch time can be placed directly in the initial argument registers, independent of how many additional parameters have already been bound by stack.new.
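
The engine-side point can be illustrated with a toy calculation. It assumes a deliberately simplified calling convention in which the i-th parameter goes into the i-th argument register; real ABIs are more involved, and `switch_arg_positions` is a hypothetical name.

```python
def switch_arg_positions(n_sent, n_bound, binding):
    """Argument positions (0-based) occupied by the values supplied at switch time."""
    # Suffix binding: sent values are the leading parameters.
    # Prefix binding: sent values come after the bound ones.
    start = 0 if binding == "suffix" else n_bound
    return list(range(start, start + n_sent))


print(switch_arg_positions(2, 0, "suffix"))  # [0, 1]
print(switch_arg_positions(2, 3, "suffix"))  # [0, 1]
print(switch_arg_positions(2, 3, "prefix"))  # [3, 4]
```

Under this assumption, suffix binding keeps the switch-time values in the same positions regardless of how many values stack.new bound, whereas prefix binding shifts them by the number of bound values.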

@conrad-watt
Contributor

conrad-watt commented May 15, 2024

@tlively

> We could avoid having to synthesize a new stack type if we allowed the stack type immediate for stack.new_switch to specify a prefix of the parameters required by the function immediate

Since stack.new_switch actually switches to executing the new function, how is it possible to only provide a prefix of the parameters - do you not always need to supply all of them?

EDIT: or put another way, can the input stack type annotation of stack.new_switch just be removed, or alternatively replaced with a direct annotation of the "output" stack type if there's a section ordering issue involving types and funcs?

@fgmccabe

> Implementing the register shuffle necessary to keep the overall left-to-right order could double this (it's linear in the number of arguments).

This is an interesting point that I need to think more about. I think I can see the potential benefits at the Wasm->asm level. There are still potentially counterbalancing impedances at the source->Wasm level - for example, if a compiler targeting Wasm needs to roll its own "partially-applied function/continuation" abstraction using GC types, but could have avoided this with left-to-right binding, this would seem to outweigh any shuffling cost in the runtime.

@tlively
Member Author

tlively commented May 15, 2024

I guess stack.new_switch doesn't really need to take a stack type immediate at all since the parameters are entirely determined by the function type, including the return stack reference type.

@rossberg
Member

> Without partial application, an instruction like stack.new (or cont.new) is arguably nearly useless. To see this, consider creating a generator; let's call the generator 'walk'. The walk function walks over a binary tree, yielding each element it finds in response to a 'next' prompt.

Right, but this problem is not specific to the initial resumption of a stack. In our experience, the same situations arise rather frequently with non-initial resumptions when building interesting control abstractions. That is why the continuations proposal added cont.bind, which provides a general and orthogonal mechanism to deal with this. I'm not convinced there is any advantage in entangling partial application with stack creation, which appears both more complex and less useful.

tlively added 2 commits May 17, 2024 10:43
@tlively tlively changed the base branch from instruction-type-details to bos-subtyping May 17, 2024 17:45
@tlively
Member Author

tlively commented Jul 29, 2024

Closing this as obsolete, although we might want to revisit this in an updated context.

@tlively tlively closed this Jul 29, 2024
dhil pushed a commit that referenced this pull request Aug 2, 2024
Add missing memory specifier to memory.init execution semantics
dhil pushed a commit to dhil/wasm-stack-switching that referenced this pull request Jan 15, 2025