-
Notifications
You must be signed in to change notification settings - Fork 74
Document a setjmp/longjmp convention #225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
SetjmpLongjmp.md
Outdated
binaryen has a conversion from the old instructions to the latest | ||
instructions. (`try_table` etc.) | ||
|
||
* Emscripten (TBD version) has the runtime support for the convention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i meant the Emscripten version which contains emscripten-core/emscripten#21502 here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next release, which I expect to be done within a few days now, will contain this, so I think we can remove the 'TBD' part. Also we can have a link to the file here:
https://github.com/emscripten-core/emscripten/blob/main/system/lib/compiler-rt/emscripten_setjmp.c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it be 3.1.57?
i prefer to mention versions for each components as not everyone uses the latest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah 3.1.57 is the next one
@aheejin @sbc100 @sunfishcode please review if you have time. thank you. |
Will be OOO for 3~4 days. Can I take a look after I return? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this!
single longjmp execution. | ||
A runtime can use a part of `jmp_buf` for this structure. It's also ok to use | ||
a separate thread-local storage to place this structure. A runtime without | ||
multi-threading support can simply place this structure in a global variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this __WasmLongjmpArgs
struct exposed as part of the ABI between libc and the compiler? If it isn't, would it make sense to remove this section, or at least add a caveat that this section is non-normative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a part of the ABI. the compiler-generated code needs to know how to read members of this structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I misunderstood that part then.
In that case, I wonder if it would make sense to further simplify the ABI, from this:
pop __WasmLongjmpArgs pointer from the operand stack
$env = __WasmLongjmpArgs.env
$val = __WasmLongjmpArgs.val
$label = $__wasm_setjmp_test($env, $func_invocation_id)
if ($label == 0) {
;; not for us. rethrow.
call $__wasm_longjmp($env, $val)
}
to this:
$args = pop __WasmLongJmpArgs pointer from the operand stack
$label = $__wasm_setjmp_test($args, $func_invocation_id)
doing the loading of $val
and $env
, as well as the if ($label == 0)
and the __wasm_longjmp
call while we're at it, all within the __wasm_setjmp_test
call.
That way, we'd have less code inline. Would that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, at least $val needs to be visible to the catching logic as it's the return value of setjmp()
while we can make __wasm_setjmp_test
somehow return $val as well, it seems like the opposite (at least incompatible) direction from multivalue TODO, where we can make $env and $val direct parameters of the exception.
personally i'm ok with either ways.
@aheejin do you have any preference? (asking because i guess the multivalue TODO comment is yours.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @yamt said, val
needs to be accessible from the catching code, so I'm not sure what's the point of passing arg
to __wasm_setjmp_test
and making it return val
. Also
if ($label == 0) {
;; not for us. rethrow.
call $__wasm_longjmp($env, $val)
}
this part cannot go into __wasm_setjmp_test
, no? It doesn't look like we can save a ton here, regardsless of whether we do multivalue or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part cannot go into __wasm_setjmp_test, no?
i guess it can. why not?
__wasm_setjmp_test can just call __wasm_longjmp internally. or throw the exception directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, you're right. Would you like to submit a PR to the LLVM repo doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, you're right. Would you like to submit a PR to the LLVM repo doing this?
do you only mean "make __wasm_setjmp_test rethrow"?
or what sunfishcode suggested?
or both?
i added them to the "Future directions" section for now.
SetjmpLongjmp.md
Outdated
``` | ||
void __wasm_setjmp(void *env, uint32_t label, void *func_invocation_id); | ||
uint32_t __wasm_setjmp_test(void *env, void *func_invocation_id); | ||
void __wasm_longjmp(void *env, int val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be correct to change these void *env
s to jmp_buf env
s? That would make it a little clearer which env
these are talking about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
|
||
#### Reserved area in jmp_buf | ||
|
||
The first 6 words of C jmp_buf is reserved for the use by the runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a note here making it explicit that the contents of these 6 words are not public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
|
||
## Future directions | ||
|
||
* `__WasmLongjmpArgs` can be replaced with WebAssembly multivalue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this is just an internal implementation detail, right? If so, would it make sense to omit it in this document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. as mentioned above, it's a part of the current ABI. (thus making it use multivalue is unfortunately another ABI change.)
of course. no hurry. thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! Sorry for the late reply, and thank you for waiting for me.
SetjmpLongjmp.md
Outdated
This document describes a convention to implement C setjmp/longjmp via | ||
[WebAssembly exception-handling proposal]. | ||
|
||
[WebAssembly exception-handling proposal]: https://github.com/WebAssembly/exception-handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
80% of what this doc describes does not seem to be wasm EH-specific. They also apply to the old Emscripten (JS-based) SjLj. How about stating that most of this doc applies to both ways of SjLj handling described in https://emscripten.org/docs/porting/setjmp-longjmp.html and explaining about some wasm-EH specific parts within the doc, such as __WasmLongjmpArgs
?
single longjmp execution. | ||
A runtime can use a part of `jmp_buf` for this structure. It's also ok to use | ||
a separate thread-local storage to place this structure. A runtime without | ||
multi-threading support can simply place this structure in a global variable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @yamt said, val
needs to be accessible from the catching code, so I'm not sure what's the point of passing arg
to __wasm_setjmp_test
and making it return val
. Also
if ($label == 0) {
;; not for us. rethrow.
call $__wasm_longjmp($env, $val)
}
this part cannot go into __wasm_setjmp_test
, no? It doesn't look like we can save a ton here, regardsless of whether we do multivalue or not.
SetjmpLongjmp.md
Outdated
`__wasm_longjmp` is similar to C `longjmp`. | ||
If `val` is 0, it's `__wasm_longjmp`'s responsibility to convert it to 1. | ||
It performs a long jump by filling a `__WasmLongjmpArgs` structure and | ||
throwing `__c_longjmp` exception with its address. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about mentioning that we throw an exception with a tag C_LONGJMP
? It can be hard to associate what this function does with the 'Exception' section above.
Also, given that most of this doc is not exclusive to Wasm EH, we can mention we have emscriptne_longjmp
JS-based EH in which we throw a JS exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i added some explanation about JS exception based convention.
i'm not sure how far it should be explained in this doc though.
SetjmpLongjmp.md
Outdated
This convention uses a WebAssembly exception to perform a non-local jump | ||
for C `longjmp`. | ||
|
||
The name of exception is `__c_longjmp`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this name come from and what meaning does this name have?
We have this enum
https://github.com/llvm/llvm-project/blob/299b636a8f1c9cb2382f9dce4cdf6ec6330a79c6/llvm/include/llvm/CodeGen/WasmEHFuncInfo.h#L27
but this is just the name of an enum and doesn't have any meaning outside of LLVM. (Also the enum name is C_LONGJMP
)
Should we instead say we assume the tag index 1 to be the longjmp index in C-based toolchain like LLVM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this name come from and what meaning does this name have?
We have this enum
https://github.com/llvm/llvm-project/blob/299b636a8f1c9cb2382f9dce4cdf6ec6330a79c6/llvm/include/llvm/CodeGen/WasmEHFuncInfo.h#L27
but this is just the name of an enum and doesn't have any meaning outside of LLVM. (Also the enum name is C_LONGJMP)Should we instead say we assume the tag index 1 to be the longjmp index in C-based toolchain like LLVM?
the name is exposed to the outside for dynamic-linking modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The C symbol name used for the exception tag is ..
At the very least mention that it is a tag that we are talking about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i improved exception vs tag distinction a bit.
SetjmpLongjmp.md
Outdated
``` | ||
void | ||
f(void) | ||
{ | ||
jmp_buf env; | ||
if (!setjmp(env)) { | ||
might_call_longjmp(env); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``` | |
void | |
f(void) | |
{ | |
jmp_buf env; | |
if (!setjmp(env)) { | |
might_call_longjmp(env); | |
} | |
} | |
```c | |
void f(void) { | |
jmp_buf env; | |
if (!setjmp(env)) { | |
might_call_longjmp(env); | |
} | |
} |
- Nit: LLVM and Emscripten's clang-format style
- Specify the block is in C for (slightly) better highlighting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: LLVM and Emscripten's clang-format style
well, although i have no strong opinions about the style, this is not llvm or emscripten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? I pasted the code block into any LLVM source file and clang-formatted it, and it showed no changes (from what I suggested). LLVM's clang-format style is two-space indentation. Which part do you think this is different from LLVM's clang-format style? (And Emscripten's .clang-format
uses two-space indentation too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i meant that this repo (tool-conventions) is not a part of emscripten or llvm. thus does not necessarily follow their coding style.
if you like their coding style for some reasons, it's fine. i can change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see what you meant. Yeah I know it's a nitpick (which is the reason I said 'Nit') but I just preferred it to be more like the code style we work on most of the times. Also having a separate line for the return type doesn't seem to be very common either (unless the line exceeds 80-col). I'd appreciate if you change it. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I just preferred it to be more like the code style we work on most of the times. Also having a separate line for the return type doesn't seem to be very common either (unless the line exceeds 80-col).
well, projects i work on most of the times use different styles.
i suspect there is no single "very common" style even among wasm related projects.
cf.
https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/.clang-format
https://github.com/yamt/toywasm/blob/master/.clang-format
i guess C++ projects tend to prefer smaller indentation width than C projects for obvious reasons.
|
||
* A PR to add the runtime support to wasi-libc is under review: https://github.com/WebAssembly/wasi-libc/pull/483 | ||
|
||
[WebAssemblyLowerEmscriptenEHSjLj.cpp]: https://github.com/llvm/llvm-project/blob/70deb7bfe90af91c68454b70683fbe98feaea87d/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[WebAssemblyLowerEmscriptenEHSjLj.cpp]: https://github.com/llvm/llvm-project/blob/70deb7bfe90af91c68454b70683fbe98feaea87d/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp | |
[WebAssemblyLowerEmscriptenEHSjLj.cpp]: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp |
Not sure if this needs to be from a specific commit version, given that we are not specifying specific lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i prefer to always use permalink as the files can be renamed/removed in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in that case we should fix the link, because it means the doc is pointing to a wrong file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, after all, what this doc refers to is today's version of the file.
SetjmpLongjmp.md
Outdated
documented above. It also supports a few traditional variations of the | ||
setjmp/longjmp ABI. | ||
|
||
* A PR to add the runtime support to wasi-libc is under review: https://github.com/WebAssembly/wasi-libc/pull/483 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the PR was merged, can we have a link to a specific file, like https://github.com/WebAssembly/wasi-libc/blob/main/libc-top-half/musl/src/setjmp/wasm32/rt.c ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
SetjmpLongjmp.md
Outdated
binaryen has a conversion from the old instructions to the latest | ||
instructions. (`try_table` etc.) | ||
|
||
* Emscripten (TBD version) has the runtime support for the convention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next release, which I expect to be done within a few days now, will contain this, so I think we can remove the 'TBD' part. Also we can have a link to the file here:
https://github.com/emscripten-core/emscripten/blob/main/system/lib/compiler-rt/emscripten_setjmp.c
SetjmpLongjmp.md
Outdated
|
||
Note: older LLVM versions have been using a slightly different runtime ABI, | ||
which is supported by Emscripten. It has been switched to the ABI documented | ||
above by https://github.com/llvm/llvm-project/pull/84137. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: older LLVM versions have been using a slightly different runtime ABI, | |
which is supported by Emscripten. It has been switched to the ABI documented | |
above by https://github.com/llvm/llvm-project/pull/84137. |
Not sure if we need to have a link to the PR, and given that presumably it is a non-goal of this doc to describe the history, I think it's fine to describe only the current status quo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
SetjmpLongjmp.md
Outdated
## References | ||
|
||
* [A discussion about the runtime ABI](https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit#heading=h.metx1fc16ots) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this? We don't have any problem-exploring design doc links elsewhere in this repo, once we settle on a solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! Sorry for the late reply, and thank you for waiting for me.
the last `__wasm_setjmp` call for the jmp_buf. The compiler-generated logic | ||
can use the label value to pretend a return from the corresponding setjmp. | ||
Otherwise, `__wasm_setjmp_test` returns 0. In that case, the | ||
compiler-generated logic should rethrow the exception by calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably it's also ok to rethrow with delegate/throw_ref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes, I think we can rethrow (or throw_ref
) it. Not sure whether we need delegate
though. But given that it is discouraged to use the current rethrow
instruction in LLVM because it is replaced by the new throw_ref
, and the new throw_ref
has not been implemented in LLVM yet, we can probably try this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, i meant rethrow, not delegate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
SetjmpLongjmp.md
Outdated
#### Reserved area in jmp_buf | ||
|
||
The first 6 words of C jmp_buf is reserved for the use by the runtime. | ||
It should also have large enough alignment to store C pointers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use the word "word" in the previous sentence and "C pointer" in this sentence, but are we not talking about the same thing? IIUC normally when folks way "word" they mean "pointer sized thing", so should we just stick to using one or other in this document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(or am I misunderstand and are you using pointer and word to mean different sized things)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use the word "word" in the previous sentence and "C pointer" in this sentence, but are we not talking about the same thing? IIUC normally when folks way "word" they mean "pointer sized thing", so should we just stick to using one or other in this document?
i added explanation.
I think we can merge this? Anyone has any more comments? |
|
||
## Overview | ||
|
||
This document describes a convention to implement C setjmp/longjmp via |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that this convention is now the default for LLVM, right? I think this document should say that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's documented in the "Implementations" section.
SetjmpLongjmp.md
Outdated
The first 6 words of C jmp_buf is reserved for the use by the runtime. | ||
("words" here are C pointer types specified in the [C ABI].) | ||
It should have large enough alignment to store C pointers. | ||
The actual contents of this area is private to the runtime implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The actual contents of this area is private to the runtime implementation. | |
The actual contents of this area are private to the runtime implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed. thank you.
I guess I can merge this now? Thanks for working on it again! |
No description provided.