From 8d80cb0227bb3c60b72aedc0ddee5b79709f4204 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Mon, 1 Apr 2024 14:25:34 +0900 Subject: [PATCH 01/25] Document a setjmp/longjmp convention --- SetjmpLongjmp.md | 176 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 176 insertions(+) create mode 100644 SetjmpLongjmp.md diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md new file mode 100644 index 0000000..b30ecf7 --- /dev/null +++ b/SetjmpLongjmp.md @@ -0,0 +1,176 @@ +# C setjmp/longjmp in WebAssembly + +## Overview + +This document describes a convention to implement C setjmp/longjmp via +[WebAssembly exception-handling proposal]. + +[WebAssembly exception-handling proposal]: https://github.com/WebAssembly/exception-handling + +## Runtime ABI + +### Linear memory structures + +This convention uses a few structures on the WebAssembly linear memory. + +#### jmp_buf + +The first 6 words of C jmp_buf is reserved for the use of the runtime. +The `void *env` argument used by the ABI functions mentioned below points to +this 6-word area. +It should also have large enough alignment to store C pointers. + +#### __WasmLongjmpArgs + +An equivalent of the following structure is used to associate necessary +data to the WebAssembly exception. + +``` +struct __WasmLongjmpArgs { + void *env; + int val; +}; +``` + +The lifetime of this structure is rather short. It lives only during a +single longjmp execution. +A runtime can use a part of `jmpbuf` for this structure. It's also ok to use +a separate thread-local storage to place this structure. A runtime without +multi-threading support can simply place this structure in a global variable. + +### Exception + +This convention uses a WebAssembly exception to perform a non-local jump +for C `longjmp`. + +The name of exception is `__c_longjmp`. +The type of exception is `(param i32)`. (Or, `(param i64)` for [memory64]) +The parameter of the exception is the address of `__WasmLongjmpArgs` structure +on the linear memory. 
+ +[memory64]: https://github.com/WebAssembly/memory64 + +### functions + +``` +void __wasm_setjmp(void *env, uint32_t label, void *func_invocation_id); +uint32_t __wasm_setjmp_test(void *env, void *func_invocation_id); +void __wasm_longjmp(void *env, int val); +``` + +`__wasm_setjmp` records the necessary data in the `env` so that it can be +used by `__wasm_longjmp` later. +`label` is a non-zero identifier to distinguish setjmp call-sites within +the function. Note that a C function can contain multiple setjmp() calls. +`func_invocation_id` is the identifier to distinguish invocations of this +C function. Note that, when a C function which calls setjmp() is invoked +recursively, setjmp/longjmp needs to distinguish them. + +`__wasm_setjmp_test` tests if the longjmp target belongs to the current +function invocation. if it does, this function returns the `label` value +saved by `__wasm_setjmp`. Otherwise, it returns 0. + +`__wasm_longjmp` is similar to C `longjmp`. +If `val` is 0, it's `__wasm_longjmp`'s responsibility to convert it to 1. +It performs a long jump by filling a `__WasmLongjmpArgs` structure and +throwing `__c_longjmp` exception with its address. + +## Code conversion + +The C compiler detects `setjmp` and `longjmp` calls in a program and +converts them into the corresponding WebAssembly exception-handling +instructions and calls to the above mentioned runtime ABI. + +### functions calling setjmp() + +On the function entry, the compiler would generate the logic to create +the identifier of this function invocation, typically by performing an +equivalent of `alloca(1)`. Note that the alloca size is not important +because the pointer is merely used as an identifier and never be dereferenced. + +Also, the compiler converts C `setjmp` calls to `__wasm_setjmp` calls. + +Also, for code blocks which possibly call `longjmp` directly or indirectly, +the compiler generates instructions to catch and process the +`__c_longjmp` exception accordingly. 
+ +For an example, a C function like this would be converted like +the following pseudo code. +``` + void + f(void) + { + jmp_buf env; + if (!setjmp(env)) { + might_call_longjmp(env); + } + } +``` + +``` + $func_invocation_id = alloca(1) + + ;; 100 is a label generated by the compiler + call $__wasm_setjmp($env, 100, $func_invocation_id) + + block + block (result i32) + try_table (catch $__c_longjmp 0) + call $might_call_longjmp + end + ;; might_call_longjmp didn't call longjmp + br 1 + end + ;; might_call_longjmp called longjmp + pop __WasmLongjmpArgs pointer from the operand stack + $env = __WasmLongjmpArgs.env + $val = __WasmLongjmpArgs.val + $label = $__wasm_setjmp_test($env, $func_invocation_id) + if ($label == 0) { + ;; not for us. rethrow. + call $__wasm_longjmp($env, $val) + } + ;; ours. + ;; somehow jump to the block corresponding to the $label + : + : + end +``` + +### longjmp calls + +The compiler converts C `longjmp` calls to `__wasm_longjmp` calls. + +## Implementations + +* LLVM has a pass ([WebAssemblyLowerEmscriptenEHSjLj.cpp]) to perform + the convertion mentioned above. It can be enabled with the + `-mllvm -wasm-enable-sjlj` option. + + Note: older LLVM versions have been using a slightly different runtime ABI, + which is supported by Emscripten. It has been switched to the ABI documented + above by https://github.com/llvm/llvm-project/pull/84137. + + Note: as of writing this, LLVM produces a bit older version of + exception-handling instructions. (`try`, `delegate`, etc) + binaryen has a conversion from the old instructions to the latest + instructions. (`try_table` etc.) + +* Emscripten (TBD version) has the runtime support for the convention + documented above. It also supports a few traditional variations of the + setjmp/longjmp ABI. 
+ +* A PR to add the runtime support to wasi-libc is under review: https://github.com/WebAssembly/wasi-libc/pull/483 + +[WebAssemblyLowerEmscriptenEHSjLj.cpp]: https://github.com/llvm/llvm-project/blob/70deb7bfe90af91c68454b70683fbe98feaea87d/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp + +## Future directions + +* `__WasmLongjmpArgs` can be replaced with WebAssembly multivalue. + +* If/When WebAssembly exception gets more ubiquitous, we might want to move + the runtime to compiler-rt. + +## References + +* [A discussion about the runtime ABI](https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit#heading=h.metx1fc16ots) From 7a3233a8a336deda18d59ebcb3af5931e1f02f14 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Mon, 1 Apr 2024 14:46:21 +0900 Subject: [PATCH 02/25] typo --- SetjmpLongjmp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index b30ecf7..5f7c462 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -34,7 +34,7 @@ struct __WasmLongjmpArgs { The lifetime of this structure is rather short. It lives only during a single longjmp execution. -A runtime can use a part of `jmpbuf` for this structure. It's also ok to use +A runtime can use a part of `jmp_buf` for this structure. It's also ok to use a separate thread-local storage to place this structure. A runtime without multi-threading support can simply place this structure in a global variable. 
From ddd1e17f3cbb56ee43bf7931fbb4182280d0a5bc Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Mon, 1 Apr 2024 15:09:55 +0900 Subject: [PATCH 03/25] a note about the reserved area --- SetjmpLongjmp.md | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 5f7c462..d8047f0 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -13,13 +13,23 @@ This document describes a convention to implement C setjmp/longjmp via This convention uses a few structures on the WebAssembly linear memory. -#### jmp_buf +#### Reserved area in jmp_buf -The first 6 words of C jmp_buf is reserved for the use of the runtime. +The first 6 words of C jmp_buf is reserved for the use by the runtime. The `void *env` argument used by the ABI functions mentioned below points to this 6-word area. It should also have large enough alignment to store C pointers. +##### Notes about the size of reserved area in jmp_buf + +Emscripten has been using 6 words. (`unsigned long [6]`) + +GCC and Clang uses `intptr_t [5]` for their [setjmp/longjmp builtins]. +It isn't relevant right now though, because LLVM's WebAssembly target +doesn't provide these builtins. + +[setjmp/longjmp builtins]: https://gcc.gnu.org/onlinedocs/gcc/Nonlocal-Gotos.html + #### __WasmLongjmpArgs An equivalent of the following structure is used to associate necessary From 933eba353ef3c75f0cfa89bacb2962dbd06ecab7 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Mon, 1 Apr 2024 15:23:16 +0900 Subject: [PATCH 04/25] dynamic-linking consideration --- SetjmpLongjmp.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index d8047f0..d2f9a9c 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -151,6 +151,15 @@ the following pseudo code. The compiler converts C `longjmp` calls to `__wasm_longjmp` calls. 
+## Dynamic-linking consideration + +In case of [dynamic-linking], it's the dynamic linker's responsibility +to provide the exception tag for this convention with the name +"env.__c_longjmp". Modules should import the tag so that cross-module +longjmp works. + +[dynamic-linking]: DynamicLinking.md + ## Implementations * LLVM has a pass ([WebAssemblyLowerEmscriptenEHSjLj.cpp]) to perform From afc947c09147bd275180e8e263f8b287e9a8d213 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Tue, 2 Apr 2024 10:55:37 +0900 Subject: [PATCH 05/25] make it clear "env" arguments are jmp_buf --- SetjmpLongjmp.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index d2f9a9c..1caded8 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -16,8 +16,6 @@ This convention uses a few structures on the WebAssembly linear memory. #### Reserved area in jmp_buf The first 6 words of C jmp_buf is reserved for the use by the runtime. -The `void *env` argument used by the ABI functions mentioned below points to -this 6-word area. It should also have large enough alignment to store C pointers. ##### Notes about the size of reserved area in jmp_buf @@ -37,7 +35,7 @@ data to the WebAssembly exception. ``` struct __WasmLongjmpArgs { - void *env; + void *env; // a pointer to jmp_buf int val; }; ``` @@ -63,9 +61,9 @@ on the linear memory. 
### functions ``` -void __wasm_setjmp(void *env, uint32_t label, void *func_invocation_id); -uint32_t __wasm_setjmp_test(void *env, void *func_invocation_id); -void __wasm_longjmp(void *env, int val); +void __wasm_setjmp(jmp_buf env, uint32_t label, void *func_invocation_id); +uint32_t __wasm_setjmp_test(jmp_buf env, void *func_invocation_id); +void __wasm_longjmp(jmp_buf env, int val); ``` `__wasm_setjmp` records the necessary data in the `env` so that it can be From 07dfc547d3e6d8648762b902ff5a2eebe27d0ef3 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Tue, 2 Apr 2024 10:59:01 +0900 Subject: [PATCH 06/25] note that the layout of the 6-word area is not public --- SetjmpLongjmp.md | 1 + 1 file changed, 1 insertion(+) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 1caded8..954485f 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -17,6 +17,7 @@ This convention uses a few structures on the WebAssembly linear memory. The first 6 words of C jmp_buf is reserved for the use by the runtime. It should also have large enough alignment to store C pointers. +The contents of this area is private to the runtime implementation. ##### Notes about the size of reserved area in jmp_buf From 871cab8befd86eea274d484de1518b5f8b6b4695 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:26:13 +0900 Subject: [PATCH 07/25] add links to runtime implementations --- SetjmpLongjmp.md | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 954485f..f60b31f 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -174,14 +174,19 @@ longjmp works. binaryen has a conversion from the old instructions to the latest instructions. (`try_table` etc.) -* Emscripten (TBD version) has the runtime support for the convention - documented above. It also supports a few traditional variations of the - setjmp/longjmp ABI. 
+* Emscripten (TBD version) has the runtime support ([emscripten_setjmp.c]) + for the convention documented above. + It also supports a few traditional variations of the setjmp/longjmp ABI. -* A PR to add the runtime support to wasi-libc is under review: https://github.com/WebAssembly/wasi-libc/pull/483 +* wasi-libc has the runtime support ([wasi-libc rt.c]) for the convention + documented above. [WebAssemblyLowerEmscriptenEHSjLj.cpp]: https://github.com/llvm/llvm-project/blob/70deb7bfe90af91c68454b70683fbe98feaea87d/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp +[emscripten_setjmp.c]: https://github.com/emscripten-core/emscripten/blob/7d66497d96cdcffa394ad67d87f7118137edf9ab/system/lib/compiler-rt/emscripten_setjmp.c + +[wasi-libc rt.c]: https://github.com/WebAssembly/wasi-libc/blob/d03829489904d38c624f6de9983190f1e5e7c9c5/libc-top-half/musl/src/setjmp/wasm32/rt.c + ## Future directions * `__WasmLongjmpArgs` can be replaced with WebAssembly multivalue. From 544f14815f258995635a22ba14750eb3ed7dd0cc Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:28:02 +0900 Subject: [PATCH 08/25] remove google doc reference --- SetjmpLongjmp.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index f60b31f..d3e2a36 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -193,7 +193,3 @@ longjmp works. * If/When WebAssembly exception gets more ubiquitous, we might want to move the runtime to compiler-rt. 
- -## References - -* [A discussion about the runtime ABI](https://docs.google.com/document/d/1ZvTPT36K5jjiedF8MCXbEmYjULJjI723aOAks1IdLLg/edit#heading=h.metx1fc16ots) From cf83c3ef5642ba3b5ba221eadb74bc7526b92545 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:31:30 +0900 Subject: [PATCH 09/25] remove a note about history --- SetjmpLongjmp.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index d3e2a36..e32eee8 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -165,10 +165,6 @@ longjmp works. the convertion mentioned above. It can be enabled with the `-mllvm -wasm-enable-sjlj` option. - Note: older LLVM versions have been using a slightly different runtime ABI, - which is supported by Emscripten. It has been switched to the ABI documented - above by https://github.com/llvm/llvm-project/pull/84137. - Note: as of writing this, LLVM produces a bit older version of exception-handling instructions. (`try`, `delegate`, etc) binaryen has a conversion from the old instructions to the latest From a9689fe13e75cca31e6e34d1761db4bbebb81160 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:32:11 +0900 Subject: [PATCH 10/25] be explicit about LLVM version --- SetjmpLongjmp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index e32eee8..abdf5cb 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -161,8 +161,8 @@ longjmp works. ## Implementations -* LLVM has a pass ([WebAssemblyLowerEmscriptenEHSjLj.cpp]) to perform - the convertion mentioned above. It can be enabled with the +* LLVM (19 and later) has a pass ([WebAssemblyLowerEmscriptenEHSjLj.cpp]) + to perform the conversion mentioned above. It can be enabled with the `-mllvm -wasm-enable-sjlj` option. 
Note: as of writing this, LLVM produces a bit older version of From 13891ab03ab859cb32cd0fa8f2bc14346bf07e0a Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:35:04 +0900 Subject: [PATCH 11/25] specify language code blocks for possibly better rendering --- SetjmpLongjmp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index abdf5cb..02039bf 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -105,7 +105,7 @@ the compiler generates instructions to catch and process the For an example, a C function like this would be converted like the following pseudo code. -``` +```c void f(void) { @@ -116,7 +116,7 @@ the following pseudo code. } ``` -``` +```wat $func_invocation_id = alloca(1) ;; 100 is a label generated by the compiler From c2feeb40db509d9bc31453fdb92053f8f1b595fd Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:40:28 +0900 Subject: [PATCH 12/25] specify language for a few more code blocks --- SetjmpLongjmp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 02039bf..d402f47 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -34,7 +34,7 @@ doesn't provide these builtins. An equivalent of the following structure is used to associate necessary data to the WebAssembly exception. -``` +```c struct __WasmLongjmpArgs { void *env; // a pointer to jmp_buf int val; @@ -61,7 +61,7 @@ on the linear memory. 
### functions -``` +```c void __wasm_setjmp(jmp_buf env, uint32_t label, void *func_invocation_id); uint32_t __wasm_setjmp_test(jmp_buf env, void *func_invocation_id); void __wasm_longjmp(jmp_buf env, int val); From 119de614806558a375aa2b0ac9a57a701588e4c2 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 18:41:57 +0900 Subject: [PATCH 13/25] fix indentation --- SetjmpLongjmp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index d402f47..161ffdd 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -136,8 +136,8 @@ the following pseudo code. $val = __WasmLongjmpArgs.val $label = $__wasm_setjmp_test($env, $func_invocation_id) if ($label == 0) { - ;; not for us. rethrow. - call $__wasm_longjmp($env, $val) + ;; not for us. rethrow. + call $__wasm_longjmp($env, $val) } ;; ours. ;; somehow jump to the block corresponding to the $label From a0cdee8a05d09bbac6a76ef46fed1302af04464c Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Wed, 10 Apr 2024 19:36:51 +0900 Subject: [PATCH 14/25] mentions emscriptes js exception --- SetjmpLongjmp.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 161ffdd..ce7196a 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -5,6 +5,9 @@ This document describes a convention to implement C setjmp/longjmp via [WebAssembly exception-handling proposal]. +This document also briefly mentions another convention based on JavaScript +exceptions. + [WebAssembly exception-handling proposal]: https://github.com/WebAssembly/exception-handling ## Runtime ABI @@ -159,6 +162,20 @@ longjmp works. [dynamic-linking]: DynamicLinking.md +## Emscripten JavaScript-based exceptions + +Emscripten has a mode to use JavaScript-based exceptions instead of +WebAssembly exceptions. In that mode, `emscripten_longjmp` function, +which throws a JavaScript exception, is used instead of `__wasm_longjmp`. 
+ +```c +void emscripten_longjmp(uintptr_t env, int val); +``` + +The compiler translates C function calls that might end up +calling `longjmp` to indirect calls via a JavaScript wrapper which +catches the JavaScript exception. + ## Implementations * LLVM (19 and later) has a pass ([WebAssemblyLowerEmscriptenEHSjLj.cpp]) @@ -172,7 +189,6 @@ longjmp works. * Emscripten (TBD version) has the runtime support ([emscripten_setjmp.c]) for the convention documented above. - It also supports a few traditional variations of the setjmp/longjmp ABI. * wasi-libc has the runtime support ([wasi-libc rt.c]) for the convention documented above. From 862948eed295dafed44abc329c94a357cfd297f0 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 11 Apr 2024 09:19:32 +0900 Subject: [PATCH 15/25] fill emscripten version --- SetjmpLongjmp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index ce7196a..67074a3 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -187,7 +187,7 @@ catches the JavaScript exception. binaryen has a conversion from the old instructions to the latest instructions. (`try_table` etc.) -* Emscripten (TBD version) has the runtime support ([emscripten_setjmp.c]) +* Emscripten (3.1.57 or later) has the runtime support ([emscripten_setjmp.c]) for the convention documented above. * wasi-libc has the runtime support ([wasi-libc rt.c]) for the convention From 1f28a42d68f8ddb2ef89043f2aac1c5bf8fbe30b Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 11 Apr 2024 10:00:35 +0900 Subject: [PATCH 16/25] add some headweaving about catching logic --- SetjmpLongjmp.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 67074a3..e646039 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -102,10 +102,25 @@ because the pointer is merely used as an identifier and never be dereferenced. 
Also, the compiler converts C `setjmp` calls to `__wasm_setjmp` calls. +For each setjmp callsite, the compiler allocates non-zero identifier called +"label". The label value passed to `__wasm_setjmp` is recorded by the +runtime and returned by later `__wasm_setjmp_test` when processing a longjmp +to the corresponding jmp_buf. + Also, for code blocks which possibly call `longjmp` directly or indirectly, the compiler generates instructions to catch and process the `__c_longjmp` exception accordingly. +When catching the exception, the compiler-generated logic calls +`___wasm_setjmp_test` to see if the exception is for this invocation +of this function. +If it is, `__wasm_setjmp_test` returns the non-zero label value recorded by +the last `__wasm_setjmp` call for the jmp_buf. The compiler-generated logic +can use the label value to pretend a return from the corresponding setjmp. +Otherwise, `__wasm_setjmp_test` returns 0. In that case, the +compiler-generated logic should rethrow the exception by calling +`__wasm_longjmp` so that it can be eventually caught by the right function. + For an example, a C function like this would be converted like the following pseudo code. ```c From ed850b7e0951f916784cf3f707dc4f35f946b64b Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 11 Apr 2024 12:44:26 +0900 Subject: [PATCH 17/25] fix a typo --- SetjmpLongjmp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index e646039..e7c1c03 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -112,7 +112,7 @@ the compiler generates instructions to catch and process the `__c_longjmp` exception accordingly. When catching the exception, the compiler-generated logic calls -`___wasm_setjmp_test` to see if the exception is for this invocation +`__wasm_setjmp_test` to see if the exception is for this invocation of this function. 
If it is, `__wasm_setjmp_test` returns the non-zero label value recorded by the last `__wasm_setjmp` call for the jmp_buf. The compiler-generated logic From 4a4d4480f6a9811090ebd20ee1575bd6359164e8 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 11 Apr 2024 16:53:23 +0900 Subject: [PATCH 18/25] use "..." instead of ":" to mean omission --- SetjmpLongjmp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index e7c1c03..ac3de81 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -159,8 +159,8 @@ the following pseudo code. } ;; ours. ;; somehow jump to the block corresponding to the $label - : - : + ... + ... end ``` From 54ca9fb5d9fcc4a1301aa7ffdc1290588a3ed519 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 11 Apr 2024 17:30:03 +0900 Subject: [PATCH 19/25] reduce indentation in wat --- SetjmpLongjmp.md | 50 ++++++++++++++++++++++++------------------------ 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index ac3de81..c93247b 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -135,33 +135,33 @@ the following pseudo code. ``` ```wat - $func_invocation_id = alloca(1) - - ;; 100 is a label generated by the compiler - call $__wasm_setjmp($env, 100, $func_invocation_id) - - block - block (result i32) - try_table (catch $__c_longjmp 0) - call $might_call_longjmp - end - ;; might_call_longjmp didn't call longjmp - br 1 + $func_invocation_id = alloca(1) + + ;; 100 is a label generated by the compiler + call $__wasm_setjmp($env, 100, $func_invocation_id) + + block + block (result i32) + try_table (catch $__c_longjmp 0) + call $might_call_longjmp end - ;; might_call_longjmp called longjmp - pop __WasmLongjmpArgs pointer from the operand stack - $env = __WasmLongjmpArgs.env - $val = __WasmLongjmpArgs.val - $label = $__wasm_setjmp_test($env, $func_invocation_id) - if ($label == 0) { - ;; not for us. rethrow. 
- call $__wasm_longjmp($env, $val) - } - ;; ours. - ;; somehow jump to the block corresponding to the $label - ... - ... + ;; might_call_longjmp didn't call longjmp + br 1 end + ;; might_call_longjmp called longjmp + pop __WasmLongjmpArgs pointer from the operand stack + $env = __WasmLongjmpArgs.env + $val = __WasmLongjmpArgs.val + $label = $__wasm_setjmp_test($env, $func_invocation_id) + if ($label == 0) { + ;; not for us. rethrow. + call $__wasm_longjmp($env, $val) + } + ;; ours. + ;; somehow jump to the block corresponding to the $label + ... + ... + end ``` ### longjmp calls From 6da62fc012f4e555faed55aa26773ce81cd868a0 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 11 Apr 2024 17:36:51 +0900 Subject: [PATCH 20/25] reformat C code with LLVM's .clang-format --- SetjmpLongjmp.md | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index c93247b..d27b71e 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -39,8 +39,8 @@ data to the WebAssembly exception. ```c struct __WasmLongjmpArgs { - void *env; // a pointer to jmp_buf - int val; + void *env; // a pointer to jmp_buf + int val; }; ``` @@ -124,14 +124,12 @@ compiler-generated logic should rethrow the exception by calling For an example, a C function like this would be converted like the following pseudo code. 
```c - void - f(void) - { - jmp_buf env; - if (!setjmp(env)) { - might_call_longjmp(env); - } - } +void f(void) { + jmp_buf env; + if (!setjmp(env)) { + might_call_longjmp(env); + } +} ``` ```wat From c91b637bee92978976c6497eb226d95b64403d87 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Fri, 12 Apr 2024 13:16:37 +0900 Subject: [PATCH 21/25] distinguish exception and tag a bit more clearly --- SetjmpLongjmp.md | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index d27b71e..fe61762 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -50,15 +50,17 @@ A runtime can use a part of `jmp_buf` for this structure. It's also ok to use a separate thread-local storage to place this structure. A runtime without multi-threading support can simply place this structure in a global variable. -### Exception +### Exception tag This convention uses a WebAssembly exception to perform a non-local jump for C `longjmp`. -The name of exception is `__c_longjmp`. -The type of exception is `(param i32)`. (Or, `(param i64)` for [memory64]) -The parameter of the exception is the address of `__WasmLongjmpArgs` structure -on the linear memory. +The exception is created with an exception tag named `__c_longjmp`. +The name is used for both of [static linking](Linking.md) and +[dynamic linking](DynamicLinking.md). +The type of exception tag is `(param i32)`. (Or, `(param i64)` for [memory64]) +The parameter is the address of the `__WasmLongjmpArgs` structure on the +linear memory. [memory64]: https://github.com/WebAssembly/memory64 @@ -85,7 +87,8 @@ saved by `__wasm_setjmp`. Otherwise, it returns 0. `__wasm_longjmp` is similar to C `longjmp`. If `val` is 0, it's `__wasm_longjmp`'s responsibility to convert it to 1. It performs a long jump by filling a `__WasmLongjmpArgs` structure and -throwing `__c_longjmp` exception with its address. +throwing an exception with its address. 
The exception is created with +the `__c_longjmp` exception tag. ## Code conversion @@ -108,8 +111,8 @@ runtime and returned by later `__wasm_setjmp_test` when processing a longjmp to the corresponding jmp_buf. Also, for code blocks which possibly call `longjmp` directly or indirectly, -the compiler generates instructions to catch and process the -`__c_longjmp` exception accordingly. +the compiler generates instructions to catch and process exceptions with +the `__c_longjmp` exception tag accordingly. When catching the exception, the compiler-generated logic calls `__wasm_setjmp_test` to see if the exception is for this invocation From 680e18f088d1b6a9ddb722e1fd2a41ab27fd8dac Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Fri, 12 Apr 2024 13:29:36 +0900 Subject: [PATCH 22/25] add a few ideas to the "Future directions" section --- SetjmpLongjmp.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index fe61762..156eb2a 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -219,5 +219,14 @@ catches the JavaScript exception. * `__WasmLongjmpArgs` can be replaced with WebAssembly multivalue. +* Or, alternatively, we can make `__wasm_setjmp_test` take the + `__WasmLongjmpArgs` pointer so that we can drop the `__WasmLongjmpArgs` + structure layout from the ABI. + +* It might be simpler for the compiler-generated catching logic to rethrow + the exception with the `rethrow`/`throw_ref` instruction instead of + calling `__wasm_longjmp`. Or, it might be simpler to make + `__wasm_setjmp_test` rethrow the exception internally. + * If/When WebAssembly exception gets more ubiquitous, we might want to move the runtime to compiler-rt. 
From 82a90db069e07c963bf4770169de67c2d8ad4fa1 Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Fri, 12 Apr 2024 14:31:41 +0900 Subject: [PATCH 23/25] make it clear what "words" are --- SetjmpLongjmp.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 156eb2a..f5abe1c 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -19,12 +19,15 @@ This convention uses a few structures on the WebAssembly linear memory. #### Reserved area in jmp_buf The first 6 words of C jmp_buf is reserved for the use by the runtime. -It should also have large enough alignment to store C pointers. -The contents of this area is private to the runtime implementation. +("words" here are C pointer types specified in the [C ABI].) +It should have large enough alignment to store C pointers. +The actual contents of this area is private to the runtime implementation. + +[C ABI]: BasicCABI.md ##### Notes about the size of reserved area in jmp_buf -Emscripten has been using 6 words. (`unsigned long [6]`) +Emscripten has been using 6 `unsigned long`s. (`unsigned long [6]`) GCC and Clang uses `intptr_t [5]` for their [setjmp/longjmp builtins]. It isn't relevant right now though, because LLVM's WebAssembly target From baa023e08b8d5a5b683bb9540d1198c9649efbba Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Fri, 12 Apr 2024 15:02:35 +0900 Subject: [PATCH 24/25] capitalize --- SetjmpLongjmp.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index f5abe1c..8d0c6f7 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -67,7 +67,7 @@ linear memory. 
[memory64]: https://github.com/WebAssembly/memory64 -### functions +### Functions ```c void __wasm_setjmp(jmp_buf env, uint32_t label, void *func_invocation_id); @@ -99,7 +99,7 @@ The C compiler detects `setjmp` and `longjmp` calls in a program and converts them into the corresponding WebAssembly exception-handling instructions and calls to the above mentioned runtime ABI. -### functions calling setjmp() +### Functions calling setjmp() On the function entry, the compiler would generate the logic to create the identifier of this function invocation, typically by performing an @@ -168,7 +168,7 @@ void f(void) { end ``` -### longjmp calls +### Longjmp calls The compiler converts C `longjmp` calls to `__wasm_longjmp` calls. From 8c2c54fd07c3ee179bb73871ac9214c79549205b Mon Sep 17 00:00:00 2001 From: YAMAMOTO Takashi Date: Thu, 25 Apr 2024 09:26:47 +0900 Subject: [PATCH 25/25] a gramatical fix --- SetjmpLongjmp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SetjmpLongjmp.md b/SetjmpLongjmp.md index 8d0c6f7..1495147 100644 --- a/SetjmpLongjmp.md +++ b/SetjmpLongjmp.md @@ -21,7 +21,7 @@ This convention uses a few structures on the WebAssembly linear memory. The first 6 words of C jmp_buf is reserved for the use by the runtime. ("words" here are C pointer types specified in the [C ABI].) It should have large enough alignment to store C pointers. -The actual contents of this area is private to the runtime implementation. +The actual contents of this area are private to the runtime implementation. [C ABI]: BasicCABI.md
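
A note on the runtime side of this series: the bookkeeping that `__wasm_setjmp`, `__wasm_setjmp_test` and `__wasm_longjmp` are described as doing can be sketched in plain C. This is only an illustration under stated assumptions, not the code of any of the linked runtimes. The `sketch_*` names and the layout of `struct sketch_jmp_buf` are hypothetical (the document says the contents of the reserved area are private to the runtime), and the step that throws an exception with the `__c_longjmp` tag is left out because it cannot be expressed in portable C.

```c
#include <stdint.h>

/* Mirrors the __WasmLongjmpArgs structure from the document. */
struct sketch_args {
  void *env; /* a pointer to jmp_buf */
  int val;
};

/* Hypothetical layout of the reserved area in jmp_buf.
   Real runtimes are free to lay this out differently. */
struct sketch_jmp_buf {
  void *func_invocation_id; /* which function invocation called setjmp */
  uint32_t label;           /* which setjmp call-site inside that function */
  struct sketch_args args;  /* storage reused during a single longjmp */
};

/* The assumed layout must fit in the 6 reserved pointer-sized words. */
_Static_assert(sizeof(struct sketch_jmp_buf) <= 6 * sizeof(void *),
               "sketch layout exceeds the reserved area");

/* Counterpart of __wasm_setjmp: record the call-site and the invocation. */
void sketch_setjmp(struct sketch_jmp_buf *env, uint32_t label,
                   void *func_invocation_id) {
  env->func_invocation_id = func_invocation_id;
  env->label = label;
}

/* Counterpart of __wasm_setjmp_test: return the saved label if the
   longjmp target belongs to this invocation, otherwise 0. */
uint32_t sketch_setjmp_test(const struct sketch_jmp_buf *env,
                            void *func_invocation_id) {
  return env->func_invocation_id == func_invocation_id ? env->label : 0;
}

/* First half of __wasm_longjmp: fill the args structure and convert a
   zero val to 1. A real runtime would then throw an exception carrying
   the returned address; that step is omitted in this sketch. */
struct sketch_args *sketch_longjmp_prepare(struct sketch_jmp_buf *env,
                                           int val) {
  env->args.env = env;
  env->args.val = (val == 0) ? 1 : val;
  return &env->args;
}
```

For instance, after `sketch_setjmp(&env, 100, &outer_id)`, testing against the address of a different local returns 0 (the "not for us, rethrow" case in the pseudo code), while testing against `&outer_id` returns 100. Placing the args structure inside the buffer is one of the options the document allows; thread-local or global storage would work equally well for the sketch.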