Add pthread based worker custom message support #16239

chouquette · 2022-02-10T14:01:50Z

Hi,

This MR adds support for handling custom messages in pthread-based workers, similar to https://emscripten.org/docs/api_reference/module.html#Module.onCustomMessage, but available even when building without PROXY_TO_WORKER.

kripken · 2022-02-14T19:30:37Z

I think this might be reasonable to add, especially since we've had something similar in another mode. @sbc100 what do you think?

If we decide to go with this, please update the docs under site/ in the location that you linked to in module.html.

sbc100

I like the idea in general! I wonder about the naming though. Is onCustomMessage the best name? Do we have existing precedent for using this naming convention? I don't know that I have any great alternatives... maybe onPostMessage or onUserMessage (like the void* user_data from the C world)?

sbc100 · 2022-02-14T20:07:23Z

src/library_pthread.js

+          if (Module['onCustomMessage']) {
+            Module['onCustomMessage'](d);
+          } else {
+            throw 'Custom message received but worker Module.onCustomMessage not implemented.';


How about:

    #if ASSERTIONS
    assert(Module['onCustomMessage'], 'Custom message received but worker Module.onCustomMessage not defined.');
    #endif
    Module['onCustomMessage'](d);

Done for both occurrences

sbc100 · 2022-02-14T20:08:50Z

tests/custom_messages_worker_shell.html

+    </script>
+    {{{ SCRIPT }}}
+  </body>
+</html>


I don't think we should need to add this html file .. I think we can just set Module.onCustomMessage in a pre-js, no? (that way the test will run on node too).

Good point. I got that working for a browser test, but I suppose it shouldn't be a browser test if it's to be run with node?

Is core OK or should it be an 'other' test? (Or something else entirely?)

I replaced the shell file with a --pre-js for now, let me know if I should move the test to another suite

sbc100 · 2022-02-14T20:12:42Z

tests/custom_messages_worker.c

+
+EM_JS(void, run_test, (), {
+   function sendMessageToMainThread(cmd, payload) {
+     self.postMessage({


What is self here?

The main worker (I believe?), but it appears to be useless, so let's remove it

chouquette · 2022-02-18T10:32:04Z

I wonder about the naming though

I'm fine with onUserMessage :) The main reason I went with onCustomMessage was that it already existed, though only for some configurations, and I didn't see a good reason not to keep the same name.

chouquette · 2022-03-02T08:10:59Z

Hi,

Gentle ping on this PR, please let me know if you want the naming, or anything else, to change :)

sbc100 · 2022-03-02T21:55:13Z

src/library_pthread.js

@@ -274,6 +274,11 @@ var LibraryPThread = {
          if (Module['onAbort']) {
            Module['onAbort'](d['arg']);
          }
+        } else if (cmd === 'custom') {


Can you wrap this new block in #if expectToReceiveOnModule('onCustomMessage')

sbc100 · 2022-03-02T21:55:25Z

src/worker.js

@@ -293,6 +293,11 @@ self.onmessage = (e) => {
      if (Module['_pthread_self']()) { // If this thread is actually running?
        Module['_emscripten_proxy_execute_queue'](e.data.queue);
      }
+    } else if (e.data.cmd === 'custom') {


And this one.

sbc100 · 2022-03-02T21:58:02Z

tests/test_browser.py

@@ -4263,6 +4263,10 @@ def test_canvas_size_proxy(self):
  def test_custom_messages_proxy(self):
    self.btest(test_file('custom_messages_proxy.c'), expected='1', args=['--proxy-to-worker', '--shell-file', test_file('custom_messages_proxy_shell.html'), '--post-js', test_file('custom_messages_proxy_postjs.js')])

+  @requires_threads
+  def test_custom_message_worker(self):
+    self.btest(test_file('custom_messages_worker.c'), expected='1', args=['-sUSE_PTHREADS', '-sPTHREAD_POOL_SIZE=2', '--pre-js', test_file('custom_messages_worker_pre.js')])


I don't think you need a pthread pool here do you? Can you remove the PTHREAD_POOL_SIZE setting?

Can you use btest_exit here rather than btest (and remove the expected argument which will default to 0).

Can you also run this test under node in test_other.py?

sbc100 · 2022-03-02T21:59:14Z

tests/test_browser.py

@@ -4263,6 +4263,10 @@ def test_canvas_size_proxy(self):
  def test_custom_messages_proxy(self):
    self.btest(test_file('custom_messages_proxy.c'), expected='1', args=['--proxy-to-worker', '--shell-file', test_file('custom_messages_proxy_shell.html'), '--post-js', test_file('custom_messages_proxy_postjs.js')])

+  @requires_threads
+  def test_custom_message_worker(self):
+    self.btest(test_file('custom_messages_worker.c'), expected='1', args=['-sUSE_PTHREADS', '-sPTHREAD_POOL_SIZE=2', '--pre-js', test_file('custom_messages_worker_pre.js')])


Should we call this custom_message_pthread? Does that better describe what it's testing?

sbc100 · 2022-03-02T22:01:00Z

site/source/docs/api_reference/module.rst

@@ -163,3 +163,5 @@ Other methods

  When compiled with ``PROXY_TO_WORKER = 1`` (see `settings.js <https://github.com/emscripten-core/emscripten/blob/main/src/settings.js>`_), this callback (which should be implemented on both the client and worker's ``Module`` object) allows sending custom messages and data between the web worker and the main thread (using the ``postCustomMessage`` function defined in `proxyClient.js <https://github.com/emscripten-core/emscripten/blob/main/src/proxyClient.js>`_ and `proxyWorker.js <https://github.com/emscripten-core/emscripten/blob/main/src/proxyWorker.js>`_).

+  When compiled with ``USE_PTHREADS = 1`` (see `settings.js <https://github.com/emscripten-core/emscripten/blob/main/src/settings.js>`_), this callback will be invoked when a message containing the command ``custom`` is received. It allows to send messages back and forth between workers and the main thread using the ``Worker.postMessage`` function.


Drop the = 1 here.. it's not needed.

juj · 2022-03-04T15:23:28Z

Apologies, but I feel strongly against merging this.

The issue here is that layering new functionality on Module does not scale well, and does not DCE at all. There is a lot of bad history in Emscripten design from its early days that used Module as a general hub for sharing information across random places, and that is what has led a lot of people to vocally complain that Emscripten is complex and bloated.

There are a number of blog posts that have found themselves wanting to ridicule Emscripten for the fact that the tiniest "hello world" printf apps produce large output code sizes, so there has been a lot of work going in to ensure that what people perceive as bloat is being actively reduced.

Because of that, I don't think we should have this kind of addition merged in, since it increases code size for all pthreads users. While the code size increase is "just linear" in the number of bytes added, the cognitive load to read the build output increases superlinearly really fast.

I would recommend instead adopting an approach of using existing library functions. We already have a number of different APIs for proxying and sending messages between Workers, couldn't one of those be used instead?

juj · 2022-03-04T15:25:39Z

As for the existing Module.onCustomMessage API - that would be good to go the route of deprecation in the future, for the same reasons.

sbc100 · 2022-03-04T15:33:15Z

We recently developed an approach that allows us to extend/use the incoming module API in an opt-in way that doesn't bloat the code for users unless they explicitly opt into it.

The technique was enabled by this change: #16346

And first used here: #16361

This means that we should never need to increase the default ALL_INCOMING_MODULE_JS_API .. and in fact we can probably shrink it over time to reduce the default code size.

Given that we have this opt-in mechanism now, I think these kinds of changes are a lot more acceptable.

We can have a separate debate about "should we allow users to hook directly into the postMessage loop".. but there is no (default) code size bloat associated with this change if we decide the answer is yes.

sbc100 · 2022-03-04T15:34:49Z

BTW, I totally agree that these kinds of changes are not acceptable if they increase the code size by default.

juj · 2022-03-04T15:52:40Z

We can have a separate debate about "should we allow users to hook directly into the postMessage loop"

It should always have been the case that people can directly inject their own postMessage events. All the message event handlers that Emscripten Worker-based APIs have should use their own dedicated message detection mechanism to play nice with custom user submitted events. I think this is still true with all the APIs.

One thing in particular is that Emscripten should not be assigning worker.onmessage or self.onmessage, but should instead call .addEventListener('message', ...), so that if the user has existing JS code that expects to own the .onmessage variable, it can do so without issues.

Restricting users from being able to submit custom postMessages would be limiting from a site extensibility viewpoint.
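
To illustrate the `.onmessage` versus `.addEventListener('message', ...)` point above, here is a minimal sketch that runs under Node. `FakeWorker` is a hypothetical stand-in built on `EventTarget`; it is not Emscripten code, and a real `Worker` already behaves this way. It only shows that a listener added with `addEventListener` keeps working when other code later assigns `onmessage`, while a handler stored in `onmessage` is replaced by the next assignment.

```javascript
// Hypothetical stand-in for a Worker: an EventTarget with an `onmessage`
// property that holds a single handler, mirroring how the DOM property works.
class FakeWorker extends EventTarget {
  constructor() {
    super();
    this._handler = null;
  }

  get onmessage() {
    return this._handler;
  }

  // Assigning `onmessage` replaces whatever handler was stored there before.
  set onmessage(fn) {
    if (this._handler) this.removeEventListener('message', this._handler);
    this._handler = fn;
    if (fn) this.addEventListener('message', fn);
  }

  // Simulate a message arriving from the other side.
  deliver(data) {
    const event = new Event('message');
    event.data = data; // a plain Event is used here; real code receives a MessageEvent
    this.dispatchEvent(event);
  }
}

// Case 1: the runtime uses addEventListener, user code owns `onmessage`.
const friendly = new FakeWorker();
const friendlyLog = [];
friendly.addEventListener('message', (e) => friendlyLog.push('runtime:' + e.data.text));
friendly.onmessage = (e) => friendlyLog.push('user:' + e.data.text);
friendly.deliver({ text: 'hi' });

// Case 2: the runtime assigns `onmessage`, then user code assigns it too.
const clobbered = new FakeWorker();
const clobberedLog = [];
clobbered.onmessage = (e) => clobberedLog.push('runtime:' + e.data.text);
clobbered.onmessage = (e) => clobberedLog.push('user:' + e.data.text);
clobbered.deliver({ text: 'hi' });

console.log(friendlyLog);  // both handlers ran
console.log(clobberedLog); // only the handler assigned last ran
```

In the second case the runtime's handler is silently lost, which is the conflict that using `addEventListener` avoids.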

We recently developed an approach that allows us to extend/use the incoming module API in an opt-in way that doesn't bloat the code for users unless they explicitly opt into it.

I do recall that, and I don't think it is the best solution tbh. It fixes complexity by adding more complexity. While it does fix the final build output in terms of code size, it does so by making Emscripten harder to use (a new INCOMING_MODULE_JS_API setting to have to worry about), and the source files (library_pthread.js, worker.js) still have the code complexity.

(Though now that I read this, I think the code size here does grow, and it is not adhering to the setting in INCOMING_MODULE_JS_API)

Note that I hope I am not setting up a double standard: I do also leverage custom -s settings like this, e.g. the upcoming WASM_WORKERS_NO_TLS whenever I need to add things that don't DCE well otherwise.

However I think the critical difference here is that such settings should be introduced only when we realize there is no other way to get to emit the code/feature otherwise. If that is the case, then I think the complexity is warranted. However in this case I think this use case can be solved with existing JS and C/C++ library functions without needing to add non-DCEing functionality?

(Or if not, my apologies, but in that case, I hope we can look a bit more in detail about the specific use case to see why the existing message passing library functions will not cut it)

sbc100 · 2022-03-04T15:58:20Z

(Though now that I read this, I think the code size here does grow, and it is not adhering to the setting in INCOMING_MODULE_JS_API)

See #16239 (comment). I was not planning on having this land without that change.

sbc100 · 2022-03-04T16:01:10Z

Regarding the issue at hand, the ability to receive custom messages on a worker, I didn't know about addEventListener('message', ...). If that works, it could indeed mean that this change is not needed. We should add a test to ensure it does.

chouquette · 2022-03-08T10:10:50Z

Hi and sorry about the bit of delay.

Indeed the addEventListener('message', ...) way is working, and should be the correct one since it requires less intrusive changes, however the problem with that approach is that the main message handler will trigger the

        else {
          err("worker sent an unknown command " + cmd);
        }

path. I'm not entirely at ease with removing the error in case of an unknown message, and moving the error into a build-setting-dependent block doesn't seem too user friendly.

I'm unsure what's the way to go from here, but it seems that this MR should be closed as most of its code will be removed anyway

juj · 2022-03-08T10:46:16Z

Indeed the addEventListener('message', ...) way is working

Great, that's good to hear!

however the problem with that approach is that the main message handler will trigger the

Oops, that looks like a bug.. the error message should only trigger if receiving a message that looks like it should be handled by the library_pthread.js message listener. Posted #16450 to fix that. Does that help?
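
A minimal sketch of the behaviour described here, not the actual patch in #16450: the handler reports an error only for messages that look like they were meant for it, and silently ignores everything else so that user listeners can deal with those. Using the presence of a `cmd` field as the "looks like ours" test is an assumption for illustration, as are the function name, the command list, and the return values.

```javascript
// Hypothetical command names; the real handler recognises its own set.
const KNOWN_COMMANDS = new Set(['loaded', 'exit', 'print']);

// Returns 'handled', 'unknown' or 'ignored'. `reportError` stands in for err().
function classifyMessage(data, reportError) {
  const cmd = (data !== null && typeof data === 'object') ? data.cmd : undefined;
  if (cmd === undefined) {
    // Assumption: a message without `cmd` is not addressed to the runtime,
    // so it is left for user-installed listeners.
    return 'ignored';
  }
  if (KNOWN_COMMANDS.has(cmd)) {
    return 'handled';
  }
  // The message looks like a runtime message but the command is not recognised.
  reportError('worker sent an unknown command ' + cmd);
  return 'unknown';
}

const errors = [];
const results = [
  classifyMessage({ cmd: 'loaded' }, (m) => errors.push(m)),
  classifyMessage({ cmd: 'bogus' }, (m) => errors.push(m)),
  classifyMessage({ payload: [1, 2, 3] }, (m) => errors.push(m)),
];
console.log(results, errors);
```

Under this assumption, a user message posted without a `cmd` field never reaches the error path, while a misspelled runtime command is still reported.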

sbc100 · 2022-03-08T19:55:49Z

However I think the critical difference here is that such settings should be introduced only when we realize there is no other way to get to emit the code/feature otherwise. If that is the case, then I think the complexity is warranted. However in this case I think this use case can be solved with existing JS and C/C++ library functions without needing to add non-DCEing functionality?

I agree that using INCOMING_MODULE_JS_API should be a last resort. If there is a better/easier way to inject the customization I'm all for it.

How do you envisage a user calling .addEventListener('message', ...), though? Do we want to recommend that folks use the mappings in library_pthread.js to look up and manipulate the worker objects that back the pthreads? I was hoping we could consider those details internal. Perhaps we should have a supported API for getting access to the worker that is running a given pthread?

chouquette · 2022-03-09T13:02:37Z

Posted #16450 to fix that. Does that help?

Apparently yes! Thanks

However, I spoke too soon when I said that addEventListener is working. I can add some additional listeners from pthread, but I failed to add an extra listener for the main thread.

Using postMessage from a pthread invokes the handler defined in library_pthread.js correctly, but I didn't manage to invoke any custom handler. My understanding of JavaScript might be the issue here though 😅

AFAIU I should add the event handler to the Worker instance that represents the main thread but I'm a bit confused there, in my case the main thread isn't supposed to be a pthread (I don't build with PROXY_TO_WORKER), yet messages sent from a worker/pthread appear to be received in the main thread (I can list all other running, which if I understood correctly denotes that the code is running in the main thread)

If I add a listener through the window object, it doesn't receive any messages sent by workers. I'm not sure what I'm missing but I could definitely use some help. (in the event this would be easier in a real time conversation I'm present on your discord server using the same nick)

sbc100 · 2022-03-09T15:17:54Z

Posted #16450 to fix that. Does that help?

Apparently yes! Thanks

However, I spoke too soon when I said that addEventListener is working. I can add some additional listeners from pthread, but I failed to add an extra listener for the main thread.

Using postMessage from a pthread invokes the handler defined in library_pthread.js correctly, but I didn't manage to invoke any custom handler. My understanding of JavaScript might be the issue here though 😅

AFAIU I should add the event handler to the Worker instance that represents the main thread but I'm a bit confused there, in my case the main thread isn't supposed to be a pthread (I don't build with PROXY_TO_WORKER), yet messages sent from a worker/pthread appear to be received in the main thread (I can list all other running, which if I understood correctly denotes that the code is running in the main thread)

If I add a listener through the window object, it doesn't receive any messages sent by workers. I'm not sure what I'm missing but I could definitely use some help. (in the event this would be easier in a real time conversation I'm present on your discord server using the same nick)

I think you would need to somehow attach your event handler to each new worker object that gets created. These event listeners are added in library_pthread.js. See loadWasmModuleToWorker. The question I have is how best to inject your extra handler .. or add it to workers as they are created.

chouquette · 2022-03-09T15:43:18Z

I think you would need to somehow attach your event handler to each new worker object that gets created.

This should work indeed, but it would still require the user to be able to inject their handler into emscripten somehow, no? I was hoping to achieve something less intrusive through addEventListener, ideally without modifying emscripten.

To put it another way, I don't really see the difference between the original attempt in this MR and exposing another event handler to all workers. I'll do another pass at the previous comments with a rested head tomorrow as I might have missed something

sbc100 · 2022-03-09T15:50:04Z

I think you would need to somehow attach your event handler to each new worker object that gets created.

This should work indeed, but it would still require the user to be able to inject their handler into emscripten somehow, no? I was hoping to achieve something less intrusive through addEventListener, ideally without modifying emscripten.

Yes, according to the discussion happening over on #16450 it should be possible for you to call addEventListener on new workers as they are created. As of today I think you would need to do something like PThread.pthreads[pthread_ptr].worker to get access to the worker of a given thread.. with that you should be able to do addEventListener?

chouquette · 2022-03-09T16:40:55Z

That was my initial attempt, but so far Module.PThread.pthreads[Module._pthread_self()]; yields undefined from the main thread.

From other threads that's not an issue though

sbc100 · 2022-03-09T17:16:07Z

That was my initial attempt, but so far Module.PThread.pthreads[Module._pthread_self()]; yields undefined from the main thread.

Yes that is expected, the handler would only need to be installed from the main thread and on the workers it owns.

chouquette · 2022-03-10T08:02:10Z

Doesn't that mean that it's not possible to add a message handler for the main thread?

To try and clarify, I'm trying to send a message from a worker to the main thread in order to transfer some objects.

The message is sent & correctly received by the worker.onmessage handler from the main thread, but any handler I register from the HTML page is not invoked, while I was able to do so using cmd: custom and registering a handler in Module['onCustomMessage']

Again, sorry if I'm missing something

chouquette · 2022-03-10T15:11:29Z

Oh I think I'm starting to understand my confusion, feel free to ignore my last comment for the time being, sorry about that

chouquette · 2022-03-11T13:47:08Z

I can now confirm that this patchset is not required.

For the sake of explaining my confusion, should it be useful to anyone else, my main mistake was to assume that worker.postMessage would cause the handler to be executed in the main thread but with a different worker object (i.e. not the worker that addEventListener would have been invoked on). This is why I was struggling to find a Worker instance that I could invoke addEventListener on.

However, in reality, the event handler is invoked on the same worker object, but from a different thread. This means that when the handler is invoked, it can access things that are only accessible from the main thread, so it's fairly easy to find the correct worker object knowing the thread ID.

TL;DR I can do what I want with this kind of code:

    MAIN_THREAD_EM_ASM({
        Module.PThread.pthreads[$0].worker.addEventListener('message', function (e) {
            // handle the event in the main thread context
        });
    }, pthread_self());

and later on invoke the handler through the usual postMessage
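
The lookup can also be modelled outside Emscripten. The sketch below is a simulation only: `pthreads` is a hypothetical stand-in for the `Module.PThread.pthreads` table used in the snippet above, plain `EventTarget` objects stand in for real `Worker` instances, and `spawnThread` and `receiveFromThread` are invented helpers. It shows the main thread finding the worker object by thread ID, registering a listener on it, and that listener running with access to main-thread state.

```javascript
// Hypothetical stand-in for Module.PThread.pthreads: thread ID -> { worker }.
const pthreads = {};

// Invented helper: register a worker-like object for a thread ID.
function spawnThread(threadId) {
  pthreads[threadId] = { worker: new EventTarget() };
  return threadId;
}

// Invented helper: simulate the main thread receiving a message that the
// thread with this ID posted.
function receiveFromThread(threadId, data) {
  const event = new Event('message');
  event.data = data;
  pthreads[threadId].worker.dispatchEvent(event);
}

// State that only the main thread owns.
const mainThreadState = { received: [] };

const tid = spawnThread(42);

// Counterpart of the addEventListener call in the snippet above.
pthreads[tid].worker.addEventListener('message', (e) => {
  mainThreadState.received.push(e.data);
});

receiveFromThread(tid, { kind: 'frame', size: 1024 });
console.log(mainThreadState.received);
```

The listener is attached to the worker object that the main thread holds for that thread ID, which is why it can touch `mainThreadState` directly.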

I hope this makes sense and can help someone struggling to get started with emscripten as I am 😅

Thanks a lot for your help and time with this! I'll now close the MR

sbc100 reviewed Feb 14, 2022

chouquette added 5 commits February 16, 2022 14:41

Add onCustomMessage support

3f2b10d

tests: Add a worker customMessage test

269d4ca

Use assertions rather than explicit if/else and throw

84b3fc1

Remove useless self. in function invocation

380aa8a

Use a pre-js instead of a shell file

e8cbca2

chouquette force-pushed the add_worker_on_custom_message branch from 33c5e99 to e8cbca2 Compare February 18, 2022 10:28

documentation draft

fa5a375

sbc100 reviewed Mar 2, 2022

juj added a commit to juj/emscripten that referenced this pull request Mar 8, 2022

Fix library_pthread.js onmessage error detection. emscripten-core#162…

16ece82

…39 (comment)

juj mentioned this pull request Mar 8, 2022

Fix library_pthread.js onmessage error detection #16450

Merged

juj added a commit that referenced this pull request Mar 9, 2022

Fix library_pthread.js onmessage error detection. #16239 (comment) (#…

504f4e0

…16450)

chouquette mentioned this pull request Mar 10, 2022

workers: Use addEventListener instead of onmessage #16461

Closed

chouquette closed this Mar 11, 2022

sbc100 mentioned this pull request Jun 15, 2022

--shared-memory is disallowed when using -sSHARED_MEMORY=1 compiling C++ code #17213

Open
