
GH-130396: Work around for broken pthread_get_stackaddr_np on Emscripten #131088


Merged


hoodmane
Contributor

@hoodmane hoodmane commented Mar 11, 2025

Emscripten reports having `pthread_get_stackaddr_np` but it does nothing. I opened an upstream PR to actually fill in the stack information: emscripten-core/emscripten#23887

Until we can update to an Emscripten version that includes this fix, we can work around it by using `#define`s to replace `pthread_get_stackaddr_np` with the modified version.
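For illustration, a minimal sketch of what that `#define` technique can look like, assuming Emscripten's `emscripten_stack_get_base()` from `<emscripten/stack.h>`; the wrapper name is hypothetical, and the actual replacement lives in this PR's diff:

```c
// Sketch only: illustrates the #define redirection technique; not the
// literal patch. Assumes <emscripten/stack.h> provides
// emscripten_stack_get_base(), which reports the high end (base) of the
// wasm spill stack for the current thread.
#ifdef __EMSCRIPTEN__
#include <emscripten/stack.h>
#include <pthread.h>

static void *
_Py_pthread_get_stackaddr_np(pthread_t thread)  // hypothetical name
{
    (void)thread;  // Emscripten can only report the current thread's stack
    return (void *)emscripten_stack_get_base();
}

// Redirect all later callers to the working replacement.
#define pthread_get_stackaddr_np _Py_pthread_get_stackaddr_np
#endif
```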

Contributor

@freakboy3742 freakboy3742 left a comment


The general approach makes sense to me - although I'd like confirmation from @markshannon that he's OK with this as well.

In terms of functionality: this definitely works a lot better - I'm not getting the `-sASSERTIONS` error we were seeing in our pairing session earlier this week. However, it's not passing a full test run for me, for reasons that seem related to (or at least intertwined with) this change:

$ ./python.sh -m test -v --rerun -W
...
0:02:30 load avg: 0.10 [250/485] test_ioctl skipped
0:02:30 load avg: 0.10 [251/485] test_ipaddress
0:02:30 load avg: 0.10 [251/485] test_ipaddress passed
0:02:30 load avg: 0.10 [252/485] test_isinstance
wasm://wasm/02937f2a:1


RangeError: Maximum call stack size exceeded
    at wasm://wasm/02937f2a:wasm-function[12560]:0x65658a
    at wasm://wasm/02937f2a:wasm-function[2087]:0x1ceea7
    at wasm://wasm/02937f2a:wasm-function[2128]:0x1d01e1
    at wasm://wasm/02937f2a:wasm-function[3676]:0x2b02da
    at wasm://wasm/02937f2a:wasm-function[3677]:0x2b03c7
    at wasm://wasm/02937f2a:wasm-function[2289]:0x1db42d
    at wasm://wasm/02937f2a:wasm-function[3290]:0x26f6ef
    at wasm://wasm/02937f2a:wasm-function[3291]:0x27f317
    at wasm://wasm/02937f2a:wasm-function[760]:0x16d46b
    at wasm://wasm/02937f2a:wasm-function[759]:0x16d397

Node.js v20.17.0

It happens a lot earlier in the test run if you use `-uall` (as we would be doing in a CI config):

$ ./python.sh -m test -v -uall --rerun -W
...
0:00:22 load avg: 0.10 [ 98/485] test_build_details
0:00:22 load avg: 0.10 [ 98/485] test_build_details passed
0:00:22 load avg: 0.10 [ 99/485] test_builtin
wasm://wasm/02937f2a:1


RangeError: Maximum call stack size exceeded
    at wasm://wasm/02937f2a:wasm-function[2011]:0x1c23ef
    at wasm://wasm/02937f2a:wasm-function[3218]:0x266cf6
    at wasm://wasm/02937f2a:wasm-function[2011]:0x1c241d
    at wasm://wasm/02937f2a:wasm-function[3218]:0x266cf6
    at wasm://wasm/02937f2a:wasm-function[2011]:0x1c241d
    at wasm://wasm/02937f2a:wasm-function[3218]:0x266cf6
    at wasm://wasm/02937f2a:wasm-function[2011]:0x1c241d
    at wasm://wasm/02937f2a:wasm-function[3218]:0x266cf6
    at wasm://wasm/02937f2a:wasm-function[2011]:0x1c241d
    at wasm://wasm/02937f2a:wasm-function[3218]:0x266cf6

Node.js v20.17.0

@bedevere-app

bedevere-app bot commented Mar 12, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase `I have made the requested changes; please review again`. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@hoodmane
Contributor Author

Pyodide gives much better stack traces on this sort of failure. I think one thing it does is call _Py_DumpTraceback(). I'll try to add that logic.
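A rough sketch of that idea, assuming a handler can run at the point where the fatal wasm trap is caught; `_Py_DumpTraceback()` is CPython's existing async-signal-safe traceback dumper, while the hook function and its wiring are hypothetical:

```c
// Sketch only: how one might dump the Python traceback when a wasm trap
// (e.g. "Maximum call stack size exceeded") takes the process down.
#ifndef Py_BUILD_CORE
#  define Py_BUILD_CORE
#endif
#include "Python.h"
#include "pycore_traceback.h"  // _Py_DumpTraceback(); internal header

static void
dump_python_stack_on_trap(void)  // hypothetical hook
{
    PyThreadState *tstate = PyGILState_GetThisThreadState();
    if (tstate != NULL) {
        // fd 2 == stderr; safe to call without touching the Python heap.
        _Py_DumpTraceback(2, tstate);
    }
}
```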

@hoodmane
Contributor Author

When I ran the tests locally with this fix I think they all passed, though I didn't pass -uall.

@freakboy3742
Contributor

> When I ran the tests locally with this fix I think they all passed, though I didn't pass -uall.

I've been able to reproduce this failure locally (on macOS) and on the buildbot machine (linux).

@hoodmane
Contributor Author

Indeed, I can reproduce it too. Maybe let's just add `@support.skip_emscripten_stack_overflow()` to `test_infinite_cycle_in_bases` for now; it's only one more recursion test that is broken...

@freakboy3742
Contributor

With the fix from #131158 applied, running on macOS, the test suite completes, but I get three test failures (test_os, test_shutil, and test_unicode_file_functions). However, they all appear to be due to filesystem access issues, unrelated to this PR.

I'm also seeing failures with `-uall` in test_builtin, test_capi, and test_descr; there are probably others, but that's as far as I got by manually disabling individual tests with `-x`. However, as with the test_isinstance failure, I get the impression that these are more traditional stack overflow issues that need `@support.skip_emscripten_stack_overflow()` markers (some of them already have `skip_wasi_stack_overflow`). We can clean up those and the filesystem tests in a separate PR.

The new stack traces are really pretty though 😄

So - I'm happy to sign off on this; once we've got confirmation from @markshannon that he doesn't object to the approach, we can merge and we're one step closer to a buildbot!

@hoodmane
Contributor Author

The fix for this just barely missed inclusion into Emscripten 4.0.5 but it will definitely be in 4.0.6.

@markshannon
Member

This approach seems fine to me.

Protecting against stack overflow is very much a "practicality beats purity" thing.
We want extremely few, ideally no, cases where the VM crashes, and within that constraint we want to maximize the usable stack.

I used `pthread_get_stackaddr_np` because it happens to work for Linux and similar systems. If there is something else that works better for the WebAssembly platforms, please feel free to use that.
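For context, a minimal sketch of the general pattern under discussion, assuming a platform that provides both `pthread_get_stackaddr_np()` and `pthread_get_stacksize_np()` and a downward-growing stack; the helper name and margin value are illustrative, not CPython's actual code:

```c
// Sketch of deriving a soft stack limit for overflow checks.
// Assumes pthread_get_stackaddr_np() returns the high end (base)
// of a downward-growing stack.
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static uintptr_t
soft_stack_limit(void)  // hypothetical helper
{
    pthread_t self = pthread_self();
    uintptr_t base = (uintptr_t)pthread_get_stackaddr_np(self);
    size_t size = pthread_get_stacksize_np(self);
    size_t margin = size / 8;  // illustrative safety margin
    // Addresses below this are treated as "too deep".
    return base - size + margin;
}

// Typical check at a recursion point:
//   char local;
//   if ((uintptr_t)&local < soft_stack_limit()) { /* raise RecursionError */ }
```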

@hoodmane
Contributor Author

The goal of Emscripten is to behave as much like Linux as possible, subject to the architecture and platform constraints and the desire to minimize code size. So in cases like this, finding discrepancies with normal Linux behavior and resolving them in Emscripten is ideal.

If there were a cheap way to ask about the true stack in addition to the spill stack, that would be great, but as it is, at least we can ask about the spill stack correctly. Maybe now, if we set the size of the spill stack to be fairly small, the stack checks will work a bit more often.
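For reference, Emscripten's spill stack size is a link-time setting; a hedged example of shrinking it, where the value and file names are illustrative only:

```
emcc main.o -o python.mjs -sSTACK_SIZE=524288   # illustrative 512 KiB spill stack
```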

@freakboy3742 freakboy3742 merged commit f3e275f into python:main Mar 13, 2025
47 checks passed
@hoodmane hoodmane deleted the emscripten-pthread_get_stackaddr_np branch March 14, 2025 09:24
plashchynski pushed a commit to plashchynski/cpython that referenced this pull request Mar 17, 2025
GH-130396: Work around for broken pthread_get_stackaddr_np on Emscripten (python#131088)

Implements a workaround implementation of `pthread_get_stackaddr_np` for Emscripten.
This will be replaced by an implementation that will be included in Emscripten 4.0.6.
seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025
GH-130396: Work around for broken pthread_get_stackaddr_np on Emscripten (python#131088)

Implements a workaround implementation of `pthread_get_stackaddr_np` for Emscripten.
This will be replaced by an implementation that will be included in Emscripten 4.0.6.