Find the initial sbrk value (== base of the stack) in C code? #12037

Closed
kripken opened this issue Aug 25, 2020 · 12 comments
@kripken
Member

kripken commented Aug 25, 2020

Background: One of the tasks for WebAssembly/binaryen#3043 is to stop the special handling of sbrk (DYNAMICTOP_PTR, emscripten_get_sbrk_ptr, etc.) which requires a bunch of work after link. To replace all that, we should implement sbrk 100% in wasm. I have a prototype mostly working in a branch, but have hit the following problem.

The problem: sbrk(0) at program start should return the initial location of the program break, that is, the top of the memory used by malloc. In emscripten we have static allocations first, then the stack, and then that region. sbrk.c needs to initialize itself to that value.

It seems there isn't a good way to do this atm? stack_ops.s calls out to emscripten_stack_get_base in JS to do it, where we have STACK_BASE. In general though it would be smaller and better to avoid JS for this. And embedding it in the wasm manually (which we do for emscripten_get_sbrk_ptr) requires work after link, which is what I'm trying to remove.

This seems like the kind of thing a special intrinsic could do, but it would need to be implemented by the linker (are there such things?)

cc @sbc100 @tlively @aheejin @dschuff @aardappel

@kripken
Member Author

kripken commented Aug 25, 2020

It appears __heap_base may be what I want.

@dschuff
Member

dschuff commented Aug 25, 2020

In ELF (and IIRC in wasm) the linker generates symbols that point to the beginning and end of sections, e.g. __start_foo and __stop_foo for a section named foo. Does the beginning of the break line up with the start or end of a section e.g. .data?
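For readers who haven't used the section boundary symbols mentioned here, a minimal sketch follows. It assumes an ELF target with GCC or Clang and a GNU-compatible linker, which define `__start_foo` and `__stop_foo` when a section name is a valid C identifier. The section name `foo`, the variables, and `foo_count` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two ints placed in a custom section named "foo" (hypothetical name).
   "used" keeps the compiler from discarding them as unreferenced. */
__attribute__((section("foo"), used)) static int first_entry = 1;
__attribute__((section("foo"), used)) static int second_entry = 2;

/* Defined by the linker, not by any source file: they mark the first
   byte of section "foo" and the first byte past its end. */
extern int __start_foo[];
extern int __stop_foo[];

/* Number of int-sized slots between the two boundary symbols. */
size_t foo_count(void) {
  uintptr_t begin = (uintptr_t)__start_foo;
  uintptr_t end = (uintptr_t)__stop_foo;
  assert(end >= begin);
  return (size_t)(end - begin) / sizeof(int);
}
```

This only helps when the address you want coincides with a section boundary, which is the question being asked above.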

@dschuff
Member

dschuff commented Aug 25, 2020

oh yeah __heap_base is even better.

@kripken
Member Author

kripken commented Aug 25, 2020

Ok, yeah, __heap_base seems like the way to go, it works in my testing.

(I don't think there's another way to get to that location - it's after the stack, but that isn't a wasm section.)

Sorry for the wide cc - I initially thought this was much harder than it actually is, but I just wasn't searching properly for __heap_base apparently...
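To make the `__heap_base` approach concrete, here is a rough sketch of a bump-style sbrk that seeds itself from that symbol. It is not the actual sbrk.c from the branch. `demo_sbrk`, `INITIAL_BREAK`, and the native `fake_heap` fallback are hypothetical names added so the sketch also builds outside wasm, and it leaves out memory growth and out-of-memory handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __wasm__
/* Defined by wasm-ld. Only the symbol's address is meaningful. */
extern unsigned char __heap_base;
#define INITIAL_BREAK ((uintptr_t)&__heap_base)
#else
/* Assumption for illustration: a stand-in region for native builds. */
static unsigned char fake_heap[1 << 16];
#define INITIAL_BREAK ((uintptr_t)fake_heap)
#endif

/* 0 means "not initialized yet"; the real break is never address 0. */
static uintptr_t current_break;

/* Returns the previous break and moves it by `increment` bytes.
   No memory growth or bounds handling: this is only a sketch. */
void *demo_sbrk(intptr_t increment) {
  if (current_break == 0) {
    current_break = INITIAL_BREAK;
  }
  uintptr_t old_break = current_break;
  uintptr_t new_break = old_break + (uintptr_t)increment;
  /* Refuse to shrink below where the heap starts. */
  assert(new_break >= INITIAL_BREAK);
  current_break = new_break;
  return (void *)old_break;
}
```

With this shape, `demo_sbrk(0)` at startup returns the address of `__heap_base` without any JS call or post-link patching.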

@kripken kripken closed this as completed Aug 25, 2020
@kripken
Member Author

kripken commented Aug 26, 2020

There is still a problem with dynamic linking here. __heap_base is not normally defined there; instead there is

 (import "GOT.mem" "__heap_base" (global $gimport$150 (mut i32)))

Is there a way to access that from C? It would require an intrinsic to access an imported global I guess?

@kripken kripken reopened this Aug 26, 2020
@sbc100
Collaborator

sbc100 commented Aug 26, 2020

binaryen should turn that into a call to g$__heap_base which I guess the dynamic linker (the JS glue) should define at runtime.

If you look at how the dynamic linker injects __memory_base, you should probably do a similar thing (although __memory_base is not a C symbol, so it might be a little different). Basically, for dynamic linking you need to pre-define __heap_base on the module before you load the first wasm module, I think. The value will be dynamically calculated based on the static data used in all the modules.
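As a rough illustration of "calculated based on the static data used in all the modules", the arithmetic might look like the sketch below. The real dynamic linker is JS glue, so this C version only models the bookkeeping. `module_layout`, `assign_bases`, and the final 16-byte alignment are assumptions, and it ignores where the stack sits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to a power-of-two alignment. */
static uint32_t align_up(uint32_t addr, uint32_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical description of one module's static data needs. */
typedef struct {
  uint32_t static_size;   /* bytes of static data */
  uint32_t static_align;  /* required alignment, a power of two */
  uint32_t memory_base;   /* filled in by assign_bases */
} module_layout;

/* Gives each module a memory base by bumping a cursor, then returns the
   first 16-byte-aligned address past all static data. That address is
   the candidate value to pre-define as the heap base. */
uint32_t assign_bases(module_layout *mods, size_t count, uint32_t start) {
  uint32_t cursor = start;
  for (size_t i = 0; i < count; i++) {
    cursor = align_up(cursor, mods[i].static_align);
    mods[i].memory_base = cursor;
    cursor += mods[i].static_size;
  }
  return align_up(cursor, 16);
}
```

The key point is ordering: the final value is only known after every module's static data has been placed, so it has to be computed before the first module that imports it is instantiated.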

@sbc100
Collaborator

sbc100 commented Aug 26, 2020

Sadly, it looks like the dynamic linker today uses getMemory to bump the static data for each module that's loaded. See: loadWebAssemblyModule

@kripken
Member Author

kripken commented Aug 26, 2020

@sbc100 I see, thanks. We'll still need wasm-emscripten-finalize for dynamic linking for now, then. That sounds ok though, as it's not one of the urgent use cases for WebAssembly/binaryen#3043

@sbc100
Collaborator

sbc100 commented Aug 26, 2020

For sure, dynamic linking relies heavily on wasm-emscripten-finalize today. It's part of my OKRs to remove that :)

@kripken
Member Author

kripken commented Aug 26, 2020

And yes, there is no good way to completely avoid something like getMemory for dynamic linking. I have a PR that will at least greatly simplify it, but some allocations from JS are unavoidable given that we need dynamic libraries to load before the main one, and that their loading is async (we'd need to add an extra async startup step, but that's a big cost).

@sbc100
Collaborator

sbc100 commented Aug 26, 2020

Agreed. For dynamic linking we don't have the same goal of removing JS static layout alterations, and that's fine because wasm-ld isn't assuming a static layout in that case.

@kripken
Member Author

kripken commented Aug 26, 2020

Ok, I think I have this working for dynamic linking now too, by updating __heap_base, which is then imported into the main module.

@kripken kripken closed this as completed Aug 26, 2020