msorvig
Contributor

@msorvig msorvig commented Jun 14, 2024

Export emscripten::val and emscripten::memory_view in order to prevent embind errors:

BindingError: _emval_take_value has unknown type N10emscripten11memory_viewIhEE

Embind generates a numerical type id from the address of the std::type_info object resulting from evaluating a typeid expression (e.g. 'void *id = &typeid(T)').

However, C++ does not guarantee that this address is identical for all evaluations of the typeid expression. In practice the address is identical under static linking, but not under dynamic linking when the libraries are built with the '-fvisibility=hidden' compiler option.

The non-identical ids then cause embind to conclude that types used from a library have not been registered, since the main wasm module registered them under a different id.

Exporting the types in question makes typeid addresses identical again, and fixes/works around the issue.
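
As an illustration of the mechanism described above, here is a minimal sketch of an address-based type id. It is not embind's actual code: `TypeId`, `type_id_by_address` and `same_type` are hypothetical names chosen for this example.

```cpp
#include <typeinfo>

// Hypothetical stand-in for an opaque type id (assumption for this sketch).
using TypeId = const void*;

// Derive an id from the address of the std::type_info object.
// Within one statically linked program this is stable in practice, but the
// C++ standard does not promise the same address for every evaluation.
template <typename T>
TypeId type_id_by_address() {
    return &typeid(T);
}

// Comparing the type_info objects with operator== is the portable check.
template <typename A, typename B>
bool same_type() {
    return typeid(A) == typeid(B);
}
```

Two modules that each carry their own hidden copy of the RTTI for a type would return different values from `type_id_by_address<T>()`, which is the situation the description refers to.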

@msorvig
Contributor Author

msorvig commented Jun 14, 2024

This looks to be sufficient for Qt's use case. More types can be exported if needed.

There's a bit more to it: ideally embind should work with non-exported ("hidden") types as well, and it looks like C++ APIs such as std::type_info::hash_code() and std::type_index may not work correctly for non-exported types.

Looking at system/lib/libcxx/include/typeinfo, there are two options for implementing typeinfo:

  1. _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 1 (default): "This implementation of type_info assumes a unique copy of the RTTI for a given type inside a program"
  2. _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION = 2: "This implementation of type_info does not assume there is always a unique copy of the RTTI for a given type inside a program"

Emscripten currently uses the default option 1, but does not generate unique RTTI for non-exported types in libraries/side modules.
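
The idea behind the second option can be sketched roughly as follows. This is an illustration only, not libc++'s implementation; `same_type_by_name` is a hypothetical helper, and it assumes that distinct types have distinct `name()` strings, which holds for the usual mangled names but is not promised by the standard.

```cpp
#include <cstring>
#include <typeinfo>

// Comparison that tolerates duplicate RTTI copies: identical addresses are
// a fast path, otherwise fall back to comparing the name strings.
inline bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    if (&a == &b) {
        return true;
    }
    return std::strcmp(a.name(), b.name()) == 0;
}
```

With a comparison like this, two copies of the RTTI for the same type in different modules would still compare equal, at the cost of a string comparison when the addresses differ.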

@brendandahl
Collaborator

@sbc100 Do you think there's some way we could properly have unique rtti for the program?

@msorvig msorvig force-pushed the embind-visibility-hidden branch from 76f29df to e7062a7 Compare June 17, 2024 11:55
@msorvig
Contributor Author

msorvig commented Jun 17, 2024

Note the previous discussion in #16711.

@msorvig msorvig force-pushed the embind-visibility-hidden branch from e7062a7 to 9391428 Compare June 17, 2024 11:59
@sbc100
Collaborator

sbc100 commented Jun 17, 2024

> @sbc100 Do you think there's some way we could properly have unique rtti for the program?

I think the way to achieve that is to make sure the types are marked as default and not hidden visibility.

This fix seems correct to me for the core embind types. They always want to be shared I think.

We also have -DEMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0 which looks like it tells embind not to use RTTI info at all... although I don't really understand the naming of that macro.
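
Marking a type as default rather than hidden visibility, as suggested at the start of this comment, could look roughly like the sketch below. `EXAMPLE_EXPORT` and `SharedHandle` are hypothetical names for this illustration and are not taken from the embind headers.

```cpp
// EXAMPLE_EXPORT is a hypothetical macro for this sketch.
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_EXPORT __attribute__((visibility("default")))
#else
#define EXAMPLE_EXPORT
#endif

// Even when the translation unit is compiled with -fvisibility=hidden,
// this class and its RTTI keep default visibility, so the dynamic linker
// can resolve all modules to a single type_info object.
class EXAMPLE_EXPORT SharedHandle {
public:
    explicit SharedHandle(int id) : id_(id) {}
    int id() const { return id_; }

private:
    int id_;
};
```

The attribute only matters for dynamic linking; in a statically linked program it has no observable effect.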

self.set_setting('MAIN_MODULE', 1)

@requires_dylink
def test_embind_dylink_visibility_hidden(self):
Collaborator

This looks similar to what test_dylink_rtti is doing in test_core.py, but with some embind specifics. As such, perhaps it belongs alongside that test?

Rather than duplicating the three helper functions from test_core.py, why not just put this test in test_core.py itself?

Collaborator

It also might make more sense alongside test_embind_no_rtti

Contributor Author

Yes, test_core could make sense as well (I had it there first). Is it primarily a dylink or an embind test? test_embind_no_rtti sounds related; I'll check what that does.

(Note: I have vacation time coming up, so I'll get back to this in a couple of weeks' time.)

@brendandahl
Collaborator

@msorvig would you mind updating this branch to re-run the tests?

@laminowany
Contributor

With this patch I'm still getting a "parameter 0 has unknown type N10emscripten3valE" error for my code.

@brendandahl
Collaborator

@laminowany Do you have a small reproducer we could test with?

@laminowany
Contributor

@brendandahl unfortunately I don't really have a "small" case. Basically first I'm building the Qt framework from dev branch, then I'm building this https://github.com/mitchcurtis/slate using previously built Qt.
Are there some intermediate build files that could give you some insights? Or do you need some specific information that I could extract?

@brendandahl
Collaborator

> @brendandahl unfortunately I don't really have a "small" case. Basically first I'm building the Qt framework from dev branch, then I'm building this https://github.com/mitchcurtis/slate using previously built Qt. Are there some intermediate build files that could give you some insights? Or do you need some specific information that I could extract?

The output from cmake --verbose could be helpful, or the relevant emcc flags you're using.

@laminowany
Contributor

Here are the flags I'm using:
em++ -g -s MODULARIZE=1 -s EXPORT_NAME=app_entry -s EXPORTED_RUNTIME_METHODS=UTF16ToString,stringToUTF16,JSEvents,specialHTMLTargets,FS,callMain -s INITIAL_MEMORY=50MB -s MAXIMUM_MEMORY=4GB -s MAX_WEBGL_VERSION=2 -s WASM_BIGINT=1 -s STACK_SIZE=5MB -s ALLOW_MEMORY_GROWTH -sMAIN_MODULE=1 -sFETCH -lembind

@laminowany
Contributor

This line seems to cause an issue:
emscripten::val rawPlatform = emscripten::val::global("navigator")["platform"];

@laminowany
Contributor

Seems like emscripten::val does not have a stable rawType id across libraries? It works when applying the patch from #18418.

@laminowany
Contributor

I'm new to the Emscripten project, so I don't know a lot about the codebase; feel free to correct me!

So basically it seems to me that the types exported by embind:

EMSCRIPTEN_BINDINGS(builtin) {
    using namespace emscripten::internal;
    _embind_register_void(TypeID<void>::get(), "void");
    _embind_register_bool(TypeID<bool>::get(), "bool", true, false);
    static_assert(sizeof(bool) == 1);
    register_integer<char>("char");
    register_integer<signed char>("signed char");
    register_integer<unsigned char>("unsigned char");
    register_integer<signed short>("short");
    register_integer<unsigned short>("unsigned short");
    register_integer<signed int>("int");
    register_integer<unsigned int>("unsigned int");

have different typeIDs, if they come from different object files (i.e. side module vs main module).

This is understandable given that the typeID is generated using RTTI:

template<typename T>
struct LightTypeID {
    static constexpr TYPEID get() {
        if (has_unbound_type_names) {
#if __has_feature(cxx_rtti)
            return &typeid(T);

(To be fair, I'm not sure what the lifetime of the typeid object is in Emscripten, or whether it is safe to take its address.)

There is also another mechanism to generate typeIDs without RTTI, which can be triggered by setting EMSCRIPTEN_HAS_UNBOUND_TYPE_NAMES=0, but it seems to be buggy in the program I tested.

@brendandahl, could you point me in some direction?
I see two potential approaches:

  1. Come up with a new way to generate a typeID that would be stable? Maybe typeid(T).name() is a better candidate, since type names remain the same across object files? It could then be stored by name (or by a hash of it) in a dictionary such as unordered_map.
  2. Have a second mechanism for checking whether a type is registered in embind_shared.js. Currently it's done here:
    $requireRegisteredType: (rawType, humanName) => {
    var impl = registeredTypes[rawType];
    if (undefined === impl) {
    throwBindingError(`${humanName} has unknown type ${getTypeName(rawType)}`);
    }
    return impl;
    },

    where it is checked in registeredTypes by raw type. Maybe having another registry, such as registeredNames, and then checking by name as a fallback could be a fix? This is more or less what "embind: Add workaround for unstable type-ids between shared libraries and executable" (#18418) is doing. Or is it too much of a workaround?
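
The fallback idea in the second approach can be sketched in C++ for illustration, even though embind's real registry lives in JavaScript. `TypeRegistry`, `add` and `find` are hypothetical names for this sketch, and it assumes `name()` returns the same string for a type in every module.

```cpp
#include <string>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Hypothetical registry: primary lookup by type_info address,
// secondary lookup by type name when the address is unknown.
class TypeRegistry {
public:
    void add(const std::type_info& info, std::string humanName) {
        byAddress_[&info] = humanName;
        byName_[info.name()] = std::move(humanName);
    }

    // Returns nullptr when the type is not registered under either key.
    const std::string* find(const std::type_info& info) const {
        auto byAddr = byAddress_.find(&info);
        if (byAddr != byAddress_.end()) {
            return &byAddr->second;
        }
        auto byName = byName_.find(info.name());  // fallback by name
        return byName != byName_.end() ? &byName->second : nullptr;
    }

private:
    std::unordered_map<const std::type_info*, std::string> byAddress_;
    std::unordered_map<std::string, std::string> byName_;
};
```

A lookup from a module holding its own copy of the RTTI would miss in `byAddress_` and succeed through `byName_`, at the cost of keeping two maps in sync.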

@brendandahl
Collaborator

Option 1 sounds interesting. I'm not that familiar with typeid(T).name() inner workings, but it appears to be what we'd want. If you could get it to work with an integer hash of the name, then all the JS may "just work" as is.
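
An integer hash of the name could be sketched as below. The choice of 32-bit FNV-1a is an assumption made for this example, not something proposed in the thread, and `fnv1a32` and `type_id_by_name_hash` are hypothetical names. Hash collisions between different type names remain possible, so this is a sketch of the idea rather than a complete design.

```cpp
#include <cstdint>
#include <typeinfo>

// 32-bit FNV-1a over a NUL-terminated string.
inline std::uint32_t fnv1a32(const char* s) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    for (; *s != '\0'; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 16777619u;             // FNV prime
    }
    return h;
}

// Id derived from the type name instead of the type_info address, so it
// does not depend on which module's copy of the RTTI is used.
template <typename T>
std::uint32_t type_id_by_name_hash() {
    return fnv1a32(typeid(T).name());
}
```

Because the result is a plain integer, it could be passed where an address-based id is passed today, provided both sides agree on the hash function.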

@sbc100
Collaborator

sbc100 commented Feb 4, 2025

IIRC typeid in general should be stable across DLLs. I'm pretty sure we do have tests for it.

The problem only occurs when you use -fvisibility=hidden; then types are not shared at all between DLLs. I wonder if we can add some explicit visibility decorators to the types in question to force them not to be hidden?

@laminowany
Contributor

After further investigation it seems like this patch is actually solving my problem! 😄
In the past I had built a different version of Emscripten locally, and when I cherry-picked this change I did not have a clean build. Running emcc --clear-cache did the trick for me, and now it seems like everything is working 🎉
Sorry for the confusion, that was my mistake.

@sbc100 sbc100 requested a review from brendandahl February 5, 2025 22:49
@sbc100
Collaborator

sbc100 commented Feb 6, 2025

@msorvig are you still interested in getting this landed? I think it just requires a test update and it should be good to go.

@msorvig
Contributor Author

msorvig commented Feb 7, 2025

Hello, sorry for being a bit unresponsive - Yes, let's see if we can get this merged.

I'm not 100% sure this fixes the issue completely, but it should help. Emscripten should implement RTTI correctly also for unexported / -fvisibility=hidden types, and then embind can use type_info::hash_code() for its ID.

type_info::name() was suggested above and is also an option, but it looks like the reference documentation at https://en.cppreference.com/w/cpp/types/type_info states that the name is not guaranteed to be unique or stable.

@msorvig msorvig force-pushed the embind-visibility-hidden branch 4 times, most recently from 789474b to 560db4e Compare February 11, 2025 15:27
@laminowany
Contributor

What is the status of this? Could we get it merged, @sbc100?

@msorvig msorvig force-pushed the embind-visibility-hidden branch from 05efbbd to bd45403 Compare March 17, 2025 15:03
@sbc100
Collaborator

sbc100 commented Mar 17, 2025

@brendandahl is this good with you now?

@sbc100 sbc100 merged commit 5c63a71 into emscripten-core:main Mar 17, 2025
28 checks passed
@laminowany
Contributor

Thanks! ❤️
