@LanderlYoung LanderlYoung commented Aug 21, 2025

  1. fix symbol map line parse
  2. unescape ascii code
  3. update tests: test/runner "other.test_emsymbolizer*"

This PR fixes #24982

@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from c81e65c to d93eb85 on August 21, 2025 07:24
@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from d93eb85 to f20fd5e on August 21, 2025 08:53
@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from 549ccad to d792fa6 on August 21, 2025 11:41
@LanderlYoung LanderlYoung changed the title from "fix #24982: emsymbolizer failed to parse symbol map from C++ project" to "Fix #24982: emsymbolizer failed to parse symbol map from C++ project" on Aug 21, 2025
Member

@dschuff dschuff left a comment

Thanks!

Member

@kripken kripken left a comment

lgtm otherwise

Collaborator

sbc100 commented Aug 25, 2025

Is there some reason we cannot or should not just update the symbol map to avoid this mangling altogether?

What is the source of the mangling in the first place? What style of mangling is std::out_of_range::~out_of_range\28\29 ?
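For context on the `\28\29` sequence in that symbol: one plausible reading, not confirmed anywhere in this thread, is that each backslash is followed by the two-digit hexadecimal code of an ASCII character, so `\28` and `\29` would stand for `(` and `)`. A minimal sketch of undoing that encoding under this assumption (`unescape_name` is a hypothetical helper, not code from this PR):

```python
import re

# Assumption: an escape is a backslash followed by exactly two hex digits.
_ESCAPE = re.compile(r'\\([0-9a-fA-F]{2})')


def unescape_name(name):
    """Replace each backslash + two hex digits with the character it encodes."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


print(unescape_name(r'std::out_of_range::~out_of_range\28\29'))
# prints: std::out_of_range::~out_of_range()
```

Names without any escape sequence pass through unchanged.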

Member

dschuff commented Aug 25, 2025

I don't know what kind of mangling that is; it doesn't exactly match others that I'm familiar with.
My assumption was that there are users who depend on the contents of the symbol map and would have some kind of workflow that would break if we were to change it. Maybe I'm wrong and we could just do that. Or, maybe it's less of a breaking change to remove the special mangling than to go to a fully-C++-mangled state, since presumably users who depend on the current format would have their own demangling, so if we just stopped adding the mangling, maybe they wouldn't be broken?

Collaborator

sbc100 commented Aug 25, 2025

> I don't know what kind of mangling that is; it doesn't exactly match others that I'm familiar with. My assumption was that there are users who depend on the contents of the symbol map and would have some kind of workflow that would break if we were to change it. Maybe I'm wrong and we could just do that. Or, maybe it's less of a breaking change to remove the special mangling than to go to a fully-C++-mangled state, since presumably users who depend on the current format would have their own demangling, so if we just stopped adding the mangling, maybe they wouldn't be broken?

I'd be tempted to change the symbol map to include fully demangled symbols. @kripken do you remember why we have these strange escape codes in the map file?

Member

kripken commented Aug 26, 2025

I'm not sure. But isn't the symbol map just copying the name section? It should contain whatever is there iirc.

Collaborator

sbc100 commented Aug 26, 2025

So maybe the issue is that the name section is broken?

Collaborator

sbc100 commented Aug 26, 2025

> So maybe the issue is that the name section is broken?

The name section is fine. The problem seems to stem from the wasm-opt --print-function-map command. I guess binaryen doesn't like those symbols.

I think we should file a binaryen bug, or maybe just avoid binaryen for this purpose (since we don't need to process the whole wasm file, only the name section).
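To illustrate what reading only the name section could look like, here is a rough sketch based on the WebAssembly binary format: an 8-byte header, then sections encoded as an id byte plus a LEB128 size, with function names stored in subsection 1 of the custom section called `name`. All function names below are hypothetical, this is not code from Emscripten or Binaryen, and it does no validation beyond the header check.

```python
def read_leb(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_string(data, pos):
    """Read a length-prefixed UTF-8 string; return (text, next_pos)."""
    length, pos = read_leb(data, pos)
    return data[pos:pos + length].decode('utf-8'), pos + length


def parse_name_section(data, pos, end):
    """Collect {function index: name} from subsection 1 (function names)."""
    names = {}
    while pos < end:
        sub_id = data[pos]
        size, pos = read_leb(data, pos + 1)
        sub_end = pos + size
        if sub_id == 1:
            count, pos = read_leb(data, pos)
            for _ in range(count):
                index, pos = read_leb(data, pos)
                name, pos = read_string(data, pos)
                names[index] = name
        pos = sub_end  # skip subsections we do not handle
    return names


def read_function_names(wasm):
    """Return function names from the 'name' custom section, or {} if absent."""
    assert wasm[:8] == b'\x00asm\x01\x00\x00\x00', 'not a wasm binary'
    pos = 8
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_leb(wasm, pos + 1)
        end = pos + size
        if section_id == 0:  # custom section: starts with its own name
            section_name, body = read_string(wasm, pos)
            if section_name == 'name':
                return parse_name_section(wasm, body, end)
        pos = end
    return {}
```

Given the bytes of a module built with a name section, `read_function_names` returns a dictionary from function index to the stored name, with no wasm-opt invocation involved.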

Member

kripken commented Aug 26, 2025

@sbc100 do you want this PR to wait until we figure that out, or can this land?

Collaborator

sbc100 commented Aug 26, 2025

> @sbc100 do you want this PR to wait until we figure that out, or can this land?

I don't think we should land this, since it would only further lock in the strange escaping behavior.

Collaborator

@sbc100 sbc100 left a comment

I think we should go with #25053 instead

Member

kripken commented Aug 26, 2025

Oh, I think the first part of this PR is still important though:

> fix symbol map line parse

That handles splitting when the line contains :: as a separator.

We can refocus this PR on just that?

Collaborator

sbc100 commented Aug 26, 2025

> Oh, I think the first part of this PR is still important though:
>
> > fix symbol map line parse
>
> That handles splitting when the line contains :: as a separator.
>
> We can refocus this PR on just that?

Ah, yes that could make sense.

@LanderlYoung LanderlYoung requested a review from sbc100 August 27, 2025 03:17
@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from 3a59a68 to b881a2e on August 27, 2025 03:30
Author

Thanks guys! The name escape part has been reverted. Please review 🌹

Member

@dschuff dschuff left a comment

Looks good overall, just a couple of nits on the test.

@LanderlYoung LanderlYoung force-pushed the bufix/emsymbolizer_symbolmap branch from af71e60 to b2ffeb2 on August 28, 2025 06:32
Author

Some other.test_codesize* tests seem to fail, but I couldn't find any connection to this code change.

Member

@dschuff dschuff left a comment

If you merge the PR to the tip of the main branch, the code size test failures should go away. And you'll have to do that anyway, since you have a conflict.

@@ -11031,6 +11011,29 @@ def check_symbolmap_info(address, func):
    # The name section will not show bar, as it's inlined into main
    check_symbolmap_info(unreachable_addr, '__original_main')

    # 3. Test symbol map on C++ name mangling
Member

Suggested change
- # 3. Test symbol map on C++ name mangling
+ # 3. Test symbol map on demangled C++ names

@@ -218,12 +218,19 @@ def symbolize_address_sourcemap(module, address, force_file):

def symbolize_address_symbolmap(module, address, symbol_map_file):
  """Symbolize using a symbol map file."""

  def split_symbolmap_line(line):
    index = line.find(':')
Collaborator

I think you can just do return line.split(':', 1)

If you want an explicit error you could also add assert ':' in line, f'invalid symbolmap line: {line}'
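A minimal sketch of that suggestion, assuming each symbol map line has the form `index:name` with a numeric index. The `int()` conversion and the trailing-newline strip are illustrative additions, not part of the PR:

```python
def split_symbolmap_line(line):
    """Split 'index:name' on the first colon only, so '::' in the name survives."""
    assert ':' in line, f'invalid symbolmap line: {line}'
    index, name = line.rstrip('\n').split(':', 1)
    return int(index), name


print(split_symbolmap_line('12:Namespace::bar(Namespace::ClassA)\n'))
# prints: (12, 'Namespace::bar(Namespace::ClassA)')
```

Passing `1` as `maxsplit` stops `str.split` after the first colon, which is what keeps C++ scope separators inside the name intact.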

check_cpp_symbolmap_info(unreachable_addr, 'Namespace::bar') # the function name
check_cpp_symbolmap_info(unreachable_addr, 'Namespace::ClassA') # the type parameter
check_cpp_symbolmap_info(unreachable_addr, 'Namespace::ClassB') # the parameter
check_symbol_map_contains('out_to_js') # the imported function
Collaborator

Actually, I think maybe we shouldn't be making test_symbol_map any more complex by including emsymbolizer stuff too. Can we go back to adding this to a new or existing test_emsymboliser_xx test?

Member

Or... could we just turn test_dwarf.c into a C++ test and then have the existing symbol map test include C++ symbols? That would make this a very small increment over the existing test.

Successfully merging this pull request may close these issues.

emsymbolizer failed to parse symbol map from C++ project