Skip to content

[libcxx] Option to disable overridden function detection #108734

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

petrhosek
Copy link
Member

@petrhosek petrhosek commented Sep 15, 2024

The overriden function detection currently uses a custom section which can break existing linker scripts that are commonly used in embedded systems. It also results in the overridden functions being kept in the binary even though it's not being used leading to extra bloat which is a concern for embedded systems. This change introduces an option to disable this feature.

The overriden function detection requires the use of a custom section
which can break existing linker scripts that are commonly used in
embedded systems. This change introduces an option to disable this
feature.
@petrhosek petrhosek added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Sep 15, 2024
@petrhosek petrhosek requested a review from ldionne September 15, 2024 06:54
@petrhosek petrhosek requested a review from a team as a code owner September 15, 2024 06:54
@llvmbot
Copy link
Member

llvmbot commented Sep 15, 2024

@llvm/pr-subscribers-libcxx

Author: Petr Hosek (petrhosek)

Changes

The overriden function detection requires the use of a custom section which can break existing linker scripts that are commonly used in embedded systems. This change introduces an option to disable this feature.


Full diff: https://github.com/llvm/llvm-project/pull/108734.diff

3 Files Affected:

  • (modified) libcxx/CMakeLists.txt (+4)
  • (modified) libcxx/include/__config_site.in (+1)
  • (modified) libcxx/src/include/overridable_function.h (+2-2)
diff --git a/libcxx/CMakeLists.txt b/libcxx/CMakeLists.txt
index f1942e963ccc31..da492cdd3d3c0e 100644
--- a/libcxx/CMakeLists.txt
+++ b/libcxx/CMakeLists.txt
@@ -127,6 +127,9 @@ option(LIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS
    to provide compile-time errors when using features unavailable on some version of
    the shared library they shipped should turn this on and see `include/__configuration/availability.h`
    for more details." OFF)
+option(LIBCXX_ENABLE_OVERRIDDEN_FUNCTION_DETECTION
+  "Whether to support detecting if an overridable function (typically a weak symbol)
+   has been overidden by a user or not." ON)
 
 if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
   set(LIBCXX_DEFAULT_TEST_CONFIG "llvm-libc++-shared-gcc.cfg.in")
@@ -763,6 +766,7 @@ config_define_if_not(LIBCXX_ENABLE_UNICODE _LIBCPP_HAS_NO_UNICODE)
 config_define_if_not(LIBCXX_ENABLE_WIDE_CHARACTERS _LIBCPP_HAS_NO_WIDE_CHARACTERS)
 config_define_if_not(LIBCXX_ENABLE_TIME_ZONE_DATABASE _LIBCPP_HAS_NO_TIME_ZONE_DATABASE)
 config_define_if_not(LIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS _LIBCPP_HAS_NO_VENDOR_AVAILABILITY_ANNOTATIONS)
+config_define_if_not(LIBCXX_ENABLE_OVERRIDDEN_FUNCTION_DETECTION _LIBCPP_HAS_NO_OVERRIDDEN_FUNCTION_DETECTION)
 
 if (LIBCXX_ENABLE_ASSERTIONS)
   message(DEPRECATION "LIBCXX_ENABLE_ASSERTIONS is deprecated and will be removed in LLVM 20. Please use LIBCXX_HARDENING_MODE instead.")
diff --git a/libcxx/include/__config_site.in b/libcxx/include/__config_site.in
index bf2d31d8eeb1b9..b30304a2c34651 100644
--- a/libcxx/include/__config_site.in
+++ b/libcxx/include/__config_site.in
@@ -22,6 +22,7 @@
 #cmakedefine _LIBCPP_HAS_THREAD_API_WIN32
 #cmakedefine _LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS
 #cmakedefine _LIBCPP_HAS_NO_VENDOR_AVAILABILITY_ANNOTATIONS
+#cmakedefine _LIBCPP_HAS_NO_OVERRIDDEN_FUNCTION_DETECTION
 #cmakedefine _LIBCPP_NO_VCRUNTIME
 #cmakedefine _LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION @_LIBCPP_TYPEINFO_COMPARISON_IMPLEMENTATION@
 #cmakedefine _LIBCPP_HAS_NO_FILESYSTEM
diff --git a/libcxx/src/include/overridable_function.h b/libcxx/src/include/overridable_function.h
index 6c70f6242ddd63..f80176727ade55 100644
--- a/libcxx/src/include/overridable_function.h
+++ b/libcxx/src/include/overridable_function.h
@@ -63,7 +63,7 @@
 // want to be defining special sections inside user's executables which use our headers.
 //
 
-#if defined(_LIBCPP_OBJECT_FORMAT_MACHO)
+#if defined(_LIBCPP_OBJECT_FORMAT_MACHO) && !defined(_LIBCPP_HAS_NO_OVERRIDDEN_FUNCTION_DETECTION)
 
 #  define _LIBCPP_CAN_DETECT_OVERRIDDEN_FUNCTION 1
 #  define _LIBCPP_MAKE_OVERRIDABLE_FUNCTION_DETECTABLE                                                                 \
@@ -97,7 +97,7 @@ _LIBCPP_HIDE_FROM_ABI bool __is_function_overridden(_Ret (*__fptr)(_Args...)) no
 _LIBCPP_END_NAMESPACE_STD
 
 // The NVPTX linker cannot create '__start/__stop' sections.
-#elif defined(_LIBCPP_OBJECT_FORMAT_ELF) && !defined(__NVPTX__)
+#elif defined(_LIBCPP_OBJECT_FORMAT_ELF) && !defined(__NVPTX__) && !defined(_LIBCPP_HAS_NO_OVERRIDDEN_FUNCTION_DETECTION)
 
 #  define _LIBCPP_CAN_DETECT_OVERRIDDEN_FUNCTION 1
 #  define _LIBCPP_MAKE_OVERRIDABLE_FUNCTION_DETECTABLE __attribute__((__section__("__lcxx_override")))

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 9fc789d922ddf1c849f62bdfbe034f3cd9354967 c470197b1876ea4bf4b9983a77dfeb6085e94159 --extensions h -- libcxx/src/include/overridable_function.h
View the diff from clang-format here.
diff --git a/libcxx/src/include/overridable_function.h b/libcxx/src/include/overridable_function.h
index f80176727a..bbab351bbc 100644
--- a/libcxx/src/include/overridable_function.h
+++ b/libcxx/src/include/overridable_function.h
@@ -97,7 +97,8 @@ _LIBCPP_HIDE_FROM_ABI bool __is_function_overridden(_Ret (*__fptr)(_Args...)) no
 _LIBCPP_END_NAMESPACE_STD
 
 // The NVPTX linker cannot create '__start/__stop' sections.
-#elif defined(_LIBCPP_OBJECT_FORMAT_ELF) && !defined(__NVPTX__) && !defined(_LIBCPP_HAS_NO_OVERRIDDEN_FUNCTION_DETECTION)
+#elif defined(_LIBCPP_OBJECT_FORMAT_ELF) && !defined(__NVPTX__) &&                                                     \
+    !defined(_LIBCPP_HAS_NO_OVERRIDDEN_FUNCTION_DETECTION)
 
 #  define _LIBCPP_CAN_DETECT_OVERRIDDEN_FUNCTION 1
 #  define _LIBCPP_MAKE_OVERRIDABLE_FUNCTION_DETECTABLE __attribute__((__section__("__lcxx_override")))

@philnik777
Copy link
Contributor

Why does this require a feature to disable? Can't you just update the linker script? That shouldn't be that hard. Or we could even look into putting it into a subsection like .text.__libcxx_whatever_it_was. Wouldn't that also solve the problem?

@petrhosek
Copy link
Member Author

Why does this require a feature to disable? Can't you just update the linker script? That shouldn't be that hard.

Ideally yes, we would update the linker script, but that can be challenging since we always don't have access to linker scripts used by our partners and even when we do, it can require significant effort to find and update every single linker script. I'm worried that this is really going to hamper libc++ adoption in the embedded space.

Or we could even look into putting it into a subsection like .text.__libcxx_whatever_it_was. Wouldn't that also solve the problem?

That doesn't work. We rely on the generated __start_<name> and __stop_<name> symbols but those are only generated for sections that use C identifier names (so not .text). Plus if we used .text.<whatever>, it'd be merged merged into .text output section and __is_function_overridden would always return true making this feature useless.

@ldionne
Copy link
Member

ldionne commented Sep 16, 2024

I'm iffy about adding an option just to disable that -- that feels like a hack. There might be a more general configuration option trying to emerge here. What do these linker scripts do that is special and breaks how this detection work?

@philnik777
Copy link
Contributor

I'm iffy about adding an option just to disable that -- that feels like a hack. There might be a more general configuration option trying to emerge here. What do these linker scripts do that is special and breaks how this detection work?

Just my guess: Linker scripts usually define where the sections go explicitly. Since our section isn't part of one of the standard sections it's not picked the linker script wildcards it simply discards it.

@petrhosek
Copy link
Member Author

I'm iffy about adding an option just to disable that -- that feels like a hack. There might be a more general configuration option trying to emerge here. What do these linker scripts do that is special and breaks how this detection work?

Just my guess: Linker scripts usually define where the sections go explicitly. Since our section isn't part of one of the standard sections it's not picked the linker script wildcards it simply discards it.

That's precisely the reason. See for example the Raspberry Pi Pico linker script in Pico SDK which is one of the projects where this issue came up. Since the linker script doesn't place the __lcxx_override section explicitly, the linker places it arbitrarily. On this particular target, the section is placed somewhere that breaks assumptions the boot ROM makes about code locations (between .boot2 and .text). Since operator new is in __lcxx_override, any call to new breaks the system.

I'd also point out that Pico SDK was using libc++ without any issues. They started reporting breakages after upgrading their toolchain to a version that included 3145265. Whether intentional or not, for embedded users that change is a breaking change and I think we should have an opt-out mechanism because updating all affected linker scripts to take __lcxx_override into account will take some time.

@petrhosek
Copy link
Member Author

Ping?

@philnik777
Copy link
Contributor

More general thought: Could we implement this mechanism through aliases instead: https://godbolt.org/z/E5oKj9hfe? I think that might make __is_function_overriden a bit more lean and avoid having to put the functions into a separate section. That probably also doesn't work everywhere, but that doesn't seem like a huge problem to me.

@petrhosek
Copy link
Member Author

More general thought: Could we implement this mechanism through aliases instead: https://godbolt.org/z/E5oKj9hfe? I think that might make __is_function_overriden a bit more lean and avoid having to put the functions into a separate section. That probably also doesn't work everywhere, but that doesn't seem like a huge problem to me.

I looked into it and try to implement this approach in libc++, but then realized that this approach has a serious downside in that both the original function and the one that overrides it would be kept in the binary even though the original is only used for the address comparison. This may not be a big deal on other systems but it's a concern for baremetal uses where every byte matters and don't want to ship any dead code.

With the existing implementation, that is also the case, but if you use a custom linker script you can manually discard it although that would effectively disable __is_function_overriden same as this change does but is more involved. Using that approach would likely result in embedded developers cargo culting the same snippet into every linker script which I think is undesirable.

Therefore, I'd would like to merge this change as is. I can try to finish and upload the implementation based on aliases for ELF as a follow up PR, but for baremetal we always want to disable this regardless of the implementation.

@petrhosek petrhosek added the embedded Support for embedded development label Oct 18, 2024
@philnik777
Copy link
Contributor

More general thought: Could we implement this mechanism through aliases instead: https://godbolt.org/z/E5oKj9hfe? I think that might make __is_function_overriden a bit more lean and avoid having to put the functions into a separate section. That probably also doesn't work everywhere, but that doesn't seem like a huge problem to me.

I looked into it and try to implement this approach in libc++, but then realized that this approach has a serious downside in that both the original function and the one that overrides it would be kept in the binary even though the original is only used for the address comparison. This may not be a big deal on other systems but it's a concern for baremetal uses where every byte matters and don't want to ship any dead code.

With the existing implementation, that is also the case, but if you use a custom linker script you can manually discard it although that would effectively disable __is_function_overriden same as this change does but is more involved. Using that approach would likely result in embedded developers cargo culting the same snippet into every linker script which I think is undesirable.

Therefore, I'd would like to merge this change as is. I can try to finish and upload the implementation based on aliases for ELF as a follow up PR, but for baremetal we always want to disable this regardless of the implementation.

How would that result in dead code being shipped? If you properly override all the overloads the compiler can discard the libc++ versions and if you don't provide any yourself the checks can be constant-folded assuming you're using LTO.

@petrhosek
Copy link
Member Author

More general thought: Could we implement this mechanism through aliases instead: https://godbolt.org/z/E5oKj9hfe? I think that might make __is_function_overriden a bit more lean and avoid having to put the functions into a separate section. That probably also doesn't work everywhere, but that doesn't seem like a huge problem to me.

I looked into it and try to implement this approach in libc++, but then realized that this approach has a serious downside in that both the original function and the one that overrides it would be kept in the binary even though the original is only used for the address comparison. This may not be a big deal on other systems but it's a concern for baremetal uses where every byte matters and don't want to ship any dead code.
With the existing implementation, that is also the case, but if you use a custom linker script you can manually discard it although that would effectively disable __is_function_overriden same as this change does but is more involved. Using that approach would likely result in embedded developers cargo culting the same snippet into every linker script which I think is undesirable.
Therefore, I'd would like to merge this change as is. I can try to finish and upload the implementation based on aliases for ELF as a follow up PR, but for baremetal we always want to disable this regardless of the implementation.

How would that result in dead code being shipped? If you properly override all the overloads the compiler can discard the libc++ versions and if you don't provide any yourself the checks can be constant-folded assuming you're using LTO.

LTO isn't generally usable on baremetal because it's incompatible with certain linker scripts constructs; this is something the community has been trying to address for some time but there still isn't a working solution.

@petrhosek
Copy link
Member Author

#114961 implements the overridden function detection using aliases as discussed in the libc++ monthly meeting.

@petrhosek petrhosek closed this Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
embedded Support for embedded development libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants