Skip to content

Add the remaining toolchain-distributed runtime libs to autolink-extract #64312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 27, 2023
Merged

Add the remaining toolchain-distributed runtime libs to autolink-extract #64312

merged 6 commits into from
Mar 27, 2023

Conversation

gwynne
Copy link
Contributor

@gwynne gwynne commented Mar 13, 2023

This is a potential workaround for the issues reported in #58380. In testing, this change resulted in a 90% reduction in both link time and linker RAM usage with no measurable impact on the output of the build.

(I call it a workaround rather than a fix because the underlying behavior is arguably ld.gold's fault, not Swift's.)

Fixes #58380.

@al45tair
Copy link
Contributor

From the associated discussion:

I realize it may not be possible/safe to generically deduplicate platform libraries like libpthread, libm, etc., but just deduping libswiftGlibc, libBlocksRuntime, and the three libdispatch libraries would take care of over 1/3 of the repeats.

I am indeed a little concerned about the "// Common-use ordering-agnostic Linux system libs" section, as well as the "// Foundation support libs" section.

The problem is that while 99% of users don't care about link order, the 1% that do are going to be tripped up by this. How is performance without those two sections? Does that bring us down to acceptable levels?

@gwynne
Copy link
Contributor Author

gwynne commented Mar 13, 2023

The problem is that while 99% of users don't care about link order, the 1% that do are going to be tripped up by this. How is performance without those two sections? Does that bring us down to acceptable levels?

@al45tair I challenge the idea that link order is meaningful with respect to this issue in the first place. It's not possible to control the link order produced by swift-autolink-extract in the first place, and the deduplication of its output does not and can not! have any effect whatsoever on any other libraries swiftc includes in the linker argument list, which is where any attempts to control link order would have to take place. I don't think that a 1% to be tripped up by this exists in the first place. And if it did, wouldn't they already be getting tripped up by the duplication?

As for the effect of removing those two sections on the performance improvement, it's severe - the effect of duplicated libraries scales a bit worse than linearly (but not quite as bad as quadratically) as the number of duplicates increases; the system libs group makes up over 25% of the duplications (being consistently the ones with more duplicates than any other), and the Foundation support libs another almost 10%, which means you're losing as much as 50% of the improvement (maybe a bit less if the package doesn't import FoundationXML or FoundationNetworking).

Meanwhile, when I tried this PR with one of the larger non-OSS projects I work with, we saw both time and RAM usage drop by over 98% - the memory usage went from almost 17GB to under 750MB, and wall time from 2 full minutes to under 10s (and if that's not scary enough, ld.gold's reported mapped memory usage landed at 800MB... down from an utterly ridiculous 240GB). IMO, the sheer absurdity these numbers are reaching at a scale that Swift has otherwise been handling fairly well doesn't lend itself to halfway fixes, and I have every reason to believe the risk factor is acceptably low.

@al45tair
Copy link
Contributor

I challenge the idea that link order is meaningful with respect to this issue in the first place. It's not possible to control the link order produced by swift-autolink-extract in the first place, and the deduplication of its output does not and can not! have any effect whatsoever on any other libraries swiftc includes in the linker argument list, which is where any attempts to control link order would have to take place. I don't think that a 1% to be tripped up by this exists in the first place. And if it did, wouldn't they already be getting tripped up by the duplication?

Ah, I see what you're saying; because this is about autolinking specifically, anyone who was worried about link order is already going to have a bad day because there's no control over that. They'd have to turn autolinking off and then manually list the things to link with.

In which case I withdraw any objection.

you're losing as much as 50% of the improvement

You're right, that's too big a hit.

@al45tair
Copy link
Contributor

@swift-ci Please smoke test

Copy link
Member

@compnerd compnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this is really not the ideal solution. Generally speaking, I think that we should explore disabling autolink-extract on Linux and have SPM drive the link dependencies and possibly offer some control over the link order.

@compnerd
Copy link
Member

@swift-ci please test

@gwynne
Copy link
Contributor Author

gwynne commented Mar 13, 2023

Ugh, this is really not the ideal solution. Generally speaking, I think that we should explore disabling autolink-extract on Linux and have SPM drive the link dependencies and possibly offer some control over the link order.

Linux is the only target that uses autolink-extract in the first place, to my knowledge, and at least in the short term I much prefer the relatively non-invasive band-aid that makes a problem that's actively affecting users effectively vanish. (Besides, if I have to explain this issue to one more poor confused dev trying to build on DigitalOcean or Fly.Io and getting bizarre errors that randomly go away when they make some unrelated ill-advised change that has nothing to do with anything... 😒.) I certainly agree the larger issues around linkage should be dealt with in the longer term.

@gwynne
Copy link
Contributor Author

gwynne commented Mar 13, 2023

🤨 The Windows CI passes but the Linux doesn't? Also, I can't find any hints as to what went wrong in the Linux tests, it just says the SwiftPM tests suddenly exited with failure, no error messages, and those tests ran without issue when I did a buildbox_linux build locally.

@al45tair
Copy link
Contributor

Looks like this:

error: Exited with signal code 4

which isn't helpful.

I don't think this is anything you could have done. I'll re-trigger the tests.

@al45tair
Copy link
Contributor

@swift-ci Please test Linux platform

@compnerd
Copy link
Member

    /home/build-user/build/buildbot_linux/unified-swiftpm-build-linux-x86_64/x86_64-unknown-linux-gnu/release/IndexStoreDBPackageTests.xctest: error while loading shared libraries: libswift_RegexParser.so: cannot open shared object file: No such file or directory

I think that @gwynne already ran into this problem.

@gwynne
Copy link
Contributor Author

gwynne commented Mar 13, 2023

Yes, I had to apply this diff to swift-corelibs-xctest/CMakeLists.txt:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index a259764..712f65f 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -62,6 +62,7 @@ if(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)
   endif()
 endif()
 set_target_properties(XCTest PROPERTIES
+  INSTALL_RPATH "$ORIGIN"
   Swift_MODULE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/swift
   INTERFACE_INCLUDE_DIRECTORIES ${CMAKE_CURRENT_BINARY_DIR}/swift)

The built libXCTest.so references libswift_RegexParser.so, but has no RUNPATH command in its headers, and the built .xctest binary references libXCTest.so before anything else that would have made the library findable, to the point where ldd shows it twice (paths trimmed for readability, emphasis mine):

build-user@b98bdb9ebc35:~$ ldd /home/build-user/build/buildbot_linux/unified-swiftpm-build-linux-aarch64/aarch64-unknown-linux-gnu/release/IndexStoreDBPackageTests.xctest
    linux-vdso.so.1 (0x0000ffff8b460000)
    libXCTest.so => lib/swift/linux/libXCTest.so (0x0000ffff8b1e0000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff8b140000)
    libFoundation.so => lib/swift/linux/libFoundation.so (0x0000ffff8a8c0000)
    libBlocksRuntime.so => lib/swift/linux/libBlocksRuntime.so (0x0000ffff8a8a0000)
    libswift_Concurrency.so => lib/swift/linux/libswift_Concurrency.so (0x0000ffff8a820000)
    libswift_StringProcessing.so => lib/swift/linux/libswift_StringProcessing.so (0x0000ffff8a730000)
    libswiftDispatch.so => lib/swift/linux/libswiftDispatch.so (0x0000ffff8a6e0000)
    libswiftCore.so => lib/swift/linux/libswiftCore.so (0x0000ffff8a0e0000)
    libswiftGlibc.so => lib/swift/linux/libswiftGlibc.so (0x0000ffff8a0b0000)
    libdispatch.so => lib/swift/linux/libdispatch.so (0x0000ffff8a030000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff89e00000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff89dd0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff89c20000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff8b427000)
**  libswift_RegexParser.so => not found  **
    libicuucswift.so.65 => lib/swift/linux/libicuucswift.so.65 (0x0000ffff89a20000)
    libicui18nswift.so.65 => lib/swift/linux/libicui18nswift.so.65 (0x0000ffff89710000)
    libicudataswift.so.65 => lib/swift/linux/libicudataswift.so.65 (0x0000ffff87c50000)
**  libswift_RegexParser.so => lib/swift/linux/libswift_RegexParser.so (0x0000ffff87b30000) **

I have no idea if my fix is the right one (I know @compnerd is doubtful, and he's probably right).

@gwynne
Copy link
Contributor Author

gwynne commented Mar 13, 2023

Update: I think I know why this PR broke it - if I'm correct, it's a particularly weird and thoroughly ironic case of link order mattering after all. Or more precisely, the link order mattering in a way that exposed a bug in the swift-corelibs-xctest CMake script. Without the deduplication, there was always both an -lFoundation and a -lswift_regexParser appearing both before and after every single -lxctest, which had the effect of masking the missing runpath by changing the load order in the emitted .xctest binary. Now, with -lFoundation deduplicated, the result looks something like -lXCTest -lXCTest -lXCTest -lFoundation -lswift_RegexParser, which is too late to "save" the lookup for XCTest.

I still stand by my assertion that the deduplication doesn't need to concern itself with link order; a bug in the build system is all that made it significant in this case, and it absolutely should not be. This should be fixed in swift-corelibs-xctest, but I'll add a workaround here that should take care of it (by exploiting the very fact that the ordering matters, sadly).

@compnerd
Copy link
Member

That work around is terrible and you should feel bad 😆 :shipit:!

@compnerd
Copy link
Member

@swift-ci please test

@gwynne
Copy link
Contributor Author

gwynne commented Mar 13, 2023

Opened swiftlang/swift-corelibs-xctest#432 to track fixing the XCTest build issue.

@compnerd
Copy link
Member

@swift-ci please test

@compnerd
Copy link
Member

@swift-ci please test

@compnerd
Copy link
Member

@swift-ci please test macOS platform

@compnerd
Copy link
Member

This is a potential workaround for the issues reported in #58380. In testing, this change resulted in a 90% reduction in both link time and linker RAM usage with no measurable impact on the output of the build.

(I call it a workaround rather than a fix because the underlying behavior is arguably ld.gold's fault, not Swift's.)

This really is a self-inflicted wound. The issue isn't gold, its Swift's behaviour of injecting a custom section and then extracting it to recreate a linker response file rather than creating a build graph and generating the proper command for linking.

@gwynne
Copy link
Contributor Author

gwynne commented Mar 26, 2023

@al45tair @compnerd Can we get this merged in time for 5.9?

@al45tair
Copy link
Contributor

@gwynne I think this is worth having. It'll need cherry-picking to the release/5.9 branch.

@al45tair al45tair merged commit 1b64e76 into swiftlang:main Mar 27, 2023
al45tair added a commit to al45tair/swift that referenced this pull request Mar 27, 2023
Add the remaining toolchain-distributed runtime libs to autolink-extract
@al45tair
Copy link
Contributor

Cherry pick to 5.9 is here: #64633

@al45tair
Copy link
Contributor

(Note that it'll need approval from one of the branch managers to get it merged; I can't do that myself).

@gwynne gwynne deleted the patch-2 branch March 28, 2023 19:40
al45tair added a commit that referenced this pull request Mar 30, 2023
[Linux] (Cherry Pick) Merge pull request #64312 from gwynne/patch-2
MaxDesiatov added a commit that referenced this pull request May 2, 2023
This reverts commit 1b64e76, reversing
changes made to 5ef25e7.
MaxDesiatov added a commit that referenced this pull request Jun 1, 2023
[5.8] CMake: fix missing `SWIFT_CONCURRENCY_GLOBAL_EXECUTOR`

Explanation: Resolves issues with static linking on Linux
Risk: Medium, affects Linux builds and top-level CMake declarations.
Original PRs: #65795 and #64312 for `main`, #65824 and #64633 for `release/5.9`
Reviewed by: @al45tair @drexin @etcwilde 
Resolves: some of the issues reported in #65097, also resolves #58380
Tests: Added in swiftlang/swift-integration-tests#118

`SWIFT_CONCURRENCY_GLOBAL_EXECUTOR` is defined in `stdlib/cmake/modules/StdlibOptions.cmake`, which is not included during the first pass of evaluation of the root `CMakeLists.txt`. It is available on subsequent evaluations after the value is stored in CMake cache. This led to subtle bugs, where `usr/lib/swift_static/linux/static-stdlib-args.lnk` didn't contain certain flags on clean toolchain builds, but did contain them in incremental builds.

Not having these autolinking flags in toolchain builds leads to errors when statically linking executables on Linux.

Additionally, since our trivial lit tests previously didn't link Dispatch statically, they didn't expose a bug where `%import-static-libdispatch` substitution had a missing value. To fix that I had to update `lit.cfg` and clean up some of the related path computations to infer a correct substitution value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SR-16121] Increasingly excessive memory requirements for linking on Linux
3 participants