-
Notifications
You must be signed in to change notification settings - Fork 1.7k
dart compile exe fails on macOS versions older than macOS 12 #49783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
On the latest beta, this crash doesn't seem to happen, instead we get the same failure as on x64 (see also #49275). |
@athomas can you confirm on which beta range you saw the crash (I presume beta 2.18.0-271.7,beta does not exhibit the crash but 2.18.0-271.4.beta exhibits the crash) |
Also does clang upgrades done in the main branch get carried over to the beta branch too ? |
Crash confirmed on 2.18.0-271.0.dev (the branch point). Clang updates are changes to DEPS, and don't carry over unless CP'ed or landed before the branch point. |
To summarize, there's two issues here
This has been reported in the past (see e.g. #49275, #49326) @athomas says this issue seems to be a regression in 2.18. If this happens on our CI, we should be able to bisect the issue on one of the VMs? |
The corresponding code for this error message can be found in llvm/lib/Object/MachOObjectFile.cpp which may be helpful for debuging the issue. /cc @sstrickl |
Assuming the crash wasn't flaky, it seems to have disappeared. It only occured on arm64 so we have relatively few datapoints on CI, however it was reproducible locally as well. The signing issue seems to be tied to macOS version: |
So looking at the We could try to patch this up as we generate the merged binary, changing the file offset of any load sections with non-zero size that start at an offset in the headers to the file offset of the first section within it, adjusting the vm address, size, and file size of the load command accordingly. However, if the vm address of a load segment needs to, say, be on a page boundary for mmap reasons, that might be a problem, given that in the example build on my machine, the header ends at 3216 and the first section in that load command starts at 3280 (and there's no page boundary between those two values). That is, it's a possibly very brittle change that isn't even needed for later MacOS versions that have an updated |
Could you state which MacOS versions would be considered later which have the updated Flutter claims to support (12 - google tested) and best effort (10.11 and 11. |
|
We support several older MachOS versions, not just the oldest. I don't think we can just drop older version support now. The ability to sign @sstrickl Is there something I'm overlooking?
Phrased in another way: Are you saying that the If so, why wasn't this an issue before? |
Cannot be signed on those MacOS versions. I'd guess that the automated process for signing the SDK uses a version of the |
I don't see a change in our signing infra that fits in that timeframe: On 2022-02-17 we upgrade the signing flow to big sur (but that also affects the 2.17.6 release), then there were changes recently to upgrade it to monterey but that happened after the branchpoint. |
Since a revert has landed I am going to change the priority of this issue to p2. |
Okay, so here's the lowdown of what's behind this and the current plan for the future: We currently create signable MachO standalone executables by adding a new segment load command with a section pointing to the snapshot data in a copy of the However, this approach assumes there's enough space in the padding between the header and the contents of the MachO executable to put this information. Prior to the clang change above, there was (e.g., the headers ended at offset 3064, and the first content started at offset 17280). Afterwards, there wasn't (e.g., the headers ended at offset 3128, and the first content offset was at 3200). Unfortunately, the current code for adding the new header information doesn't check that the available padding can fit the new information, which is why we didn't get an error when this started happening. Instead, we saw Now that we know what the issue is, we've rolled back clang (which as a side effect restores the extra padding) as a temporary fix to unblock the stable release. The longer term fix is to change our approach to combining the snapshot into the MachO executable. There are three possibilities I currently have in mind, though really only the first and third should be considered:
Thus, my plans are to implement the third approach, after which we can reland the clang changes without concern. |
I'm also making a quick CL to add a check that we're not overwriting the section contents with the expanded header, so that we'll get proper failures if this case ends up happening again while the long-term solution is being worked on. |
/cc @mraleph see suggested fix in #49783 (comment). |
@sstrickl Thanks for writing up the issue & solution space!
Do we have good understanding of which parts use file offsets that would need to be modified? (Many things may use offsets into sections and not absolute file offset). It would be good to understand before making the implementation decision.
By piggy-bagging on universal binaries, we rely on tools (strip, signing, ...) to preserve all parts. It may be unusual to have universal binaries with several parts with same architecture. I can imagine there's tools that take universal binaries (containing code for different architectures) and pulling out / stripping to only whats needed - to get smaller binaries. That would be the risk with this approach. |
That's true, that would be a risk. That, and I think we'd have to use a different (non-supported) architecture for the Dart snapshot MachO, though if we mark it as a dsym or something, that might be enough for it not to be chosen when executing, but that doesn't help against stripping universal binaries back to MachO executables. The main thing I'm worried about are file offsets inside sections. Handling load segments and section headers is relatively easy, though whether a load command contains file offsets depends on the specific load segment type and so we'd have to make sure we handle what we know about and hope none get added. However, but if any section contents themselves include file offsets, then it gets even trickier. I'll page back in the MachO specifics I know and see if I have any obvious examples of such to point to. |
Another question: If a universal binary is executed, I assume only the specific architecture's MachO is loaded into memory. So if we add another part, would it get loaded by the dynamic loader or we'd need to manually |
Maybe we could return back to the idea of shipping clang's assembler+linker+linkable-AOT-runtime-library in the SDK instead? This has larger footprint but is guaranteed to work. |
Yeah, that'd probably be the best in the long run, just to treat the AOT runtime as something linked together with the snapshot to create an executabler rather than trying to do the merging mess we currently do. I should have listed that as a possible solution as well, thanks for mentioning it! |
I might be misremembering but I'm pretty sure we already do this approach, opening the file, finding the snapshot offset, and then using the ELF loader which will mmap() the right part of it. |
This catches an inadvertent partial overwrite of the first section in the `dartaotruntime` executable that happens when there's not enough padding after the headers. The first section is a text section, so unless there are no calls to code in this overwritten portion, there are no runtime failures when this happens. However, having a MachO section which has a file offset within the headers is checked by MachO verification code within the `codesign` utility when run with the `-f` flag, and thus caused failures in certain tests and builds. Note that we only use that flag when `codesign` does not have the `linker-signed` option, which is for MacOS versions prior to 12.0. Example failure with this change when this case occurs (e.g., prior to the recent clang revert): ``` $ xcodebuild/ReleaseARM64/dart-sdk/bin/dart compile exe -o test test.dart Info: Compiling with sound null safety Error: AOT compilation failed FormatException: The MachO header overlaps with the first 120 bytes of the section contents ``` Change-Id: Ib27db910777f61b90f162f7a0bcfa4ba6592a5a0 Bug: #49783 Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-nnbd-mac-release-arm64-try,vm-kernel-precomp-mac-product-x64-try,dart-sdk-mac-arm64-try,dart-sdk-mac-try,pkg-mac-release-arm64-try,pkg-mac-release-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/256264 Reviewed-by: Daco Harkes <[email protected]> Reviewed-by: Martin Kustermann <[email protected]> Commit-Queue: Tess Strickland <[email protected]>
This reverts commit 161f8fd. Reason for revert: breaks google3 (b/243899439) Original change's description: > [pkg/dart2native] Add negative padding check to MachO writer. > > This catches an inadvertent partial overwrite of the first section > in the `dartaotruntime` executable that happens when there's not > enough padding after the headers. The first section is a text section, > so unless there are no calls to code in this overwritten portion, there > are no runtime failures when this happens. > > However, having a MachO section which has a file offset within the > headers is checked by MachO verification code within the `codesign` > utility when run with the `-f` flag, and thus caused failures in > certain tests and builds. Note that we only use that flag when > `codesign` does not have the `linker-signed` option, which is for > MacOS versions prior to 12.0. > > Example failure with this change when this case occurs (e.g., prior to > the recent clang revert): > ``` > $ xcodebuild/ReleaseARM64/dart-sdk/bin/dart compile exe -o test test.dart > Info: Compiling with sound null safety > Error: AOT compilation failed > FormatException: The MachO header overlaps with the first 120 bytes of the section contents > ``` > > Change-Id: Ib27db910777f61b90f162f7a0bcfa4ba6592a5a0 > Bug: #49783 > Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-nnbd-mac-release-arm64-try,vm-kernel-precomp-mac-product-x64-try,dart-sdk-mac-arm64-try,dart-sdk-mac-try,pkg-mac-release-arm64-try,pkg-mac-release-try > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/256264 > Reviewed-by: Daco Harkes <[email protected]> > Reviewed-by: Martin Kustermann <[email protected]> > Commit-Queue: Tess Strickland <[email protected]> [email protected],[email protected],[email protected] Change-Id: Iccf27ffab6c6b6272e366b77c5092b346145e263 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: #49783 Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-nnbd-mac-release-arm64-try,vm-kernel-precomp-mac-product-x64-try,dart-sdk-mac-arm64-try,dart-sdk-mac-try,pkg-mac-release-arm64-try,pkg-mac-release-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/256460 Reviewed-by: Martin Kustermann <[email protected]> Commit-Queue: Emmanuel Pellereau <[email protected]> Reviewed-by: Emmanuel Pellereau <[email protected]> Reviewed-by: Tess Strickland <[email protected]>
* Move parsing functions to static methods of the classes that represent the data being parsed. * Move calculations for adjusting offsets in load commands into the corresponding classes. * Make function to insert a new payload segment an instance method of MachOFile. * Remove unnecessary abstractions and definitions for MachO load commands that we don't need to parse. These refactorings are being done to make later changes easier to review, by first lifting the initial refactorings out into a separate CL. Issue: #49783 Change-Id: I133eb368cbb9ee0d8e4f3998ba1a0bbe8555b8aa Cq-Include-Trybots: luci.dart.try:analyzer-mac-release-try,dart-sdk-mac-arm64-try,dart-sdk-mac-try,pkg-mac-release-arm64-try,pkg-mac-release-try,vm-kernel-precomp-mac-product-x64-try,vm-kernel-precomp-nnbd-mac-release-arm64-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/256821 Reviewed-by: Daco Harkes <[email protected]> Commit-Queue: Tess Strickland <[email protected]> Reviewed-by: Ryan Macnak <[email protected]>
To create a Dart standalone executable on MacOS, we modify the dartaotruntime executable to add the snapshot contents, and the VM looks into the executable on disk to find the snapshot to load. Previously, we did this by adding a new 64-bit segment load command with a single section, where the section's file offset and size describes the inserted snapshot. This meant the Mach-O header size increased by 152 bytes. Originally, this wasn't an issue as there was plenty of padding, but later clang updates removed most of this padding, and so writing the new header actually overwrote the initial contents of the first section in the file, which happens to be the __text section. In addition, since the first section's offset was now declared to be within the header, utilities that strictly validated the Mach-O format, like codesign, would report errors. This CL changes it so that we actually reserve space in the dartaotruntime header using the -add_empty_section flag to the linker. In addition, we change from using a segment load command to using a (40 byte) note load command. This is because a segment load command specifies that the contents should be loaded in memory, but we don't use that loaded version. Instead, the VM reloads it from the executable on disk so it can appropriately mmap the different parts of the snapshot. A note section instead just declares a section of the executable as arbitrary data that the owner can read from the file and use as desired, which is semantically closer to our current usage. This CL also adds a test to pkg/dartdev/test/commands/compile_test to ensure that corrupting a random part of the snapshot in the executable causes signature verification to fail. This CL also reverts CL 256208, thus relanding the clang changes starting from June that originally raised awareness of the issue by greatly reduced the amount of padding after the load commands. TEST=pkg/dartdev/test/commands/compile_test Bug: #49783 Change-Id: Iee554d87b0eabaecd7a534ca4e4facfefbce6385 Cq-Include-Trybots: luci.dart.try:analyzer-mac-release-try,dart-sdk-mac-arm64-try,dart-sdk-mac-try,pkg-mac-release-arm64-try,pkg-mac-release-try,vm-kernel-precomp-mac-product-x64-try,vm-kernel-precomp-nnbd-mac-release-arm64-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/260108 Reviewed-by: Ryan Macnak <[email protected]> Reviewed-by: Daco Harkes <[email protected]> Commit-Queue: Tess Strickland <[email protected]>
To avoid these flags being applied to unexpected build rules, we separate them out into a separate config that is then pulled in explicitly in the `dart_precompiled_runtime{,_product}` executable rules. Only those two executables need the additional empty section: the product version because it becomes `dartaotruntime` in the SDK bundle, and the non-product version because some of our tests build standalone executables using it. The linker flags were originally removed due to flutter/flutter#112687. With the recent removal of bitcode support from Flutter (see flutter/flutter#107883), I can build the `ios_release` target locally without any issue. This also reverts the clang DEPS changes from CL 256208 (again). TEST=pkg/dartdev/test/commands/compile_test Bug: #49783 Cq-Include-Trybots: luci.dart.try:analyzer-mac-release-try,dart-sdk-mac-arm64-try,dart-sdk-mac-try,pkg-mac-release-arm64-try,pkg-mac-release-try,vm-kernel-precomp-mac-product-x64-try,vm-kernel-precomp-nnbd-mac-release-arm64-try Change-Id: Ie46402ec2eeda23109247eb9d7a64935ec2052cb Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/262429 Reviewed-by: Martin Kustermann <[email protected]> Commit-Queue: Tess Strickland <[email protected]>
Now that 9df6656 has appropriately propagated and the build rules in google3 have been updated, re-add checks to ensure that the empty section used to pad the original header exists and that the new header is smaller than the old header. Bug: #49783 Change-Id: I5b8b31bf9edf4814686eb7928457a9a7d53c0727 Cq-Include-Trybots: luci.dart.try:dart-sdk-mac-try,dart-sdk-mac-arm64-try,pkg-mac-release-arm64-try,pkg-mac-release-try,vm-kernel-mac-release-arm64-try,vm-kernel-mac-release-x64-try Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/275622 Commit-Queue: Tess Strickland <[email protected]> Reviewed-by: Daco Harkes <[email protected]> Reviewed-by: Alexander Markov <[email protected]>
The crash was first noticed here:
https://logs.chromium.org/logs/dart-internal/buildbucket/cr-buildbucket/8805124329898895137/+/u/compile_script/stdout
The program being compiled was
bin/find_base_commit.dart
from https://dart.googlesource.com/dart_ci/+/refs/heads/main/builder/.The crash can also be reproduced with a "Hello World" program.
The text was updated successfully, but these errors were encountered: