Implement reserveAllocationSpace for SectionMemoryManager #71968
✅ With the latest revision this PR passed the C/C++ code formatter.
gmarkall
left a comment
I've tested a fairly similar change (based off these changes) in numba/llvmlite#1009 and the only issue I came across was that the alignment for the code segment requested during reserving the allocation can be smaller than the alignment requested when allocating the code segment - this is because the alignment for the code segment at allocation time takes into account the alignment of the stub (from here).
The ultimate effect of this is that sometimes the preallocation can be too small for the later allocation if the code size is right up close to a boundary (a page size boundary?) - for example I saw a preallocation request for 16380 bytes with alignment 4 resulting in an actual preallocation of 16384 bytes, then a later code allocation for 16379 bytes with alignment 8, which ended up trying to use 16392 bytes, slightly larger than the preallocation, and failing.
I hacked around this by increasing the code alignment to 8 if it was less than 8 (simply because that seems to be the biggest stub alignment potentially used across all targets) and that seemed to resolve the issue. Ideally I would have queried the stub alignment, but I don't think there's an easy way to do that from within an RTDyldMemoryManager.
Perhaps the most correct fix is for RuntimeDyldImpl::computeTotalAllocSize() to take the stub alignment into consideration when computing the code segment alignment?
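The mismatch described in this comment can be reproduced with plain arithmetic. The sketch below assumes a padded-size rule of `Alignment * ((Size + Alignment - 1) / Alignment + 1)`, which reproduces the 16384 and 16392 byte figures quoted above. The helper names `paddedSize`, `effectiveCodeAlign`, and `reservationCovers` are hypothetical and are not part of LLVM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed padding rule: round Size up to a multiple of Alignment, then add
// one extra Alignment unit of slack so the start address can be aligned.
inline uint64_t paddedSize(uint64_t Size, uint64_t Alignment) {
  return Alignment * ((Size + Alignment - 1) / Alignment + 1);
}

// Workaround from the comment: never reserve code with an alignment below 8,
// the largest stub alignment the commenter believes is in use.
inline uint64_t effectiveCodeAlign(uint64_t RequestedAlign) {
  return std::max<uint64_t>(RequestedAlign, 8);
}

// True if a reservation made with (ReserveSize, ReserveAlign) is large enough
// for a later allocation made with (AllocSize, AllocAlign).
inline bool reservationCovers(uint64_t ReserveSize, uint64_t ReserveAlign,
                              uint64_t AllocSize, uint64_t AllocAlign) {
  return paddedSize(ReserveSize, ReserveAlign) >=
         paddedSize(AllocSize, AllocAlign);
}
```

With the reported values, `reservationCovers(16380, 4, 16379, 8)` is false, while passing `effectiveCodeAlign(4)` as the reservation alignment makes the reservation just large enough.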
Also, I should have said earlier: many thanks for all your efforts and the trouble you've gone to to put this together - it is certainly looking promising, from the perspective of Numba on AArch64 platforms!
This is done now, I think? Or are you planning to add more tests?
Implements `reserveAllocationSpace` and provides an option to enable `needsToReserveAllocationSpace` for large-memory environments with AArch64. The [AArch64 ABI](https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst) has limits on the distance between sections as the instructions to reference them are limited to 2 or 4GB. Allocating sections in multiple blocks can result in distances greater than that on systems with lots of memory. In those environments several projects using SectionMemoryManager with MCJIT have run across assertion failures for the R_AARCH64_ADR_PREL_PG_HI21 instruction as it attempts to address across distances greater than 2GB (an int32). Fixes llvm#71963 by allocating all sections in a single contiguous memory allocation, limiting the distance required for instruction offsets similar to how pre-compiled binaries would be loaded into memory. Does not change the default behavior of SectionMemoryManager.
The implementation of `reserveAllocationSpace()` now more closely follows that in llvm/llvm-project#71968, following some changes made there. The changes here include:
- Improved readability of debugging output.
- Using a default alignment of 8 in `allocateSection()` to match the default alignment provided by the stub alignment during preallocation.
- Replacing the "bespoke" `requiredPageSize()` function with computations using the LLVM `alignTo()` function.
- Returning early from preallocation when no space is requested.
- Reusing existing preallocations if there is enough space left over from the previous preallocation for all the required segments - this can happen quite frequently because allocations for each segment get rounded up to page sizes, which are usually either 4K or 16K, and many Numba-jitted functions require a lot less than this.
- Removal of setting the near hints for memory blocks - this doesn't really have any use when all memory is preallocated, and forced to be "near" to other memory.
- Addition of extra asserts to validate alignment of allocated sections.
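The page rounding, early return, and reuse check in the list above can be sketched as follows. This is a simplified illustration, not the llvmlite or LLVM code: it tracks leftover space as a single byte count, whereas a real memory manager would track free space per segment. The local `alignTo` is a stand-in for `llvm::alignTo`, and `SegmentSizes`, `requiredBytes`, and `needsNewReservation` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for llvm::alignTo: round Value up to a multiple of Align.
inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Sizes requested for the three segments of one object.
struct SegmentSizes {
  uint64_t Code;
  uint64_t ROData;
  uint64_t RWData;
};

// Each segment is rounded up to whole pages before being summed.
inline uint64_t requiredBytes(const SegmentSizes &S, uint64_t PageSize) {
  return alignTo(S.Code, PageSize) + alignTo(S.ROData, PageSize) +
         alignTo(S.RWData, PageSize);
}

// Decide whether a fresh reservation is needed, given the bytes left over
// from the previous one.
inline bool needsNewReservation(uint64_t FreeBytes, const SegmentSizes &S,
                                uint64_t PageSize) {
  uint64_t Required = requiredBytes(S, PageSize);
  if (Required == 0)
    return false; // Nothing requested: return early.
  return FreeBytes < Required; // Reuse the old reservation when it fits.
}
```

For example, a request of 100 bytes of code and 200 bytes of read-only data on a 4K-page system rounds up to 8192 bytes, so a previous reservation with at least that much left over could be reused.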
Now that numba/llvmlite#1009 (which essentially implements the same changes made here, just inside llvmlite) is merged and an RC has been produced, I've received various reports that this fixes issues related to #71963 on both macOS and Linux AArch64 systems, and no reports of adverse effects - so FWIW, my confidence in this patch being a good fix is quite high.
Hi - is this PR still in work? The changes in this PR helped to resolve a bug that we saw in CUDA-Q when using ARM, so I was curious if this had a path forward for getting merged into
Just to add, we've been using this implementation in Numba / llvmlite for a few months now:
So as far as I can tell, the implementation here seems quite robust.
* Fix LLVM aarch64 relocation overflow. Diffs originated from llvm/llvm-project#71968 and were modified to target the specific version of LLVM in use by CUDA Quantum (16.0.6).
* Change default ReserveAlloc value from false to true
The patched code from llvm/llvm-project#71968, moved into a new class SafeSectionMemoryManager, adjusted to work on LLVM < 16, and used in place of the regular memory manager. XXX experimental XXX this may be a terrible idea XXX several details include #include directives would need adjustment for prehistoric LLVM versions Reported-by: Anthonin Bonnefoy <[email protected]> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com Signed-off-by: Anthonin Bonnefoy <[email protected]>
To add an additional impacted project, I had this issue happening on PostgreSQL instances. I've written more details about the impact in a message to the pgsql-hackers mailing list. I've also tested the patched SectionMemoryManager on an impacted database and it fixed the segfaults.
Supply a new memory manager for RuntimeDyld, so that we can avoid putting ARM code too far apart. This is the code from llvm/llvm-project#71968, copied into our tree and moved into a new namespace llvm::backport, and adjusted to work on older LLVM versions. This should fix the spate of crashes we've been receiving lately from users on ARM systems. Reported-by: Anthonin Bonnefoy <[email protected]> Reviewed-by: Anthonin Bonnefoy <[email protected]> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
Supply a new memory manager for RuntimeDyld that avoids putting ARM code too far apart. This is the code from llvm/llvm-project#71968, copied into our tree and moved into a new namespace llvm::backport, and adjusted to work on older LLVM versions. This should fix the spate of crashes we've been receiving lately from users on ARM systems. Reported-by: Anthonin Bonnefoy <[email protected]> Reviewed-by: Anthonin Bonnefoy <[email protected]> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
Supply a new memory manager for RuntimeDyld that avoids putting ARM code too far apart. This is the code from llvm/llvm-project#71968, copied into our tree and moved into a new namespace llvm::backport, and adjusted to work on older LLVM versions. This should fix the spate of crashes we've been receiving lately from users on ARM systems. XXX Ideally the LLVM project will commit this, and then we can resync with the code in the LLVM 19.x stable branch, instead of using the code from their PR, before we ship it! Reported-by: Anthonin Bonnefoy <[email protected]> Reviewed-by: Anthonin Bonnefoy <[email protected]> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
Supply a new memory manager for RuntimeDyld that avoids putting ARM code too far apart. This is the code from llvm/llvm-project#71968, copied into our tree and moved into a new namespace llvm::backport, with minor adjustments to work on LLVM 12-18. This should fix the spate of crashes we've been receiving lately from users on ARM systems. XXX Ideally the LLVM project will commit this, and then we can resync with the code in the LLVM 19.x stable branch, instead of using the code from their PR, before we ship it! Reported-by: Anthonin Bonnefoy <[email protected]> Reviewed-by: Anthonin Bonnefoy <[email protected]> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
@gmarkall's comment sounds promising:
On that basis I'm ok with this landing in main. However, if we see any issues related to it the bias should be to revert, rather than try to fix issues (if the fixes would introduce further changes in behavior): We're discussing deprecation and removal of MCJIT and RuntimeDyld (see https://discourse.llvm.org/t/rfc-add-deprecation-warnings-to-mcjit-and-runtimedyld/80465 and https://discourse.llvm.org/t/rfc-removing-mcjit-and-runtimedyld/80464) and the goal at this point is stability rather than bug-fixes. If any of you are using MCJIT directly you'll want to look at switching to ORC / LLJIT. If you're already on ORC / LLJIT please be aware that we'll be switching to
I have encountered #71963 in the context of the MLIR
I didn't see any further feedback to address. Is this waiting on me somehow?
I'd add that we've continued to use this with no issues reported in Numba / llvmlite.
Nope -- this is just me getting swamped and losing track of this. Thanks for the ping! I'll merge this, but take the opportunity to remind everyone: MCJIT is going away eventually -- you should move to ORC if you can, and file LLVM issues if you can't.
Implements `reserveAllocationSpace` and provides an option to enable `needsToReserveAllocationSpace` for large-memory environments with AArch64. The [AArch64 ABI](https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#7code-models) has restrictions on the distance between TEXT and GOT sections as the instructions to reference them are limited to 2 or 4GB. Allocating sections in multiple blocks can result in distances greater than that on systems with lots of memory. In those environments several projects using SectionMemoryManager with MCJIT have run across assertion failures for the R_AARCH64_ADR_PREL_PG_HI21 instruction as it attempts to address across distances greater than 2GB (an int32). Fixes llvm#71963 by allocating all sections in a single contiguous memory allocation, limiting the distance required for instruction offsets similar to how pre-compiled binaries would be loaded into memory. Co-authored-by: Lang Hames <[email protected]>
…rm (llvm#172833) This PR enables JIT initialize for AArch64. Up to now it was disabled because of llvm#71963 which was recently fixed by llvm#71968.
Implements `reserveAllocationSpace` and provides an option to enable `needsToReserveAllocationSpace` for large-memory environments with AArch64.
The AArch64 ABI has restrictions on the distance between TEXT and GOT sections as the instructions to reference them are limited to 2 or 4GB. Allocating sections in multiple blocks can result in distances greater than that on systems with lots of memory. In those environments several projects using SectionMemoryManager with MCJIT have run across assertion failures for the R_AARCH64_ADR_PREL_PG_HI21 relocation as it attempts to address across distances greater than 2GB (an int32).
Fixes #71963 by allocating all sections in a single contiguous memory allocation, limiting the distance required for instruction offsets similar to how pre-compiled binaries would be loaded into memory.
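To make the distance limit concrete, here is a minimal sketch of a range check for an `ADRP`-style page-relative reference, the kind of reference that R_AARCH64_ADR_PREL_PG_HI21 patches. It assumes 4 KiB pages and the 21-bit signed page immediate of `ADRP`, which gives a reach of roughly ±4 GiB. The names `pageDelta` and `fitsAdrp` are hypothetical, and this is an illustration of the constraint rather than the check RuntimeDyld performs.

```cpp
#include <cassert>
#include <cstdint>

// ADRP works on 4 KiB pages: the low 12 bits of both addresses are ignored.
constexpr uint64_t PageMask = ~UINT64_C(0xfff);

// Signed distance in bytes between the page of Target and the page of Place
// (the address of the instruction being patched).
inline int64_t pageDelta(uint64_t Target, uint64_t Place) {
  return static_cast<int64_t>((Target & PageMask) - (Place & PageMask));
}

// A 21-bit signed page count shifted left by 12 covers [-2^32, 2^32) bytes.
inline bool fitsAdrp(uint64_t Target, uint64_t Place) {
  int64_t Delta = pageDelta(Target, Place);
  return Delta >= -(INT64_C(1) << 32) && Delta < (INT64_C(1) << 32);
}
```

Two sections placed in one contiguous reservation will normally pass this check, whereas sections mapped independently on a machine with a large address space (for example at `0x550000000000` and `0x7f0000000000`) will not.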