@MikaelSmith
Contributor

@MikaelSmith MikaelSmith commented Nov 10, 2023

Implements reserveAllocationSpace and provides an option to enable needsToReserveAllocationSpace for large-memory environments with AArch64.

The AArch64 ABI has restrictions on the distance between TEXT and GOT sections, as the instructions that reference them are limited to a 2 or 4GB range. Allocating sections in multiple blocks can result in distances greater than that on systems with lots of memory. In those environments, several projects using SectionMemoryManager with MCJIT have run into assertion failures for the R_AARCH64_ADR_PREL_PG_HI21 relocation when the target lies beyond the ±4GB range that its ADRP instruction can encode.
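
A minimal sketch of the range check behind that assertion, assuming the rule the AArch64 ELF ABI gives for R_AARCH64_ADR_PREL_PG_HI21: the page delta Page(S+A) - Page(P) must fit in a signed 33-bit value, which works out to ±4GB. The helper names are hypothetical, and this is not the RuntimeDyldELF code itself.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not taken from RuntimeDyldELF.
constexpr std::uint64_t kPageMask = ~std::uint64_t{0xfff};

// Page(target) - Page(place): the quantity the relocation has to encode.
// `target` stands for S + A, `place` for P.
constexpr std::int64_t pageDelta(std::uint64_t target, std::uint64_t place) {
  return static_cast<std::int64_t>((target & kPageMask) - (place & kPageMask));
}

// True when the delta fits a signed 33-bit value: -2^32 <= delta < 2^32.
constexpr bool fitsAdrp(std::uint64_t target, std::uint64_t place) {
  const std::int64_t delta = pageDelta(target, place);
  return delta >= -(std::int64_t{1} << 32) && delta < (std::int64_t{1} << 32);
}

// Sections mapped 4MB apart are reachable.
static_assert(fitsAdrp(0x7f0000400000, 0x7f0000000000), "4MB apart fits");
// Sections mapped 8GB apart are not, which is the failure mode described above.
static_assert(!fitsAdrp(0x7f0200000000, 0x7f0000000000), "8GB apart overflows");
```

On a machine with a large address space, two independent mappings can easily land further apart than this, which is why the failure tends to show up on large-memory systems.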

Fixes #71963 by allocating all sections in a single contiguous memory allocation, limiting the distance required for instruction offsets similar to how pre-compiled binaries would be loaded into memory.
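
To illustrate why a single contiguous reservation bounds the distances involved, here is a minimal sketch that carves code, read-only data and read-write data segments out of one block. `Layout`, `carve` and the page-rounding policy are hypothetical and are not the PR's implementation.

```cpp
#include <cstdint>

// Hypothetical layout record; the field names are illustrative only.
struct Layout {
  std::uint64_t codeBase;
  std::uint64_t roDataBase;
  std::uint64_t rwDataBase;
  std::uint64_t end;
};

// Round `value` up to the next multiple of `align`.
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Place three page-rounded segments back to back, starting at `base`.
constexpr Layout carve(std::uint64_t base, std::uint64_t codeSize,
                       std::uint64_t roSize, std::uint64_t rwSize,
                       std::uint64_t pageSize) {
  Layout l{};
  l.codeBase = base;
  l.roDataBase = l.codeBase + alignUp(codeSize, pageSize);
  l.rwDataBase = l.roDataBase + alignUp(roSize, pageSize);
  l.end = l.rwDataBase + alignUp(rwSize, pageSize);
  return l;
}

// 5000 + 100 + 9000 bytes with 4K pages occupy 2 + 1 + 3 pages.
static_assert(carve(0x10000, 5000, 100, 9000, 4096).roDataBase == 0x12000, "");
static_assert(carve(0x10000, 5000, 100, 9000, 4096).rwDataBase == 0x13000, "");
static_assert(carve(0x10000, 5000, 100, 9000, 4096).end == 0x16000, "");
```

Because every segment lies inside `[codeBase, end)`, no reference between them can span more than the size of the block, regardless of where the operating system places the mapping.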

@github-actions

github-actions bot commented Nov 10, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@MikaelSmith MikaelSmith force-pushed the prealloc-memory branch 2 times, most recently from 89ceb93 to c04dc5a November 10, 2023 23:02
Contributor

@gmarkall gmarkall left a comment

I've tested a fairly similar change (based on these changes) in numba/llvmlite#1009. The only issue I came across was that the code segment alignment requested when reserving the allocation can be smaller than the alignment requested when allocating the code segment. This is because the alignment used at allocation time takes the stub alignment into account (from here).

The ultimate effect of this is that sometimes the preallocation can be too small for the later allocation if the code size is right up close to a boundary (a page size boundary?) - for example I saw a preallocation request for 16380 bytes with alignment 4 resulting in an actual preallocation of 16384 bytes, then a later code allocation for 16379 bytes with alignment 8, which ended up trying to use 16392 bytes, slightly larger than the preallocation, and failing.
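
The numbers in this example can be reproduced with a small sketch. It assumes the per-section padding has the form `Alignment * ((Size + Alignment - 1) / Alignment + 1)`, which appears to be what `SectionMemoryManager::allocateSection` computes but should be checked against the LLVM version in use. The 4K page size and the helper names are also assumptions.

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `align`.
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Assumed padding at allocation time: the aligned size plus one extra
// alignment unit, so the start address can be adjusted inside the block.
constexpr std::uint64_t paddedSize(std::uint64_t size, std::uint64_t align) {
  return align * ((size + align - 1) / align + 1);
}

// Reservation: 16380 bytes rounded up to a (hypothetical) 4K page boundary.
static_assert(alignUp(16380, 4096) == 16384, "reservation is 16384 bytes");
// Later allocation: 16379 bytes with the stub-adjusted alignment of 8.
static_assert(paddedSize(16379, 8) == 16392, "allocation needs 16392 bytes");
// The allocation no longer fits in what was reserved.
static_assert(paddedSize(16379, 8) > alignUp(16380, 4096), "8 bytes short");
```

The shortfall is only 8 bytes, so it appears only when the code size sits just below a boundary.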

I hacked around this by increasing the code alignment to 8 if it was less than 8 (simply because that seems to be the biggest stub alignment potentially used across all targets) and that seemed to resolve the issue. Ideally I would have queried the stub alignment, but I don't think there's an easy way to do that from within an RTDyldMemoryManager.
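
A sketch of that workaround, under the same assumed padding formula as the allocation path and a hypothetical 4K page size: the reservation clamps the code alignment to at least 8 before sizing, so it can no longer come out smaller than the later allocation. All names are illustrative.

```cpp
#include <cstdint>

// Largest stub alignment assumed across targets (the value from the comment).
constexpr std::uint64_t kMinCodeAlign = 8;

// Raise a smaller requested alignment to the assumed stub alignment.
constexpr std::uint64_t clampCodeAlign(std::uint64_t requested) {
  return requested < kMinCodeAlign ? kMinCodeAlign : requested;
}

// Round `value` up to the next multiple of `align`.
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Assumed allocation-time padding: aligned size plus one alignment unit.
constexpr std::uint64_t paddedSize(std::uint64_t size, std::uint64_t align) {
  return align * ((size + align - 1) / align + 1);
}

// Reservation sized with the clamped alignment, then rounded to a page.
constexpr std::uint64_t reserveSize(std::uint64_t size, std::uint64_t align,
                                    std::uint64_t pageSize) {
  return alignUp(paddedSize(size, clampCodeAlign(align)), pageSize);
}

static_assert(clampCodeAlign(4) == 8, "small alignments are raised to 8");
static_assert(clampCodeAlign(16) == 16, "larger alignments are kept");
// The failing case above now reserves enough for the 16392-byte allocation.
static_assert(reserveSize(16380, 4, 4096) >= paddedSize(16379, 8), "fits");
```

The cost is at most one extra page per reservation in the boundary case.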

Perhaps the most correct fix is for RuntimeDyldImpl::computeTotalAllocSize() to take the stub alignment into consideration when computing the code segment alignment?

@gmarkall
Contributor

Also, I should have said earlier: many thanks for all your efforts and the trouble you've gone to to put this together - it is certainly looking promising, from the perspective of Numba on AArch64 platforms!

@gmarkall
Contributor

> TODO: add tests to MCJITMemoryManagerTest

This is done now, I think? Or are you planning to add more tests?

Implements `reserveAllocationSpace` and provides an option to enable
`needsToReserveAllocationSpace` for large-memory environments with
AArch64.

The [AArch64 ABI](https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst)
has limits on the distance between sections as the instructions to
reference them are limited to 2 or 4GB. Allocating sections in multiple
blocks can result in distances greater than that on systems with lots of
memory. In those environments several projects using
SectionMemoryManager with MCJIT have run across assertion failures for
the R_AARCH64_ADR_PREL_PG_HI21 instruction as it attempts to address
across distances greater than 2GB (an int32).

Fixes llvm#71963 by allocating all sections in a single contiguous memory
allocation, limiting the distance required for instruction offsets
similar to how pre-compiled binaries would be loaded into memory. Does
not change the default behavior of SectionMemoryManager.
gmarkall added a commit to gmarkall/llvmlite that referenced this pull request Nov 22, 2023
The implementation of `reserveAllocationSpace()` now more closely
follows that in llvm/llvm-project#71968,
following some changes made there.

The changes here include:

- Improved readability of debugging output
- Using a default alignment of 8 in `allocateSection()` to match the
  default alignment provided by the stub alignment during preallocation.
- Replacing the "bespoke" `requiredPageSize()` function with
  computations using the LLVM `alignTo()` function.
- Returning early from preallocation when no space is requested.
- Reusing existing preallocations if there is enough space left over
  from the previous preallocation for all the required segments - this
  can happen quite frequently because allocations for each segment get
  rounded up to page sizes, which are usually either 4K or 16K, and many
  Numba-jitted functions require a lot less than this.
- Removal of setting the near hints for memory blocks - this doesn't
  really have any use when all memory is preallocated, and forced to be
  "near" to other memory.
- Addition of extra asserts to validate alignment of allocated sections.
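
The early-return and reuse items in the list above can be sketched as a small decision function. This is a simplified illustration, not the llvmlite code: `Sizes`, `ReserveAction` and `decide` are hypothetical names, and the per-segment bookkeeping is reduced to three byte counts.

```cpp
#include <cstdint>

// Requested or remaining bytes per segment (hypothetical record).
struct Sizes {
  std::uint64_t code;
  std::uint64_t roData;
  std::uint64_t rwData;
};

enum class ReserveAction { Nothing, Reuse, MapNew };

// Round `value` up to the next multiple of `align`.
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Bytes to map for a fresh reservation: each segment rounded up to a page.
constexpr std::uint64_t mapSize(Sizes s, std::uint64_t pageSize) {
  return alignUp(s.code, pageSize) + alignUp(s.roData, pageSize) +
         alignUp(s.rwData, pageSize);
}

// Decide what a reservation request should do given the space left over
// from the previous reservation.
constexpr ReserveAction decide(Sizes requested, Sizes freeLeft) {
  if (requested.code == 0 && requested.roData == 0 && requested.rwData == 0)
    return ReserveAction::Nothing;  // nothing requested: return early
  const bool fits = requested.code <= freeLeft.code &&
                    requested.roData <= freeLeft.roData &&
                    requested.rwData <= freeLeft.rwData;
  return fits ? ReserveAction::Reuse : ReserveAction::MapNew;
}

// A 100-byte function maps one full 4K page per segment...
static_assert(mapSize(Sizes{100, 20, 8}, 4096) == 3 * 4096, "");
// ...so a second small function fits into what is left over.
static_assert(decide(Sizes{100, 20, 8}, Sizes{3996, 4076, 4088}) ==
                  ReserveAction::Reuse, "");
```

Page rounding is what makes reuse common: a small function leaves most of each page free for the next one.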
@gmarkall
Contributor

gmarkall commented Jan 2, 2024

Now that numba/llvmlite#1009 (which essentially implements the same changes made here, just inside llvmlite) is merged and an RC has been produced, I've received various reports that this fixes issues related to #71963 on both macOS and Linux AArch64 systems, and no reports of adverse effects - so FWIW, my confidence in this patch being a good fix is quite high.

@bmhowe23
Contributor

Hi - is this PR still in progress? The changes in this PR helped to resolve a bug that we saw in CUDA-Q when using ARM, so I was curious if this had a path forward for getting merged into main. (However, we did have to change the default value of ReserveAlloc from false to true in order for it to fix our problem.)

@gmarkall
Contributor

Just to add, we've been using this implementation in Numba / llvmlite for a few months now:

  • It appears to solve the ARM relocation overflow error
  • We haven't seen any new issues introduced by it

So as far as I can tell, the implementation here seems quite robust.

bmhowe23 added a commit to NVIDIA/cuda-quantum that referenced this pull request Jul 4, 2024
* Fix LLVM aarch64 relocation overflow

Diffs originated from llvm/llvm-project#71968
and were modified to target the specific version of LLVM in use by CUDA
Quantum (16.0.6).

* Change default ReserveAlloc value from false to true
bonnefoa pushed a commit to bonnefoa/postgres that referenced this pull request Aug 27, 2024
The patched code from llvm/llvm-project#71968,
moved into a new class SafeSectionMemoryManager, adjusted to work on LLVM < 16,
and used in place of the regular memory manager.

XXX experimental
XXX this may be a terrible idea
XXX several details, including #include directives, would need adjustment
for prehistoric LLVM versions

Reported-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
Signed-off-by: Anthonin Bonnefoy <[email protected]>
@bonnefoa

To add an additional impacted project, I had this issue happening on PostgreSQL instances. I've written more details about the impact in a message to the pgsql-hackers mailing list.

I've also tested the patched SectionMemoryManager on an impacted database and it fixed the segfaults.

macdice added a commit to macdice/postgres that referenced this pull request Aug 27, 2024
Supply a new memory manager for RuntimeDyld, so that we can avoid
putting ARM code too far apart.  This is the code from
llvm/llvm-project#71968, copied into our tree
and moved into a new namespace llvm::backport, and adjusted to work on
older LLVM versions.

This should fix the spate of crashes we've been receiving lately from
users on ARM systems.

Reported-by: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
macdice added a commit to macdice/postgres that referenced this pull request Aug 27, 2024
Supply a new memory manager for RuntimeDyld that avoids putting ARM code
too far apart.  This is the code from
llvm/llvm-project#71968, copied into our tree
and moved into a new namespace llvm::backport, and adjusted to work on
older LLVM versions.

This should fix the spate of crashes we've been receiving lately from
users on ARM systems.

Reported-by: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
macdice added a commit to macdice/postgres that referenced this pull request Aug 27, 2024
Supply a new memory manager for RuntimeDyld that avoids putting ARM code
too far apart.  This is the code from
llvm/llvm-project#71968, copied into our tree
and moved into a new namespace llvm::backport, and adjusted to work on
older LLVM versions.

This should fix the spate of crashes we've been receiving lately from
users on ARM systems.

XXX Ideally the LLVM project will commit this, and then we can resync
with the code in the LLVM 19.x stable branch, instead of using the code
from their PR, before we ship it!

Reported-by: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
macdice added a commit to macdice/postgres that referenced this pull request Aug 27, 2024
Supply a new memory manager for RuntimeDyld that avoids putting ARM code
too far apart.  This is the code from
llvm/llvm-project#71968, copied into our tree
and moved into a new namespace llvm::backport, with minor adjustments to
work on LLVM 12-18.

This should fix the spate of crashes we've been receiving lately from
users on ARM systems.

XXX Ideally the LLVM project will commit this, and then we can resync
with the code in the LLVM 19.x stable branch, instead of using the code
from their PR, before we ship it!

Reported-by: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
macdice added a commit to macdice/postgres that referenced this pull request Aug 28, 2024
Supply a new memory manager for RuntimeDyld that avoids putting ARM code
too far apart.  This is the code from
llvm/llvm-project#71968, copied into our tree
and moved into a new namespace llvm::backport, with minor adjustments to
work on LLVM 12-18.

This should fix the spate of crashes we've been receiving lately from
users on ARM systems.

XXX Ideally the LLVM project will commit this, and then we can resync
with the code in the LLVM 19.x stable branch, instead of using the code
from their PR, before we ship it!

Reported-by: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
@lhames
Contributor

lhames commented Aug 29, 2024

@gmarkall's comment sounds promising:

> I've received various reports that this fixes issues related to #71963 on both macOS and Linux AArch64 systems, and no reports of adverse effects - so FWIW, my confidence in this patch being a good fix is quite high.

On that basis I'm ok with this landing in main. However, if we see any issues related to it the bias should be to revert, rather than try to fix issues (if the fixes would introduce further changes in behavior): We're discussing deprecation and removal of MCJIT and RuntimeDyld (see https://discourse.llvm.org/t/rfc-add-deprecation-warnings-to-mcjit-and-runtimedyld/80465 and https://discourse.llvm.org/t/rfc-removing-mcjit-and-runtimedyld/80464) and the goal at this point is stability rather than bug-fixes.

If any of you are using MCJIT directly you'll want to look at switching to ORC / LLJIT. If you're already on ORC / LLJIT please be aware that we'll be switching to JITLink by default for jit-linking. JITLink's default memory manager already addresses the out-of-range issues, so hopefully they just won't come up after the switch.

postgresql-cfbot pushed a commit to postgresql-cfbot/postgresql that referenced this pull request Aug 30, 2024
Supply a new memory manager for RuntimeDyld that avoids putting ARM code
too far apart.  This is the code from
llvm/llvm-project#71968, copied into our tree
and moved into a new namespace llvm::backport, with minor adjustments to
work on LLVM 12-18.

This should fix the spate of crashes we've been receiving lately from
users on ARM systems.

XXX Ideally the LLVM project will commit this, and then we can resync
with the code in the LLVM 19.x stable branch, instead of using the code
from their PR, before we ship it!

Reported-by: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Anthonin Bonnefoy <[email protected]>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
MarkusPfundstein pushed a commit to fermioniq/cuda-quantum that referenced this pull request Sep 23, 2024
* Fix LLVM aarch64 relocation overflow

Diffs originated from llvm/llvm-project#71968
and were modified to target the specific version of LLVM in use by CUDA
Quantum (16.0.6).

* Change default ReserveAlloc value from false to true
@castigli
Contributor

I have encountered #71963 in the context of the MLIR ExecutionEngine. This PR fixes the bug, is there a reason why this PR was not merged in main?

@MikaelSmith
Contributor Author

I didn't see any further feedback to address. Is this waiting on me somehow?

@gmarkall
Contributor

I'd add that we've continued to use this with no issues reported in Numba / llvmlite.

@lhames
Contributor

lhames commented Dec 18, 2025

> I didn't see any further feedback to address. Is this waiting on me somehow?

Nope -- this is just me getting swamped and losing track of this. Thanks for the ping!

I'll merge this, but take the opportunity to remind everyone: MCJIT is going away eventually -- you should move to ORC if you can, and file LLVM issues if you can't.

@lhames lhames merged commit 35b2b24 into llvm:main Dec 18, 2025
10 checks passed
joker-eph pushed a commit that referenced this pull request Dec 19, 2025
…rm (#172833)

This PR enables JIT initialize for AArch64. Up to now it was disabled
because of #71963 which was recently fixed by #71968.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Dec 19, 2025
Implements `reserveAllocationSpace` and provides an option to enable
`needsToReserveAllocationSpace` for large-memory environments with
AArch64.

The [AArch64
ABI](https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#7code-models)
has restrictions on the distance between TEXT and GOT sections as the
instructions to reference them are limited to 2 or 4GB. Allocating
sections in multiple blocks can result in distances greater than that on
systems with lots of memory. In those environments several projects
using SectionMemoryManager with MCJIT have run across assertion failures
for the R_AARCH64_ADR_PREL_PG_HI21 instruction as it attempts to address
across distances greater than 2GB (an int32).

Fixes llvm#71963 by allocating all sections in a single contiguous memory
allocation, limiting the distance required for instruction offsets
similar to how pre-compiled binaries would be loaded into memory.

Co-authored-by: Lang Hames <[email protected]>
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Dec 19, 2025
…rm (llvm#172833)

This PR enables JIT initialize for AArch64. Up to now it was disabled
because of llvm#71963 which was recently fixed by llvm#71968.
Development

Successfully merging this pull request may close these issues.

Assertion failure in RuntimeDyldELF::resolveAArch64Relocation

6 participants