Commit 5c9513a

[NVPTX] cap param alignment at 128 (max supported by ptx) (#96117)
Cap the alignment to 128 bytes as that is the maximum alignment supported by PTX. The restriction is mentioned in the parameter passing section (Note D) of the [PTX Writer's Guide to Interoperability](https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/index.html#parameter-passing):

> D. The alignment must be 1, 2, 4, 8, 16, 32, 64, or 128 bytes.
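The restriction quoted above can be sketched as a small standalone predicate plus a clamp. This is illustrative only: `kMaxPtxParamAlign`, `isValidPtxParamAlign`, and `capToPtxMax` are hypothetical names, not LLVM or PTX APIs, and plain integers stand in for LLVM's `Align` type.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical constant: the largest .param alignment allowed by Note D.
constexpr uint64_t kMaxPtxParamAlign = 128;

// True if `a` is one of 1, 2, 4, 8, 16, 32, 64, or 128
// (a nonzero power of two that does not exceed the maximum).
constexpr bool isValidPtxParamAlign(uint64_t a) {
  return a != 0 && (a & (a - 1)) == 0 && a <= kMaxPtxParamAlign;
}

// Clamp an alignment to the PTX maximum. A power-of-two input
// stays a power of two, so the result is always a legal value.
constexpr uint64_t capToPtxMax(uint64_t a) {
  return std::min(kMaxPtxParamAlign, a);
}

static_assert(!isValidPtxParamAlign(256), "256 exceeds the PTX maximum");
static_assert(isValidPtxParamAlign(capToPtxMax(256)), "capped value is legal");
```

Alignments at or below 128 pass through the clamp unchanged, so only over-aligned types are affected.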
1 parent 0258a60 commit 5c9513a

2 files changed: +19 -3 lines changed


llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

Lines changed: 5 additions & 3 deletions
@@ -5038,7 +5038,9 @@ bool NVPTXTargetLowering::getTgtMemIntrinsic(
 /// ensures that alignment is 16 or greater.
 Align NVPTXTargetLowering::getFunctionParamOptimizedAlign(
     const Function *F, Type *ArgTy, const DataLayout &DL) const {
-  const uint64_t ABITypeAlign = DL.getABITypeAlign(ArgTy).value();
+  // Capping the alignment to 128 bytes as that is the maximum alignment
+  // supported by PTX.
+  const Align ABITypeAlign = std::min(Align(128), DL.getABITypeAlign(ArgTy));
 
   // If a function has linkage different from internal or private, we
   // must use default ABI alignment as external users rely on it. Same
@@ -5048,10 +5050,10 @@ Align NVPTXTargetLowering::getFunctionParamOptimizedAlign(
                      /*IgnoreCallbackUses=*/false,
                      /*IgnoreAssumeLikeCalls=*/true,
                      /*IgnoreLLVMUsed=*/true))
-    return Align(ABITypeAlign);
+    return ABITypeAlign;
 
   assert(!isKernelFunction(*F) && "Expect kernels to have non-local linkage");
-  return Align(std::max(uint64_t(16), ABITypeAlign));
+  return std::max(Align(16), ABITypeAlign);
 }
 
 /// Helper for computing alignment of a device function byval parameter.
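As a rough standalone model of the patched function, the two return paths might look like the sketch below. It uses plain integers instead of `Align`, and the boolean `mustKeepAbiAlign` is a hypothetical stand-in for the linkage and address-taken checks, whose full condition is not shown in the hunk above.

```cpp
#include <algorithm>
#include <cstdint>

// Sketch only: models the post-patch logic of
// getFunctionParamOptimizedAlign with integers.
//   abiAlign         - the type's ABI alignment in bytes (a power of two)
//   mustKeepAbiAlign - hypothetical flag: true when external users may
//                      rely on the default ABI alignment
constexpr uint64_t optimizedParamAlign(uint64_t abiAlign,
                                       bool mustKeepAbiAlign) {
  // Step 1: cap at 128, the maximum alignment PTX supports.
  const uint64_t capped = std::min<uint64_t>(128, abiAlign);

  // Step 2: functions visible to external users keep the (capped) ABI value.
  if (mustKeepAbiAlign)
    return capped;

  // Step 3: otherwise raise the alignment to at least 16.
  return std::max<uint64_t>(16, capped);
}

static_assert(optimizedParamAlign(256, true) == 128, "capped on ABI path");
static_assert(optimizedParamAlign(4, false) == 16, "raised to 16 when local");
```

Because the cap is applied before either return, both paths are bounded by 128, which is why the patch can drop the `Align(...)` wrappers and compare `Align` values directly.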

llvm/test/CodeGen/NVPTX/max-align.ll

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+; RUN: llc < %s -march=nvptx64 -O0 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -march=nvptx64 -O0 | %ptxas-verify %}
+
+
+; CHECK: .visible .func (.param .align 128 .b8 func_retval0[256]) repro()
+define <64 x i32> @repro() {
+
+; CHECK: .param .align 128 .b8 retval0[256];
+  %1 = tail call <64 x i32> @test(i32 0)
+  ret <64 x i32> %1
+}
+
+; Function Attrs: nounwind
+declare <64 x i32> @test(i32)
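The numbers in the CHECK lines can be worked through with a short sketch. It assumes that a vector type with no explicit data-layout entry gets its natural alignment (its size rounded up to a power of two), which is my understanding of LLVM's default and is not stated in the commit. `ParamDecl`, `nextPowerOf2`, and `vectorReturnParam` are hypothetical helpers, not LLVM APIs.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical record of what the emitted `.param .align A .b8 name[S]`
// declaration would carry.
struct ParamDecl {
  uint64_t align;        // A: alignment in bytes
  uint64_t sizeInBytes;  // S: byte-array length
};

// Smallest power of two that is >= n (for n >= 1).
constexpr uint64_t nextPowerOf2(uint64_t n) {
  uint64_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Assumed model: natural vector alignment, then the 128-byte PTX cap.
constexpr ParamDecl vectorReturnParam(uint64_t numElts, uint64_t eltBytes) {
  const uint64_t size = numElts * eltBytes;
  return ParamDecl{std::min<uint64_t>(128, nextPowerOf2(size)), size};
}

// <64 x i32>: 64 * 4 = 256 bytes, natural alignment 256, capped to 128.
static_assert(vectorReturnParam(64, 4).sizeInBytes == 256, "array length");
static_assert(vectorReturnParam(64, 4).align == 128, "capped alignment");
```

Under that assumption, the uncapped alignment for `<64 x i32>` would be 256, which falls outside the set of values PTX accepts; the test pins the emitted alignment at 128.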
