Commit 83fe1c1

[SYCL][NVPTX] Obey -fcuda-short-ptr when compiling SYCL for NVPTX (#15642)
This flag turns pointers to CUDA's `shared`, `const`, and `local` address spaces into 32-bit pointers, which can potentially save registers used for addressing calculations. The frontend accepted the option when compiling SYCL code, but compilation then failed with an error that the backend datalayout doesn't match the expected target description. The cause was that not all parts of the toolchain picked up the option, which led to inconsistencies. This PR allows users to pass the option if they wish. They will see a warning that the compiler is linking against a libclc/libspirv that hasn't been compiled with this option, but this is likely harmless since libspirv doesn't manipulate pointers.
1 parent 8106796 commit 83fe1c1

3 files changed, with 43 additions and 1 deletion

clang/lib/Driver/ToolChains/Clang.cpp

Lines changed: 2 additions & 1 deletion
@@ -8593,7 +8593,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     }
   }
 
-  if (IsCuda) {
+  // Propagate -fcuda-short-ptr if compiling CUDA or SYCL for NVPTX
+  if (IsCuda || (IsSYCLDevice && Triple.isNVPTX())) {
     if (Args.hasFlag(options::OPT_fcuda_short_ptr,
                      options::OPT_fno_cuda_short_ptr, false))
       CmdArgs.push_back("-fcuda-short-ptr");
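The condition in this hunk can be modeled outside of Clang as a small standalone sketch. The names `hasFlag` and `forwardsShortPtr` below are hypothetical stand-ins written for illustration; they are not the driver's code. The sketch assumes the usual `ArgList::hasFlag` behaviour, where the last occurrence of either spelling wins and the default applies when neither is present.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::opt::ArgList::hasFlag: the last occurrence
// of either spelling wins; if neither appears, the default is returned.
bool hasFlag(const std::vector<std::string> &args, const std::string &pos,
             const std::string &neg, bool defaultValue) {
  bool result = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      result = true;
    else if (arg == neg)
      result = false;
  }
  return result;
}

// Hypothetical model of the patched condition: the flag is forwarded for CUDA
// compilations, or for SYCL device compilations that target NVPTX.
bool forwardsShortPtr(bool isCuda, bool isSYCLDevice, bool isNVPTX,
                      const std::vector<std::string> &args) {
  if (!(isCuda || (isSYCLDevice && isNVPTX)))
    return false;
  return hasFlag(args, "-fcuda-short-ptr", "-fno-cuda-short-ptr", false);
}
```

Under this model, a SYCL device compilation for a non-NVPTX target never forwards the flag, and a trailing `-fno-cuda-short-ptr` cancels an earlier `-fcuda-short-ptr`.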
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+// Check that we see the expected data layouts for NVPTX when we pass the
+// -nvptx-short-ptr option.
+
+// RUN: %clang_cc1 -fsycl-is-device -disable-llvm-passes \
+// RUN:   -triple nvptx-nvidia-cuda -emit-llvm %s -o - \
+// RUN:   | FileCheck %s --check-prefix CHECK32
+
+// RUN: %clang_cc1 -fsycl-is-device -disable-llvm-passes \
+// RUN:   -triple nvptx-nvidia-cuda -emit-llvm -fcuda-short-ptr -mllvm -nvptx-short-ptr %s -o - \
+// RUN:   | FileCheck %s --check-prefix CHECK32
+
+// RUN: %clang_cc1 -fsycl-is-device -disable-llvm-passes \
+// RUN:   -triple nvptx64-nvidia-cuda -emit-llvm %s -o - \
+// RUN:   | FileCheck %s --check-prefix CHECK64-DEFAULT
+
+// RUN: %clang_cc1 -fsycl-is-device -disable-llvm-passes \
+// RUN:   -triple nvptx64-nvidia-cuda -emit-llvm -fcuda-short-ptr -mllvm -nvptx-short-ptr %s -o - \
+// RUN:   | FileCheck %s --check-prefix CHECK64-SHORT
+
+// Targeting a 32-bit NVPTX, check that we see universal 32-bit pointers (the
+// option changes nothing)
+// CHECK32: target datalayout = "e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+
+// Targeting a 64-bit NVPTX target, check that we see 32-bit pointers for
+// shared (3), const (4), and local (5) address spaces only.
+// CHECK64-DEFAULT: target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
+// CHECK64-SHORT: target datalayout = "e-p3:32:32-p4:32:32-p5:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
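The three layouts this test expects differ only in their pointer components, which a short sketch can make explicit. The function `nvptxDataLayout` below is a hypothetical helper written to reproduce the strings in the CHECK lines; it is an illustration under that assumption, not the NVPTX backend's actual implementation.

```cpp
#include <string>

// Hypothetical helper that rebuilds the datalayout strings checked by the
// test above from two inputs: the target width and the short-pointer option.
std::string nvptxDataLayout(bool is64Bit, bool useShortPointers) {
  std::string layout = "e"; // little-endian
  if (!is64Bit) {
    // 32-bit target: every pointer is already 32 bits, so the option
    // changes nothing.
    layout += "-p:32:32";
  } else if (useShortPointers) {
    // 64-bit target with short pointers: only address spaces 3 (shared),
    // 4 (const), and 5 (local) shrink to 32 bits.
    layout += "-p3:32:32-p4:32:32-p5:32:32";
  }
  // Components common to all three expected layouts.
  layout += "-i64:64-i128:128-v16:16-v32:32-n16:32:64";
  return layout;
}
```

With no pointer component at all, as in the 64-bit default case, LLVM falls back to its default 64-bit pointer layout, which is why `CHECK64-DEFAULT` has no `p` entry.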
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+// RUN: %clang -### -nocudalib \
+// RUN:   -fsycl -fsycl-targets=nvptx64-nvidia-cuda %s 2>&1 \
+// RUN:   | FileCheck --check-prefix=CHECK-DEFAULT %s
+
+// RUN: %clang -### -nocudalib \
+// RUN:   -fsycl -fsycl-targets=nvptx64-nvidia-cuda -fcuda-short-ptr %s 2>&1 \
+// RUN:   | FileCheck --check-prefix=CHECK-SHORT %s
+
+
+// CHECK-SHORT: "-mllvm" "--nvptx-short-ptr"
+// CHECK-SHORT: "-fcuda-short-ptr"
+
+// CHECK-DEFAULT-NOT: "--nvptx-short-ptr"
+// CHECK-DEFAULT-NOT: "-fcuda-short-ptr"
