-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[ConstantFold] Drop gep of gep fold entirely #95126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is a followup to llvm#93823 and drop the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP. We didn't have any test coverage for this fold in LLVM, but some Clang tests change.
@llvm/pr-subscribers-llvm-analysis Author: Nikita Popov (nikic) ChangesThis is a followup to #93823 and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP. We didn't have any test coverage for this fold in LLVM, but some Clang tests change. Full diff: https://github.com/llvm/llvm-project/pull/95126.diff 10 Files Affected:
diff --git a/clang/test/CodeGenCUDA/managed-var.cu b/clang/test/CodeGenCUDA/managed-var.cu
index 5206acc62fe00..07e1a1e692c75 100644
--- a/clang/test/CodeGenCUDA/managed-var.cu
+++ b/clang/test/CodeGenCUDA/managed-var.cu
@@ -127,9 +127,10 @@ __device__ __host__ float load2() {
// HOST-LABEL: define {{.*}}@_Z5load3v()
// HOST: %ld.managed = load ptr, ptr @v2, align 16
-// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1, i32 1
-// HOST: %1 = load float, ptr %0, align 4
-// HOST: ret float %1
+// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1
+// HOST: %1 = getelementptr inbounds %struct.vec, ptr %0, i32 0, i32 1
+// HOST: %2 = load float, ptr %1, align 4
+// HOST: ret float %2
float load3() {
return v2[1].y;
}
@@ -139,10 +140,11 @@ float load3() {
// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1
// HOST: %1 = ptrtoint ptr %0 to i64
// HOST: %ld.managed1 = load ptr, ptr @v2, align 16
-// HOST: %2 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed1, i64 0, i64 1, i32 1
-// HOST: %3 = ptrtoint ptr %2 to i64
-// HOST: %4 = sub i64 %3, %1
-// HOST: %sub.ptr.div = sdiv exact i64 %4, 4
+// HOST: %2 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed1, i64 0, i64 1
+// HOST: %3 = getelementptr inbounds %struct.vec, ptr %2, i32 0, i32 1
+// HOST: %4 = ptrtoint ptr %3 to i64
+// HOST: %5 = sub i64 %4, %1
+// HOST: %sub.ptr.div = sdiv exact i64 %5, 4
// HOST: %conv = sitofp i64 %sub.ptr.div to float
// HOST: ret float %conv
float addr_taken2() {
diff --git a/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp b/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
index 4033adc8f0390..14557829268ef 100644
--- a/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
+++ b/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
@@ -21,6 +21,6 @@ struct S {
// CHECK: store i32 0, ptr @arr
// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i32 0, i32 1), ptr noundef @.str)
// CHECK: store i32 1, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 1)
-// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i64 1, i32 1), ptr noundef @.str.1)
+// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 1), i32 0, i32 1), ptr noundef @.str.1)
// CHECK: store i32 2, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 2)
-// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i64 2, i32 1), ptr noundef @.str.2)
+// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 2), i32 0, i32 1), ptr noundef @.str.2)
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
index 6fbe4c7fd17a7..caa92f47a93c2 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
@@ -79,12 +79,12 @@ std::initializer_list<std::initializer_list<int>> nested = {
// CHECK-DYNAMIC-BL: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested1_, i64 1)
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested1_,
// CHECK-DYNAMIC-BL: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), align 8
-// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1, i32 1), align 8
+// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BL: store i32 5, ptr @_ZGR6nested2_
// CHECK-DYNAMIC-BL: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested2_, i64 1)
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested2_,
// CHECK-DYNAMIC-BL: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), align 8
-// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2, i32 1), align 8
+// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested_,
// CHECK-DYNAMIC-BL: ptr @nested, align 8
// CHECK-DYNAMIC-BL: store i64 3, ptr getelementptr inbounds ({{.*}}, ptr @nested, i32 0, i32 1), align 8
@@ -119,13 +119,13 @@ std::initializer_list<std::initializer_list<int>> nested = {
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested1_,
// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([2 x i32], ptr @_ZGR6nested1_, i64 0, i64 2),
-// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1, i32 1), align 8
+// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BE: store i32 5, ptr @_ZGR6nested2_
// CHECK-DYNAMIC-BE: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested2_, i64 1)
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested2_,
// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([2 x i32], ptr @_ZGR6nested2_, i64 0, i64 2),
-// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2, i32 1), align 8
+// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested_,
// CHECK-DYNAMIC-BE: ptr @nested, align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([3 x {{.*}}], ptr @_ZGR6nested_, i64 0, i64 3),
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
index 3d0cf968bfd54..ef05f0334cd75 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -365,12 +365,12 @@ namespace partly_constant {
//
// Second init list.
// CHECK: store ptr {{.*}}@[[PARTLY_CONSTANT_SECOND]]{{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1)
- // CHECK: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1, i32 1)
+ // CHECK: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1), i32 0, i32 1)
//
// Third init list.
// CHECK-NOT: @[[PARTLY_CONSTANT_THIRD]],
// CHECK: store ptr {{.*}}@[[PARTLY_CONSTANT_THIRD]]{{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2)
- // CHECK: store i64 4, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2, i32 1)
+ // CHECK: store i64 4, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2), i32 0, i32 1)
// CHECK-NOT: @[[PARTLY_CONSTANT_THIRD]],
//
// Outer init list.
diff --git a/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp b/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
index 403e9c8427760..e3441d0b4614e 100644
--- a/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
+++ b/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
@@ -24,7 +24,7 @@ extern "C" int test_param_field(A p) {
extern "C" int test_namespace_global() {
// CHECK: define{{.*}} i32 @test_namespace_global()
-// CHECK: call i32 asm sideeffect inteldialect "mov eax, $1", "{{.*}}"(ptr elementtype(i32) getelementptr inbounds (%struct.A, ptr @_ZN4asdf8a_globalE, i32 0, i32 2, i32 1))
+// CHECK: call i32 asm sideeffect inteldialect "mov eax, $1", "{{.*}}"(ptr elementtype(i32) getelementptr inbounds (%"struct.A::B", ptr getelementptr inbounds (%struct.A, ptr @_ZN4asdf8a_globalE, i32 0, i32 2), i32 0, i32 1))
// CHECK: ret i32
__asm mov eax, asdf::a_global.a3.b2
}
diff --git a/clang/test/OpenMP/threadprivate_codegen.cpp b/clang/test/OpenMP/threadprivate_codegen.cpp
index 2ee328008cc4b..7a6269954d39e 100644
--- a/clang/test/OpenMP/threadprivate_codegen.cpp
+++ b/clang/test/OpenMP/threadprivate_codegen.cpp
@@ -2586,7 +2586,7 @@ int foobar() {
// SIMD1-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]]
// SIMD1-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
-// SIMD1-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD1-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD1-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]]
// SIMD1-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4
@@ -2663,7 +2663,7 @@ int foobar() {
// SIMD1-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]]
// SIMD1-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4
-// SIMD1-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD1-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD1-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]]
// SIMD1-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
@@ -3052,7 +3052,7 @@ int foobar() {
// SIMD2-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG187:![0-9]+]]
// SIMD2-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]], !dbg [[DBG187]]
// SIMD2-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG187]]
-// SIMD2-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
+// SIMD2-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
// SIMD2-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG189:![0-9]+]]
// SIMD2-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]], !dbg [[DBG189]]
// SIMD2-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4, !dbg [[DBG189]]
@@ -3133,7 +3133,7 @@ int foobar() {
// SIMD2-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG222:![0-9]+]]
// SIMD2-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]], !dbg [[DBG222]]
// SIMD2-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4, !dbg [[DBG222]]
-// SIMD2-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
+// SIMD2-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
// SIMD2-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG224:![0-9]+]]
// SIMD2-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]], !dbg [[DBG224]]
// SIMD2-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG224]]
@@ -5707,7 +5707,7 @@ int foobar() {
// SIMD3-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]]
// SIMD3-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
-// SIMD3-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD3-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD3-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]]
// SIMD3-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4
@@ -5784,7 +5784,7 @@ int foobar() {
// SIMD3-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]]
// SIMD3-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4
-// SIMD3-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD3-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD3-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]]
// SIMD3-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
@@ -6173,7 +6173,7 @@ int foobar() {
// SIMD4-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG187:![0-9]+]]
// SIMD4-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]], !dbg [[DBG187]]
// SIMD4-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG187]]
-// SIMD4-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
+// SIMD4-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
// SIMD4-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG189:![0-9]+]]
// SIMD4-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]], !dbg [[DBG189]]
// SIMD4-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4, !dbg [[DBG189]]
@@ -6254,7 +6254,7 @@ int foobar() {
// SIMD4-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG222:![0-9]+]]
// SIMD4-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]], !dbg [[DBG222]]
// SIMD4-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4, !dbg [[DBG222]]
-// SIMD4-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
+// SIMD4-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
// SIMD4-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG224:![0-9]+]]
// SIMD4-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]], !dbg [[DBG224]]
// SIMD4-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG224]]
diff --git a/llvm/include/llvm/IR/ConstantFold.h b/llvm/include/llvm/IR/ConstantFold.h
index 9b3c8a0e5a632..42043d365b2d3 100644
--- a/llvm/include/llvm/IR/ConstantFold.h
+++ b/llvm/include/llvm/IR/ConstantFold.h
@@ -52,7 +52,7 @@ namespace llvm {
Constant *V2);
Constant *ConstantFoldCompareInstruction(CmpInst::Predicate Predicate,
Constant *C1, Constant *C2);
- Constant *ConstantFoldGetElementPtr(Type *Ty, Constant *C, bool InBounds,
+ Constant *ConstantFoldGetElementPtr(Type *Ty, Constant *C,
std::optional<ConstantRange> InRange,
ArrayRef<Value *> Idxs);
} // End llvm namespace
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 8b2aa6b9f18b0..00895860ca49b 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -5068,9 +5068,8 @@ static Value *simplifyGEPInst(Type *SrcTy, Value *Ptr,
return nullptr;
if (!ConstantExpr::isSupportedGetElementPtr(SrcTy))
- // TODO(gep_nowrap): Pass on the whole GEPNoWrapFlags.
- return ConstantFoldGetElementPtr(SrcTy, cast<Constant>(Ptr),
- NW.isInBounds(), std::nullopt, Indices);
+ return ConstantFoldGetElementPtr(SrcTy, cast<Constant>(Ptr), std::nullopt,
+ Indices);
auto *CE =
ConstantExpr::getGetElementPtr(SrcTy, cast<Constant>(Ptr), Indices, NW);
diff --git a/llvm/lib/IR/ConstantFold.cpp b/llvm/lib/IR/ConstantFold.cpp
index 77a833610a3a9..34bcf36ec212c 100644
--- a/llvm/lib/IR/ConstantFold.cpp
+++ b/llvm/lib/IR/ConstantFold.cpp
@@ -1404,35 +1404,7 @@ Constant *llvm::ConstantFoldCompareInstruction(CmpInst::Predicate Predicate,
return nullptr;
}
-// Combine Indices - If the source pointer to this getelementptr instruction
-// is a getelementptr instruction, combine the indices of the two
-// getelementptr instructions into a single instruction.
-static Constant *foldGEPOfGEP(GEPOperator *GEP, Type *PointeeTy, bool InBounds,
- ArrayRef<Value *> Idxs) {
- if (PointeeTy != GEP->getResultElementType())
- return nullptr;
-
- // Leave inrange handling to DL-aware constant folding.
- if (GEP->getInRange())
- return nullptr;
-
- // Only handle simple case with leading zero index. We cannot perform an
- // actual addition as we don't know the correct index type size to use.
- Constant *Idx0 = cast<Constant>(Idxs[0]);
- if (!Idx0->isNullValue())
- return nullptr;
-
- SmallVector<Value*, 16> NewIndices;
- NewIndices.reserve(Idxs.size() + GEP->getNumIndices());
- NewIndices.append(GEP->idx_begin(), GEP->idx_end());
- NewIndices.append(Idxs.begin() + 1, Idxs.end());
- return ConstantExpr::getGetElementPtr(
- GEP->getSourceElementType(), cast<Constant>(GEP->getPointerOperand()),
- NewIndices, InBounds && GEP->isInBounds());
-}
-
Constant *llvm::ConstantFoldGetElementPtr(Type *PointeeTy, Constant *C,
- bool InBounds,
std::optional<ConstantRange> InRange,
ArrayRef<Value *> Idxs) {
if (Idxs.empty()) return C;
@@ -1462,10 +1434,5 @@ Constant *llvm::ConstantFoldGetElementPtr(Type *PointeeTy, Constant *C,
cast<VectorType>(GEPTy)->getElementCount(), C)
: C;
- if (ConstantExpr *CE = dyn_cast<ConstantExpr>(C))
- if (auto *GEP = dyn_cast<GEPOperator>(CE))
- if (Constant *C = foldGEPOfGEP(GEP, PointeeTy, InBounds, Idxs))
- return C;
-
return nullptr;
}
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index a76be441875a1..d07907372f0e4 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -2438,8 +2438,7 @@ Constant *ConstantExpr::getGetElementPtr(Type *Ty, Constant *C,
assert(Ty && "Must specify element type");
assert(isSupportedGetElementPtr(Ty) && "Element type is unsupported!");
- if (Constant *FC =
- ConstantFoldGetElementPtr(Ty, C, NW.isInBounds(), InRange, Idxs))
+ if (Constant *FC = ConstantFoldGetElementPtr(Ty, C, InRange, Idxs))
return FC; // Fold a few common cases.
assert(GetElementPtrInst::getIndexedType(Ty, Idxs) && "GEP indices invalid!");
|
@llvm/pr-subscribers-llvm-ir Author: Nikita Popov (nikic) ChangesThis is a followup to #93823 and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP. We didn't have any test coverage for this fold in LLVM, but some Clang tests change. Full diff: https://github.com/llvm/llvm-project/pull/95126.diff 10 Files Affected:
diff --git a/clang/test/CodeGenCUDA/managed-var.cu b/clang/test/CodeGenCUDA/managed-var.cu
index 5206acc62fe00..07e1a1e692c75 100644
--- a/clang/test/CodeGenCUDA/managed-var.cu
+++ b/clang/test/CodeGenCUDA/managed-var.cu
@@ -127,9 +127,10 @@ __device__ __host__ float load2() {
// HOST-LABEL: define {{.*}}@_Z5load3v()
// HOST: %ld.managed = load ptr, ptr @v2, align 16
-// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1, i32 1
-// HOST: %1 = load float, ptr %0, align 4
-// HOST: ret float %1
+// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1
+// HOST: %1 = getelementptr inbounds %struct.vec, ptr %0, i32 0, i32 1
+// HOST: %2 = load float, ptr %1, align 4
+// HOST: ret float %2
float load3() {
return v2[1].y;
}
@@ -139,10 +140,11 @@ float load3() {
// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1
// HOST: %1 = ptrtoint ptr %0 to i64
// HOST: %ld.managed1 = load ptr, ptr @v2, align 16
-// HOST: %2 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed1, i64 0, i64 1, i32 1
-// HOST: %3 = ptrtoint ptr %2 to i64
-// HOST: %4 = sub i64 %3, %1
-// HOST: %sub.ptr.div = sdiv exact i64 %4, 4
+// HOST: %2 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed1, i64 0, i64 1
+// HOST: %3 = getelementptr inbounds %struct.vec, ptr %2, i32 0, i32 1
+// HOST: %4 = ptrtoint ptr %3 to i64
+// HOST: %5 = sub i64 %4, %1
+// HOST: %sub.ptr.div = sdiv exact i64 %5, 4
// HOST: %conv = sitofp i64 %sub.ptr.div to float
// HOST: ret float %conv
float addr_taken2() {
diff --git a/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp b/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
index 4033adc8f0390..14557829268ef 100644
--- a/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
+++ b/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
@@ -21,6 +21,6 @@ struct S {
// CHECK: store i32 0, ptr @arr
// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i32 0, i32 1), ptr noundef @.str)
// CHECK: store i32 1, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 1)
-// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i64 1, i32 1), ptr noundef @.str.1)
+// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 1), i32 0, i32 1), ptr noundef @.str.1)
// CHECK: store i32 2, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 2)
-// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i64 2, i32 1), ptr noundef @.str.2)
+// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 2), i32 0, i32 1), ptr noundef @.str.2)
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
index 6fbe4c7fd17a7..caa92f47a93c2 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
@@ -79,12 +79,12 @@ std::initializer_list<std::initializer_list<int>> nested = {
// CHECK-DYNAMIC-BL: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested1_, i64 1)
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested1_,
// CHECK-DYNAMIC-BL: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), align 8
-// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1, i32 1), align 8
+// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BL: store i32 5, ptr @_ZGR6nested2_
// CHECK-DYNAMIC-BL: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested2_, i64 1)
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested2_,
// CHECK-DYNAMIC-BL: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), align 8
-// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2, i32 1), align 8
+// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested_,
// CHECK-DYNAMIC-BL: ptr @nested, align 8
// CHECK-DYNAMIC-BL: store i64 3, ptr getelementptr inbounds ({{.*}}, ptr @nested, i32 0, i32 1), align 8
@@ -119,13 +119,13 @@ std::initializer_list<std::initializer_list<int>> nested = {
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested1_,
// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([2 x i32], ptr @_ZGR6nested1_, i64 0, i64 2),
-// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1, i32 1), align 8
+// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BE: store i32 5, ptr @_ZGR6nested2_
// CHECK-DYNAMIC-BE: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested2_, i64 1)
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested2_,
// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([2 x i32], ptr @_ZGR6nested2_, i64 0, i64 2),
-// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2, i32 1), align 8
+// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested_,
// CHECK-DYNAMIC-BE: ptr @nested, align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([3 x {{.*}}], ptr @_ZGR6nested_, i64 0, i64 3),
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
index 3d0cf968bfd54..ef05f0334cd75 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -365,12 +365,12 @@ namespace partly_constant {
//
// Second init list.
// CHECK: store ptr {{.*}}@[[PARTLY_CONSTANT_SECOND]]{{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1)
- // CHECK: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1, i32 1)
+ // CHECK: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1), i32 0, i32 1)
//
// Third init list.
// CHECK-NOT: @[[PARTLY_CONSTANT_THIRD]],
// CHECK: store ptr {{.*}}@[[PARTLY_CONSTANT_THIRD]]{{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2)
- // CHECK: store i64 4, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2, i32 1)
+ // CHECK: store i64 4, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2), i32 0, i32 1)
// CHECK-NOT: @[[PARTLY_CONSTANT_THIRD]],
//
// Outer init list.
diff --git a/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp b/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
index 403e9c8427760..e3441d0b4614e 100644
--- a/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
+++ b/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
@@ -24,7 +24,7 @@ extern "C" int test_param_field(A p) {
extern "C" int test_namespace_global() {
// CHECK: define{{.*}} i32 @test_namespace_global()
-// CHECK: call i32 asm sideeffect inteldialect "mov eax, $1", "{{.*}}"(ptr elementtype(i32) getelementptr inbounds (%struct.A, ptr @_ZN4asdf8a_globalE, i32 0, i32 2, i32 1))
+// CHECK: call i32 asm sideeffect inteldialect "mov eax, $1", "{{.*}}"(ptr elementtype(i32) getelementptr inbounds (%"struct.A::B", ptr getelementptr inbounds (%struct.A, ptr @_ZN4asdf8a_globalE, i32 0, i32 2), i32 0, i32 1))
// CHECK: ret i32
__asm mov eax, asdf::a_global.a3.b2
}
diff --git a/clang/test/OpenMP/threadprivate_codegen.cpp b/clang/test/OpenMP/threadprivate_codegen.cpp
index 2ee328008cc4b..7a6269954d39e 100644
--- a/clang/test/OpenMP/threadprivate_codegen.cpp
+++ b/clang/test/OpenMP/threadprivate_codegen.cpp
@@ -2586,7 +2586,7 @@ int foobar() {
// SIMD1-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]]
// SIMD1-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
-// SIMD1-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD1-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD1-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]]
// SIMD1-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4
@@ -2663,7 +2663,7 @@ int foobar() {
// SIMD1-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]]
// SIMD1-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4
-// SIMD1-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD1-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD1-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]]
// SIMD1-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
@@ -3052,7 +3052,7 @@ int foobar() {
// SIMD2-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG187:![0-9]+]]
// SIMD2-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]], !dbg [[DBG187]]
// SIMD2-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG187]]
-// SIMD2-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
+// SIMD2-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
// SIMD2-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG189:![0-9]+]]
// SIMD2-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]], !dbg [[DBG189]]
// SIMD2-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4, !dbg [[DBG189]]
@@ -3133,7 +3133,7 @@ int foobar() {
// SIMD2-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG222:![0-9]+]]
// SIMD2-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]], !dbg [[DBG222]]
// SIMD2-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4, !dbg [[DBG222]]
-// SIMD2-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
+// SIMD2-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
// SIMD2-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG224:![0-9]+]]
// SIMD2-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]], !dbg [[DBG224]]
// SIMD2-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG224]]
@@ -5707,7 +5707,7 @@ int foobar() {
// SIMD3-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]]
// SIMD3-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
-// SIMD3-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD3-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD3-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]]
// SIMD3-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4
@@ -5784,7 +5784,7 @@ int foobar() {
// SIMD3-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]]
// SIMD3-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4
-// SIMD3-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD3-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD3-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]]
// SIMD3-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
@@ -6173,7 +6173,7 @@ int foobar() {
// SIMD4-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG187:![0-9]+]]
// SIMD4-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]], !dbg [[DBG187]]
// SIMD4-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG187]]
-// SIMD4-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
+// SIMD4-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
// SIMD4-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG189:![0-9]+]]
// SIMD4-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]], !dbg [[DBG189]]
// SIMD4-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4, !dbg [[DBG189]]
@@ -6254,7 +6254,7 @@ int foobar() {
// SIMD4-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG222:![0-9]+]]
// SIMD4-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]], !dbg [[DBG222]]
// SIMD4-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4, !dbg [[DBG222]]
-// SIMD4-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
+// SIMD4-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
// SIMD4-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG224:![0-9]+]]
// SIMD4-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]], !dbg [[DBG224]]
// SIMD4-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG224]]
diff --git a/llvm/include/llvm/IR/ConstantFold.h b/llvm/include/llvm/IR/ConstantFold.h
index 9b3c8a0e5a632..42043d365b2d3 100644
--- a/llvm/include/llvm/IR/ConstantFold.h
+++ b/llvm/include/llvm/IR/ConstantFold.h
@@ -52,7 +52,7 @@ namespace llvm {
Constant *V2);
Constant *ConstantFoldCompareInstruction(CmpInst::Predicate Predicate,
Constant *C1, Constant *C2);
- Constant *ConstantFoldGetElementPtr(Type *Ty, Constant *C, bool InBounds,
+ Constant *ConstantFoldGetElementPtr(Type *Ty, Constant *C,
std::optional<ConstantRange> InRange,
ArrayRef<Value *> Idxs);
} // End llvm namespace
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 8b2aa6b9f18b0..00895860ca49b 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -5068,9 +5068,8 @@ static Value *simplifyGEPInst(Type *SrcTy, Value *Ptr,
return nullptr;
if (!ConstantExpr::isSupportedGetElementPtr(SrcTy))
- // TODO(gep_nowrap): Pass on the whole GEPNoWrapFlags.
- return ConstantFoldGetElementPtr(SrcTy, cast<Constant>(Ptr),
- NW.isInBounds(), std::nullopt, Indices);
+ return ConstantFoldGetElementPtr(SrcTy, cast<Constant>(Ptr), std::nullopt,
+ Indices);
auto *CE =
ConstantExpr::getGetElementPtr(SrcTy, cast<Constant>(Ptr), Indices, NW);
diff --git a/llvm/lib/IR/ConstantFold.cpp b/llvm/lib/IR/ConstantFold.cpp
index 77a833610a3a9..34bcf36ec212c 100644
--- a/llvm/lib/IR/ConstantFold.cpp
+++ b/llvm/lib/IR/ConstantFold.cpp
@@ -1404,35 +1404,7 @@ Constant *llvm::ConstantFoldCompareInstruction(CmpInst::Predicate Predicate,
return nullptr;
}
-// Combine Indices - If the source pointer to this getelementptr instruction
-// is a getelementptr instruction, combine the indices of the two
-// getelementptr instructions into a single instruction.
-static Constant *foldGEPOfGEP(GEPOperator *GEP, Type *PointeeTy, bool InBounds,
- ArrayRef<Value *> Idxs) {
- if (PointeeTy != GEP->getResultElementType())
- return nullptr;
-
- // Leave inrange handling to DL-aware constant folding.
- if (GEP->getInRange())
- return nullptr;
-
- // Only handle simple case with leading zero index. We cannot perform an
- // actual addition as we don't know the correct index type size to use.
- Constant *Idx0 = cast<Constant>(Idxs[0]);
- if (!Idx0->isNullValue())
- return nullptr;
-
- SmallVector<Value*, 16> NewIndices;
- NewIndices.reserve(Idxs.size() + GEP->getNumIndices());
- NewIndices.append(GEP->idx_begin(), GEP->idx_end());
- NewIndices.append(Idxs.begin() + 1, Idxs.end());
- return ConstantExpr::getGetElementPtr(
- GEP->getSourceElementType(), cast<Constant>(GEP->getPointerOperand()),
- NewIndices, InBounds && GEP->isInBounds());
-}
-
Constant *llvm::ConstantFoldGetElementPtr(Type *PointeeTy, Constant *C,
- bool InBounds,
std::optional<ConstantRange> InRange,
ArrayRef<Value *> Idxs) {
if (Idxs.empty()) return C;
@@ -1462,10 +1434,5 @@ Constant *llvm::ConstantFoldGetElementPtr(Type *PointeeTy, Constant *C,
cast<VectorType>(GEPTy)->getElementCount(), C)
: C;
- if (ConstantExpr *CE = dyn_cast<ConstantExpr>(C))
- if (auto *GEP = dyn_cast<GEPOperator>(CE))
- if (Constant *C = foldGEPOfGEP(GEP, PointeeTy, InBounds, Idxs))
- return C;
-
return nullptr;
}
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index a76be441875a1..d07907372f0e4 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -2438,8 +2438,7 @@ Constant *ConstantExpr::getGetElementPtr(Type *Ty, Constant *C,
assert(Ty && "Must specify element type");
assert(isSupportedGetElementPtr(Ty) && "Element type is unsupported!");
- if (Constant *FC =
- ConstantFoldGetElementPtr(Ty, C, NW.isInBounds(), InRange, Idxs))
+ if (Constant *FC = ConstantFoldGetElementPtr(Ty, C, InRange, Idxs))
return FC; // Fold a few common cases.
assert(GetElementPtrInst::getIndexedType(Ty, Idxs) && "GEP indices invalid!");
|
@llvm/pr-subscribers-clang Author: Nikita Popov (nikic) ChangesThis is a followup to #93823 and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP. We didn't have any test coverage for this fold in LLVM, but some Clang tests change. Full diff: https://github.com/llvm/llvm-project/pull/95126.diff 10 Files Affected:
diff --git a/clang/test/CodeGenCUDA/managed-var.cu b/clang/test/CodeGenCUDA/managed-var.cu
index 5206acc62fe00..07e1a1e692c75 100644
--- a/clang/test/CodeGenCUDA/managed-var.cu
+++ b/clang/test/CodeGenCUDA/managed-var.cu
@@ -127,9 +127,10 @@ __device__ __host__ float load2() {
// HOST-LABEL: define {{.*}}@_Z5load3v()
// HOST: %ld.managed = load ptr, ptr @v2, align 16
-// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1, i32 1
-// HOST: %1 = load float, ptr %0, align 4
-// HOST: ret float %1
+// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1
+// HOST: %1 = getelementptr inbounds %struct.vec, ptr %0, i32 0, i32 1
+// HOST: %2 = load float, ptr %1, align 4
+// HOST: ret float %2
float load3() {
return v2[1].y;
}
@@ -139,10 +140,11 @@ float load3() {
// HOST: %0 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed, i64 0, i64 1
// HOST: %1 = ptrtoint ptr %0 to i64
// HOST: %ld.managed1 = load ptr, ptr @v2, align 16
-// HOST: %2 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed1, i64 0, i64 1, i32 1
-// HOST: %3 = ptrtoint ptr %2 to i64
-// HOST: %4 = sub i64 %3, %1
-// HOST: %sub.ptr.div = sdiv exact i64 %4, 4
+// HOST: %2 = getelementptr inbounds [100 x %struct.vec], ptr %ld.managed1, i64 0, i64 1
+// HOST: %3 = getelementptr inbounds %struct.vec, ptr %2, i32 0, i32 1
+// HOST: %4 = ptrtoint ptr %3 to i64
+// HOST: %5 = sub i64 %4, %1
+// HOST: %sub.ptr.div = sdiv exact i64 %5, 4
// HOST: %conv = sitofp i64 %sub.ptr.div to float
// HOST: ret float %conv
float addr_taken2() {
diff --git a/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp b/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
index 4033adc8f0390..14557829268ef 100644
--- a/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
+++ b/clang/test/CodeGenCXX/2011-12-19-init-list-ctor.cpp
@@ -21,6 +21,6 @@ struct S {
// CHECK: store i32 0, ptr @arr
// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i32 0, i32 1), ptr noundef @.str)
// CHECK: store i32 1, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 1)
-// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i64 1, i32 1), ptr noundef @.str.1)
+// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 1), i32 0, i32 1), ptr noundef @.str.1)
// CHECK: store i32 2, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 2)
-// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr @arr, i64 2, i32 1), ptr noundef @.str.2)
+// CHECK: call void @_ZN1AC1EPKc(ptr {{[^,]*}} getelementptr inbounds (%struct.S, ptr getelementptr inbounds (%struct.S, ptr @arr, i64 2), i32 0, i32 1), ptr noundef @.str.2)
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
index 6fbe4c7fd17a7..caa92f47a93c2 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-pr12086.cpp
@@ -79,12 +79,12 @@ std::initializer_list<std::initializer_list<int>> nested = {
// CHECK-DYNAMIC-BL: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested1_, i64 1)
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested1_,
// CHECK-DYNAMIC-BL: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), align 8
-// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1, i32 1), align 8
+// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BL: store i32 5, ptr @_ZGR6nested2_
// CHECK-DYNAMIC-BL: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested2_, i64 1)
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested2_,
// CHECK-DYNAMIC-BL: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), align 8
-// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2, i32 1), align 8
+// CHECK-DYNAMIC-BL: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BL: store ptr @_ZGR6nested_,
// CHECK-DYNAMIC-BL: ptr @nested, align 8
// CHECK-DYNAMIC-BL: store i64 3, ptr getelementptr inbounds ({{.*}}, ptr @nested, i32 0, i32 1), align 8
@@ -119,13 +119,13 @@ std::initializer_list<std::initializer_list<int>> nested = {
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested1_,
// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([2 x i32], ptr @_ZGR6nested1_, i64 0, i64 2),
-// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1, i32 1), align 8
+// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 1), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BE: store i32 5, ptr @_ZGR6nested2_
// CHECK-DYNAMIC-BE: store i32 {{.*}}, ptr getelementptr inbounds (i32, ptr @_ZGR6nested2_, i64 1)
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested2_,
// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([2 x i32], ptr @_ZGR6nested2_, i64 0, i64 2),
-// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2, i32 1), align 8
+// CHECK-DYNAMIC-BE: ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr @_ZGR6nested_, i64 2), i32 0, i32 1), align 8
// CHECK-DYNAMIC-BE: store ptr @_ZGR6nested_,
// CHECK-DYNAMIC-BE: ptr @nested, align 8
// CHECK-DYNAMIC-BE: store ptr getelementptr inbounds ([3 x {{.*}}], ptr @_ZGR6nested_, i64 0, i64 3),
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
index 3d0cf968bfd54..ef05f0334cd75 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -365,12 +365,12 @@ namespace partly_constant {
//
// Second init list.
// CHECK: store ptr {{.*}}@[[PARTLY_CONSTANT_SECOND]]{{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1)
- // CHECK: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1, i32 1)
+ // CHECK: store i64 2, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 1), i32 0, i32 1)
//
// Third init list.
// CHECK-NOT: @[[PARTLY_CONSTANT_THIRD]],
// CHECK: store ptr {{.*}}@[[PARTLY_CONSTANT_THIRD]]{{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2)
- // CHECK: store i64 4, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2, i32 1)
+ // CHECK: store i64 4, ptr getelementptr inbounds ({{.*}}, ptr getelementptr inbounds ({{.*}}, ptr {{.*}}@[[PARTLY_CONSTANT_INNER]]{{.*}}, i64 2), i32 0, i32 1)
// CHECK-NOT: @[[PARTLY_CONSTANT_THIRD]],
//
// Outer init list.
diff --git a/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp b/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
index 403e9c8427760..e3441d0b4614e 100644
--- a/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
+++ b/clang/test/CodeGenCXX/ms-inline-asm-fields.cpp
@@ -24,7 +24,7 @@ extern "C" int test_param_field(A p) {
extern "C" int test_namespace_global() {
// CHECK: define{{.*}} i32 @test_namespace_global()
-// CHECK: call i32 asm sideeffect inteldialect "mov eax, $1", "{{.*}}"(ptr elementtype(i32) getelementptr inbounds (%struct.A, ptr @_ZN4asdf8a_globalE, i32 0, i32 2, i32 1))
+// CHECK: call i32 asm sideeffect inteldialect "mov eax, $1", "{{.*}}"(ptr elementtype(i32) getelementptr inbounds (%"struct.A::B", ptr getelementptr inbounds (%struct.A, ptr @_ZN4asdf8a_globalE, i32 0, i32 2), i32 0, i32 1))
// CHECK: ret i32
__asm mov eax, asdf::a_global.a3.b2
}
diff --git a/clang/test/OpenMP/threadprivate_codegen.cpp b/clang/test/OpenMP/threadprivate_codegen.cpp
index 2ee328008cc4b..7a6269954d39e 100644
--- a/clang/test/OpenMP/threadprivate_codegen.cpp
+++ b/clang/test/OpenMP/threadprivate_codegen.cpp
@@ -2586,7 +2586,7 @@ int foobar() {
// SIMD1-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]]
// SIMD1-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
-// SIMD1-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD1-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD1-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]]
// SIMD1-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4
@@ -2663,7 +2663,7 @@ int foobar() {
// SIMD1-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]]
// SIMD1-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4
-// SIMD1-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD1-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD1-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4
// SIMD1-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]]
// SIMD1-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
@@ -3052,7 +3052,7 @@ int foobar() {
// SIMD2-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG187:![0-9]+]]
// SIMD2-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]], !dbg [[DBG187]]
// SIMD2-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG187]]
-// SIMD2-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
+// SIMD2-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
// SIMD2-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG189:![0-9]+]]
// SIMD2-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]], !dbg [[DBG189]]
// SIMD2-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4, !dbg [[DBG189]]
@@ -3133,7 +3133,7 @@ int foobar() {
// SIMD2-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG222:![0-9]+]]
// SIMD2-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]], !dbg [[DBG222]]
// SIMD2-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4, !dbg [[DBG222]]
-// SIMD2-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
+// SIMD2-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
// SIMD2-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG224:![0-9]+]]
// SIMD2-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]], !dbg [[DBG224]]
// SIMD2-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG224]]
@@ -5707,7 +5707,7 @@ int foobar() {
// SIMD3-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]]
// SIMD3-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
-// SIMD3-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD3-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD3-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]]
// SIMD3-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4
@@ -5784,7 +5784,7 @@ int foobar() {
// SIMD3-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]]
// SIMD3-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4
-// SIMD3-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4
+// SIMD3-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4
// SIMD3-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4
// SIMD3-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]]
// SIMD3-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4
@@ -6173,7 +6173,7 @@ int foobar() {
// SIMD4-NEXT: [[TMP12:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG187:![0-9]+]]
// SIMD4-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP12]], [[TMP11]], !dbg [[DBG187]]
// SIMD4-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG187]]
-// SIMD4-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
+// SIMD4-NEXT: [[TMP13:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG188:![0-9]+]]
// SIMD4-NEXT: [[TMP14:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG189:![0-9]+]]
// SIMD4-NEXT: [[ADD4:%.*]] = add nsw i32 [[TMP14]], [[TMP13]], !dbg [[DBG189]]
// SIMD4-NEXT: store i32 [[ADD4]], ptr [[RES]], align 4, !dbg [[DBG189]]
@@ -6254,7 +6254,7 @@ int foobar() {
// SIMD4-NEXT: [[TMP6:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG222:![0-9]+]]
// SIMD4-NEXT: [[ADD2:%.*]] = add nsw i32 [[TMP6]], [[TMP5]], !dbg [[DBG222]]
// SIMD4-NEXT: store i32 [[ADD2]], ptr [[RES]], align 4, !dbg [[DBG222]]
-// SIMD4-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
+// SIMD4-NEXT: [[TMP7:%.*]] = load i32, ptr getelementptr inbounds ([3 x %struct.S1], ptr getelementptr inbounds ([2 x [3 x %struct.S1]], ptr @arr_x, i64 0, i64 1), i64 0, i64 1), align 4, !dbg [[DBG223:![0-9]+]]
// SIMD4-NEXT: [[TMP8:%.*]] = load i32, ptr [[RES]], align 4, !dbg [[DBG224:![0-9]+]]
// SIMD4-NEXT: [[ADD3:%.*]] = add nsw i32 [[TMP8]], [[TMP7]], !dbg [[DBG224]]
// SIMD4-NEXT: store i32 [[ADD3]], ptr [[RES]], align 4, !dbg [[DBG224]]
diff --git a/llvm/include/llvm/IR/ConstantFold.h b/llvm/include/llvm/IR/ConstantFold.h
index 9b3c8a0e5a632..42043d365b2d3 100644
--- a/llvm/include/llvm/IR/ConstantFold.h
+++ b/llvm/include/llvm/IR/ConstantFold.h
@@ -52,7 +52,7 @@ namespace llvm {
Constant *V2);
Constant *ConstantFoldCompareInstruction(CmpInst::Predicate Predicate,
Constant *C1, Constant *C2);
- Constant *ConstantFoldGetElementPtr(Type *Ty, Constant *C, bool InBounds,
+ Constant *ConstantFoldGetElementPtr(Type *Ty, Constant *C,
std::optional<ConstantRange> InRange,
ArrayRef<Value *> Idxs);
} // End llvm namespace
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 8b2aa6b9f18b0..00895860ca49b 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -5068,9 +5068,8 @@ static Value *simplifyGEPInst(Type *SrcTy, Value *Ptr,
return nullptr;
if (!ConstantExpr::isSupportedGetElementPtr(SrcTy))
- // TODO(gep_nowrap): Pass on the whole GEPNoWrapFlags.
- return ConstantFoldGetElementPtr(SrcTy, cast<Constant>(Ptr),
- NW.isInBounds(), std::nullopt, Indices);
+ return ConstantFoldGetElementPtr(SrcTy, cast<Constant>(Ptr), std::nullopt,
+ Indices);
auto *CE =
ConstantExpr::getGetElementPtr(SrcTy, cast<Constant>(Ptr), Indices, NW);
diff --git a/llvm/lib/IR/ConstantFold.cpp b/llvm/lib/IR/ConstantFold.cpp
index 77a833610a3a9..34bcf36ec212c 100644
--- a/llvm/lib/IR/ConstantFold.cpp
+++ b/llvm/lib/IR/ConstantFold.cpp
@@ -1404,35 +1404,7 @@ Constant *llvm::ConstantFoldCompareInstruction(CmpInst::Predicate Predicate,
return nullptr;
}
-// Combine Indices - If the source pointer to this getelementptr instruction
-// is a getelementptr instruction, combine the indices of the two
-// getelementptr instructions into a single instruction.
-static Constant *foldGEPOfGEP(GEPOperator *GEP, Type *PointeeTy, bool InBounds,
- ArrayRef<Value *> Idxs) {
- if (PointeeTy != GEP->getResultElementType())
- return nullptr;
-
- // Leave inrange handling to DL-aware constant folding.
- if (GEP->getInRange())
- return nullptr;
-
- // Only handle simple case with leading zero index. We cannot perform an
- // actual addition as we don't know the correct index type size to use.
- Constant *Idx0 = cast<Constant>(Idxs[0]);
- if (!Idx0->isNullValue())
- return nullptr;
-
- SmallVector<Value*, 16> NewIndices;
- NewIndices.reserve(Idxs.size() + GEP->getNumIndices());
- NewIndices.append(GEP->idx_begin(), GEP->idx_end());
- NewIndices.append(Idxs.begin() + 1, Idxs.end());
- return ConstantExpr::getGetElementPtr(
- GEP->getSourceElementType(), cast<Constant>(GEP->getPointerOperand()),
- NewIndices, InBounds && GEP->isInBounds());
-}
-
Constant *llvm::ConstantFoldGetElementPtr(Type *PointeeTy, Constant *C,
- bool InBounds,
std::optional<ConstantRange> InRange,
ArrayRef<Value *> Idxs) {
if (Idxs.empty()) return C;
@@ -1462,10 +1434,5 @@ Constant *llvm::ConstantFoldGetElementPtr(Type *PointeeTy, Constant *C,
cast<VectorType>(GEPTy)->getElementCount(), C)
: C;
- if (ConstantExpr *CE = dyn_cast<ConstantExpr>(C))
- if (auto *GEP = dyn_cast<GEPOperator>(CE))
- if (Constant *C = foldGEPOfGEP(GEP, PointeeTy, InBounds, Idxs))
- return C;
-
return nullptr;
}
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index a76be441875a1..d07907372f0e4 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -2438,8 +2438,7 @@ Constant *ConstantExpr::getGetElementPtr(Type *Ty, Constant *C,
assert(Ty && "Must specify element type");
assert(isSupportedGetElementPtr(Ty) && "Element type is unsupported!");
- if (Constant *FC =
- ConstantFoldGetElementPtr(Ty, C, NW.isInBounds(), InRange, Idxs))
+ if (Constant *FC = ConstantFoldGetElementPtr(Ty, C, InRange, Idxs))
return FC; // Fold a few common cases.
assert(GetElementPtrInst::getIndexedType(Ty, Idxs) && "GEP indices invalid!");
|
See 51e459a:
Can you double check this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I believe this broke our flang+openmp+offload bot: https://lab.llvm.org/staging/#/builders/140/builds/10168 |
This reverts commit 3b3b839. This broke the flang+openmp+offload buildbot, as reported in #95126 (comment).
That's for the heads-up, I've reverted this patch in cece0a1. I would appreciate some help with debugging this, I don't think I can run this test locally. Probably just having the pre-optimization LLVM IR/bitcode would be enough. (Though with offload, there is probably more than one module involved?) |
Thank you @nikic. |
Hey, @jplehr notified me of this issue (thank you), I've had a little look at it and it's related to an issue this PR fixes: #94541 which you are currently a reviewer of ironically (as I use and alter slightly a piece of functionality you ported, so thank you very much for that)! A slightly funny convergence there :-) Currently when generating the kernel we try to replace uses of values with their appropriate kernel input argument, in certain cases these are Constant's which poses an issue as we can't replace them with the Input arguments which are themselves not Constant I believe. So we can only replace instructions, to do so we transform the Constants to Instructions where necessary, the current function that's in place (a small naive function from myself) will do this within the kernel to a small extent, it doesn't handle nesting's very well for example, and apparently IR this patch will produce in the test cases that are failing. However, the PR I have opened above uses the more robust and comprehensive piece of functionality you made/ported via the function So two ways forward I can see, depending on how urgent/annoying it is for this PR to be stalled:
I don't mind which route we go down, if we opt to disable the tests, I can land a commit shortly to do so, unless either of you @nikic @jplehr would prefer to do so :-) |
@agozillon Thanks for looking into this! This change isn't urgent, so I'm happy to wait for your PR to go in first. |
@nikic sounds good, if that changes don't hesitate to ping me and I'll be happy to deactivate the tests for a while (or feel free to do so yourself, just ping to tell me they're off so I know to turn them on again when the PR lands!) |
Just landed the PR, so it should be good to land this PR now without causing issues with the flang+openmp+offload buildbot, however, if any other issues do arise, I am happy to look into them! But from testing this PR in conjunction with the one I just landed locally, it does seem like everything should be fine. |
Reapplying without changes. The flang+openmp buildbot failure should be addressed by #94541. ----- This is a followup to #93823 and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP. We didn't have any test coverage for this fold in LLVM, but some Clang tests change.
Reapplying without changes. The flang+openmp buildbot failure should be addressed by llvm#94541. ----- This is a followup to llvm#93823 and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP. We didn't have any test coverage for this fold in LLVM, but some Clang tests change.
This is a followup to #93823 and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are now left to the DataLayout-aware constant folder, which will fold everything to a single i8 GEP.
We didn't have any test coverage for this fold in LLVM, but some Clang tests change.