Skip to content

Local: Handle noalias_addrspace in combineMetadata #103938

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions llvm/include/llvm/IR/ConstantRangeList.h
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ class [[nodiscard]] ConstantRangeList {
ConstantRangeList(ArrayRef<ConstantRange> RangesRef) {
assert(isOrderedRanges(RangesRef));
for (const ConstantRange &R : RangesRef) {
assert(R.getBitWidth() == getBitWidth());
assert(empty() || R.getBitWidth() == getBitWidth());
Ranges.push_back(R);
}
}
Expand All @@ -59,8 +59,9 @@ class [[nodiscard]] ConstantRangeList {
/// Return true if this list contains no members.
bool empty() const { return Ranges.empty(); }

/// Get the bit width of this ConstantRangeList.
uint32_t getBitWidth() const { return 64; }
/// Get the bit width of this ConstantRangeList. It is invalid to call this
/// with an empty range.
uint32_t getBitWidth() const { return Ranges.front().getBitWidth(); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the range list is empty, should this continue to return 64? Otherwise at best it'll assert and at worst it'll return garbage. An optional could also represent this scenario.

If there's a new precondition that you can only call getBitWidth on a non-empty range, that should be documented.


/// Return the number of ranges in this ConstantRangeList.
size_t size() const { return Ranges.size(); }
Expand Down
1 change: 1 addition & 0 deletions llvm/include/llvm/IR/Metadata.h
Original file line number Diff line number Diff line change
Expand Up @@ -1456,6 +1456,7 @@ class MDNode : public Metadata {
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);
static MDNode *getMostGenericNoaliasAddrspace(MDNode *A, MDNode *B);
static MDNode *getMostGenericAliasScope(MDNode *A, MDNode *B);
static MDNode *getMostGenericAlignmentOrDereferenceable(MDNode *A, MDNode *B);
/// Merge !prof metadata from two instructions.
Expand Down
15 changes: 9 additions & 6 deletions llvm/lib/IR/ConstantRangeList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -39,12 +39,14 @@ void ConstantRangeList::insert(const ConstantRange &NewRange) {
return;
assert(!NewRange.isFullSet() && "Do not support full set");
assert(NewRange.getLower().slt(NewRange.getUpper()));
assert(getBitWidth() == NewRange.getBitWidth());
// Handle common cases.
if (empty() || Ranges.back().getUpper().slt(NewRange.getLower())) {
Ranges.push_back(NewRange);
return;
}

assert(getBitWidth() == NewRange.getBitWidth());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice we're moving a bunch of asserts around. I think that needs explaining in the commit message. However, it feels like something's wrong.

Does insert perhaps need to be updated so that it inserts 32-bit values depending on the range type? Would that fix us having mixed 64- and 32-bit range values? Is that even possible right now - what if we want to insert into an empty range? We can't know what bitwidth to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty range has no bitwidth. I don't think it's worth adding an explicit field for bit width to give a bit width to the empty range

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that an empty range has no bitwidth. I missed that ConstantRangeList only deals with the insertion of pre-existing ConstantRanges. I think it's okay on balance to require users to only insert/union/intersect ConstantRanges of matching bitwidths.

I think a good way of expressing this would be to keep the asserts where they are but have getBitWidth return a value that represents "none" or "empty" which always compares == true with bitwidths of other non-empty ranges. That's probably not worth the effort, though. What you have now is probably okay.


if (NewRange.getUpper().slt(Ranges.front().getLower())) {
Ranges.insert(Ranges.begin(), NewRange);
return;
Expand Down Expand Up @@ -142,14 +144,15 @@ void ConstantRangeList::subtract(const ConstantRange &SubRange) {

ConstantRangeList
ConstantRangeList::unionWith(const ConstantRangeList &CRL) const {
assert(getBitWidth() == CRL.getBitWidth() &&
"ConstantRangeList bitwidths don't agree!");
// Handle common cases.
if (empty())
return CRL;
if (CRL.empty())
return *this;

assert(getBitWidth() == CRL.getBitWidth() &&
"ConstantRangeList bitwidths don't agree!");

ConstantRangeList Result;
size_t i = 0, j = 0;
// "PreviousRange" tracks the lowest unioned range that is being processed.
Expand Down Expand Up @@ -192,15 +195,15 @@ ConstantRangeList::unionWith(const ConstantRangeList &CRL) const {

ConstantRangeList
ConstantRangeList::intersectWith(const ConstantRangeList &CRL) const {
assert(getBitWidth() == CRL.getBitWidth() &&
"ConstantRangeList bitwidths don't agree!");

// Handle common cases.
if (empty())
return *this;
if (CRL.empty())
return CRL;

assert(getBitWidth() == CRL.getBitWidth() &&
"ConstantRangeList bitwidths don't agree!");

ConstantRangeList Result;
size_t i = 0, j = 0;
while (i < size() && j < CRL.size()) {
Expand Down
38 changes: 38 additions & 0 deletions llvm/lib/IR/Metadata.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/ConstantRangeList.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DebugLoc.h"
Expand Down Expand Up @@ -1353,6 +1354,43 @@ MDNode *MDNode::getMostGenericRange(MDNode *A, MDNode *B) {
return MDNode::get(A->getContext(), MDs);
}

MDNode *MDNode::getMostGenericNoaliasAddrspace(MDNode *A, MDNode *B) {
if (!A || !B)
return nullptr;

if (A == B)
return A;

SmallVector<ConstantRange> RangeListA, RangeListB;
for (unsigned I = 0, E = A->getNumOperands() / 2; I != E; ++I) {
auto *LowA = mdconst::extract<ConstantInt>(A->getOperand(2 * I + 0));
auto *HighA = mdconst::extract<ConstantInt>(A->getOperand(2 * I + 1));
RangeListA.push_back(ConstantRange(LowA->getValue(), HighA->getValue()));
}

for (unsigned I = 0, E = B->getNumOperands() / 2; I != E; ++I) {
auto *LowB = mdconst::extract<ConstantInt>(B->getOperand(2 * I + 0));
auto *HighB = mdconst::extract<ConstantInt>(B->getOperand(2 * I + 1));
RangeListB.push_back(ConstantRange(LowB->getValue(), HighB->getValue()));
}

ConstantRangeList CRLA(RangeListA);
ConstantRangeList CRLB(RangeListB);
ConstantRangeList Result = CRLA.intersectWith(CRLB);
if (Result.empty())
return nullptr;

SmallVector<Metadata *> MDs;
for (const ConstantRange &CR : Result) {
MDs.push_back(ConstantAsMetadata::get(
ConstantInt::get(A->getContext(), CR.getLower())));
MDs.push_back(ConstantAsMetadata::get(
ConstantInt::get(A->getContext(), CR.getUpper())));
}

return MDNode::get(A->getContext(), MDs);
}

MDNode *MDNode::getMostGenericAlignmentOrDereferenceable(MDNode *A, MDNode *B) {
if (!A || !B)
return nullptr;
Expand Down
8 changes: 7 additions & 1 deletion llvm/lib/Transforms/Utils/Local.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3391,6 +3391,11 @@ void llvm::combineMetadata(Instruction *K, const Instruction *J,
if (DoesKMove)
K->setMetadata(Kind, MDNode::getMergedProfMetadata(KMD, JMD, K, J));
break;
case LLVMContext::MD_noalias_addrspace:
if (DoesKMove)
K->setMetadata(Kind,
MDNode::getMostGenericNoaliasAddrspace(JMD, KMD));
break;
}
}
// Set !invariant.group from J if J has it. If both instructions have it
Expand Down Expand Up @@ -3432,7 +3437,8 @@ void llvm::combineMetadataForCSE(Instruction *K, const Instruction *J,
LLVMContext::MD_prof,
LLVMContext::MD_nontemporal,
LLVMContext::MD_noundef,
LLVMContext::MD_mmra};
LLVMContext::MD_mmra,
LLVMContext::MD_noalias_addrspace};
combineMetadata(K, J, KnownIDs, KDominatesJ);
}

Expand Down
73 changes: 73 additions & 0 deletions llvm/test/Transforms/EarlyCSE/noalias-addrspace.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -passes='early-cse<memssa>' -S < %s | FileCheck %s

declare void @use(i1)
declare void @use.ptr(ptr) memory(read)

define void @load_first_noalias_addrspace(ptr %p) {
; CHECK-LABEL: define void @load_first_noalias_addrspace(
; CHECK-SAME: ptr [[P:%.*]]) {
; CHECK-NEXT: [[V1:%.*]] = load ptr, ptr [[P]], align 8, !nonnull [[META0:![0-9]+]], !noundef [[META0]], !noalias.addrspace [[META1:![0-9]+]]
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: ret void
;
%v1 = load ptr, ptr %p, !nonnull !{}, !noundef !{}, !noalias.addrspace !0
call void @use.ptr(ptr %v1)
%v2 = load ptr, ptr %p
call void @use.ptr(ptr %v2)
ret void
}

define void @load_both_same_noalias_addrspace(ptr %p) {
; CHECK-LABEL: define void @load_both_same_noalias_addrspace(
; CHECK-SAME: ptr [[P:%.*]]) {
; CHECK-NEXT: [[V1:%.*]] = load ptr, ptr [[P]], align 8, !nonnull [[META0]], !noundef [[META0]], !noalias.addrspace [[META1]]
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: ret void
;
%v1 = load ptr, ptr %p, !nonnull !{}, !noundef !{}, !noalias.addrspace !0
call void @use.ptr(ptr %v1)
%v2 = load ptr, ptr %p, !noalias.addrspace !0
call void @use.ptr(ptr %v2)
ret void
}

define void @load_both_disjoint_noalias_addrspace(ptr %p) {
; CHECK-LABEL: define void @load_both_disjoint_noalias_addrspace(
; CHECK-SAME: ptr [[P:%.*]]) {
; CHECK-NEXT: [[V1:%.*]] = load ptr, ptr [[P]], align 8, !nonnull [[META0]], !noundef [[META0]], !noalias.addrspace [[META1]]
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: ret void
;
%v1 = load ptr, ptr %p, !nonnull !{}, !noundef !{}, !noalias.addrspace !0
call void @use.ptr(ptr %v1)
%v2 = load ptr, ptr %p, !noalias.addrspace !1
call void @use.ptr(ptr %v2)
ret void
}

define void @load_both_overlap_noalias_addrspace(ptr %p) {
; CHECK-LABEL: define void @load_both_overlap_noalias_addrspace(
; CHECK-SAME: ptr [[P:%.*]]) {
; CHECK-NEXT: [[V1:%.*]] = load ptr, ptr [[P]], align 8, !nonnull [[META0]], !noundef [[META0]], !noalias.addrspace [[META1]]
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: call void @use.ptr(ptr [[V1]])
; CHECK-NEXT: ret void
;
%v1 = load ptr, ptr %p, !nonnull !{}, !noundef !{}, !noalias.addrspace !0
call void @use.ptr(ptr %v1)
%v2 = load ptr, ptr %p, !noalias.addrspace !2
call void @use.ptr(ptr %v2)
ret void
}

!0 = !{i32 5, i32 6}
!1 = !{i32 7, i32 8}
!2 = !{i32 5, i32 7}
;.
; CHECK: [[META0]] = !{}
; CHECK: [[META1]] = !{i32 5, i32 6}
;.
51 changes: 49 additions & 2 deletions llvm/test/Transforms/SimplifyCFG/hoist-with-metadata.ll
Original file line number Diff line number Diff line change
Expand Up @@ -319,7 +319,7 @@ out:
define void @hoist_noalias_addrspace_both(i1 %c, ptr %p, i64 %val) {
; CHECK-LABEL: @hoist_noalias_addrspace_both(
; CHECK-NEXT: if:
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8, !noalias.addrspace [[META7:![0-9]+]]
; CHECK-NEXT: ret void
;
if:
Expand Down Expand Up @@ -361,7 +361,7 @@ out:
define void @hoist_noalias_addrspace_switch(i64 %i, ptr %p, i64 %val) {
; CHECK-LABEL: @hoist_noalias_addrspace_switch(
; CHECK-NEXT: out:
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8, !noalias.addrspace [[META7]]
; CHECK-NEXT: ret void
;
switch i64 %i, label %bb0 [
Expand All @@ -381,6 +381,48 @@ out:
ret void
}

define void @hoist_noalias_addrspace_switch_multiple(i64 %i, ptr %p, i64 %val) {
; CHECK-LABEL: @hoist_noalias_addrspace_switch_multiple(
; CHECK-NEXT: out:
; CHECK-NEXT: [[T:%.*]] = atomicrmw add ptr [[P:%.*]], i64 [[VAL:%.*]] seq_cst, align 8, !noalias.addrspace [[META8:![0-9]+]]
; CHECK-NEXT: ret void
;
switch i64 %i, label %bb0 [
i64 1, label %bb1
i64 2, label %bb2
]
bb0:
%t = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !7
br label %out
bb1:
%e = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !8
br label %out
bb2:
%f = atomicrmw add ptr %p, i64 %val seq_cst, !noalias.addrspace !9
br label %out
out:
ret void
}

; !noalias_addrspace is not safe to speculate as it causes immediate undefined behavior.
define ptr @speculate_noalias_addrspace(i1 %c, ptr dereferenceable(8) align 8 %p) {
; CHECK-LABEL: @speculate_noalias_addrspace(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[V:%.*]] = load ptr, ptr [[P:%.*]], align 8, !nonnull [[META2]]
; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[C:%.*]], ptr [[V]], ptr null
; CHECK-NEXT: ret ptr [[SPEC_SELECT]]
;
entry:
br i1 %c, label %if, label %join

if:
%v = load ptr, ptr %p, !nonnull !{}, !noundef !{}, !noalias.addrspace !4
br label %join

join:
%phi = phi ptr [ %v, %if ], [ null, %entry ]
ret ptr %phi
}

!0 = !{ i8 0, i8 1 }
!1 = !{ i8 3, i8 5 }
Expand All @@ -389,6 +431,9 @@ out:
!4 = !{i32 5, i32 6}
!5 = !{i32 5, i32 7}
!6 = !{i32 4, i32 8}
!7 = !{i32 4, i32 8, i32 20, i32 31}
!8 = !{i32 2, i32 5}
!9 = !{i32 2, i32 5, i32 22, i32 42, i32 45, i32 50}

;.
; CHECK: [[RNG0]] = !{i8 0, i8 1, i8 3, i8 5}
Expand All @@ -398,4 +443,6 @@ out:
; CHECK: [[RNG4]] = !{i32 0, i32 10}
; CHECK: [[META5]] = !{i64 4}
; CHECK: [[META6]] = !{float 2.500000e+00}
; CHECK: [[META7]] = !{i32 5, i32 6}
; CHECK: [[META8]] = !{i32 4, i32 5}
;.
Loading