Commit 0aab344

Author: Simon Moll
[Clang] Allow "ext_vector_type" applied to Booleans
This is the `ext_vector_type` alternative to D81083.

This patch extends Clang to allow 'bool' as a valid vector element type
(attribute ext_vector_type) in C/C++.

This is intended as the canonical type for SIMD masks and facilitates
clean vector intrinsic declarations. Vectors of i1 are supported on IR
level and below down to many SIMD ISAs, such as AVX512, ARM SVE (fixed
vector length) and the VE target (NEC SX-Aurora TSUBASA).

The RFC on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D88905
1 parent 9e24f0f commit 0aab344

22 files changed, with 485 additions and 72 deletions

clang/docs/LanguageExtensions.rst

Lines changed: 65 additions & 0 deletions
@@ -445,6 +445,67 @@ NEON vector types are created using ``neon_vector_type`` and
     return v;
   }
 
+GCC vector types are created using the ``vector_size(N)`` attribute. The
+argument ``N`` specifies the number of bytes that will be allocated for an
+object of this type. The size has to be multiple of the size of the vector
+element type. For example:
+
+.. code-block:: c++
+
+  // OK: This declares a vector type with four 'int' elements
+  typedef int int4 __attribute__((vector_size(4 * sizeof(int))));
+
+  // ERROR: '11' is not a multiple of sizeof(int)
+  typedef int int_impossible __attribute__((vector_size(11)));
+
+  int4 foo(int4 a) {
+    int4 v;
+    v = a;
+    return v;
+  }
+
+
+Boolean Vectors
+---------------
+
+Clang also supports the ext_vector_type attribute with boolean element types in
+C and C++. For example:
+
+.. code-block:: c++
+
+  // legal for Clang, error for GCC:
+  typedef bool bool4 __attribute__((ext_vector_type(4)));
+  // Objects of bool4 type hold 8 bits, sizeof(bool4) == 1
+
+  bool4 foo(bool4 a) {
+    bool4 v;
+    v = a;
+    return v;
+  }
+
+Boolean vectors are a Clang extension of the ext vector type. Boolean vectors
+are intended, though not guaranteed, to map to vector mask registers. The size
+parameter of a boolean vector type is the number of bits in the vector. The
+boolean vector is dense and each bit in the boolean vector is one vector
+element.
+
+The semantics of boolean vectors borrows from C bit-fields with the following
+differences:
+
+* Distinct boolean vectors are always distinct memory objects (there is no
+  packing).
+* Only the operators `?:`, `!`, `~`, `|`, `&`, `^` and comparison are allowed on
+  boolean vectors.
+* Casting a scalar bool value to a boolean vector type means broadcasting the
+  scalar value onto all lanes (same as general ext_vector_type).
+* It is not possible to access or swizzle elements of a boolean vector
+  (different than general ext_vector_type).
+
+The size and alignment are both the number of bits rounded up to the next power
+of two, but the alignment is at most the maximum vector alignment of the
+target.
+
+
 Vector Literals
 ---------------

@@ -496,6 +557,7 @@ C-style cast yes yes yes no
 reinterpret_cast                 yes     no        yes         no
 static_cast                      yes     no        yes         no
 const_cast                       no      no        no          no
+address &v[i]                    no      no        no [#]_     no
 ============================== ======= ======= ============= =======

 See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convertvector`.
@@ -505,6 +567,9 @@ See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convert
   it's only available in C++ and uses normal bool conversions (that is, != 0).
   If it's an extension (OpenCL) vector, it's only available in C and OpenCL C.
   And it selects base on signedness of the condition operands (OpenCL v1.1 s6.3.9).
+.. [#] Clang does not allow the address of an element to be taken while GCC
+   allows this. This is intentional for vectors with a boolean element type and
+   not implemented otherwise.

 Vector Builtins
 ---------------

clang/include/clang/AST/Type.h

Lines changed: 7 additions & 0 deletions
@@ -2048,6 +2048,7 @@ class alignas(8) Type : public ExtQualsTypeCommonBase {
   bool isComplexIntegerType() const;            // GCC _Complex integer type.
   bool isVectorType() const;                    // GCC vector type.
   bool isExtVectorType() const;                 // Extended vector type.
+  bool isExtVectorBoolType() const;             // Extended vector type with bool element.
   bool isMatrixType() const;                    // Matrix type.
   bool isConstantMatrixType() const;            // Constant matrix type.
   bool isDependentAddressSpaceType() const;     // value-dependent address space qualifier
@@ -6809,6 +6810,12 @@ inline bool Type::isExtVectorType() const {
   return isa<ExtVectorType>(CanonicalType);
 }

+inline bool Type::isExtVectorBoolType() const {
+  if (!isExtVectorType())
+    return false;
+  return cast<ExtVectorType>(CanonicalType)->getElementType()->isBooleanType();
+}
+
 inline bool Type::isMatrixType() const {
   return isa<MatrixType>(CanonicalType);
 }

clang/include/clang/Sema/Sema.h

Lines changed: 2 additions & 1 deletion
@@ -11932,7 +11932,8 @@ class Sema final {
   /// type checking for vector binary operators.
   QualType CheckVectorOperands(ExprResult &LHS, ExprResult &RHS,
                                SourceLocation Loc, bool IsCompAssign,
-                               bool AllowBothBool, bool AllowBoolConversion);
+                               bool AllowBothBool, bool AllowBoolConversion,
+                               bool AllowBoolOperation, bool ReportInvalid);
   QualType GetSignedVectorType(QualType V);
   QualType CheckVectorCompareOperands(ExprResult &LHS, ExprResult &RHS,
                                       SourceLocation Loc,

clang/lib/AST/ASTContext.cpp

Lines changed: 5 additions & 2 deletions
@@ -1982,8 +1982,11 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
   case Type::Vector: {
     const auto *VT = cast<VectorType>(T);
     TypeInfo EltInfo = getTypeInfo(VT->getElementType());
-    Width = EltInfo.Width * VT->getNumElements();
-    Align = Width;
+    Width = VT->isExtVectorBoolType() ? VT->getNumElements()
+                                      : EltInfo.Width * VT->getNumElements();
+    // Enforce at least byte alignment.
+    Align = std::max<unsigned>(8, Width);
+
     // If the alignment is not a power of 2, round up to the next power of 2.
     // This happens for non-power-of-2 length vectors.
     if (Align & (Align-1)) {
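
The size and alignment rule touched by this hunk can be modelled outside Clang. The following is a minimal sketch in plain C++, assuming the rule as stated in the documentation part of this commit: one bit per element, at least one byte, rounded up to a power of two, with the alignment capped by the target. The function names and the `maxVectorAlignBits` parameter are hypothetical and do not exist in Clang, and the sketch does not reproduce the part of `getTypeInfoImpl` that lies outside the hunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Round v up to the next power of two (returns v if it already is one).
inline std::uint64_t roundUpToPowerOfTwo(std::uint64_t v) {
  assert(v > 0 && "a vector has at least one element");
  std::uint64_t p = 1;
  while (p < v)
    p <<= 1;
  return p;
}

// Hypothetical helper: documented size in bits of a boolean vector.
// One bit per lane, padded to at least one byte, then to a power of two.
inline std::uint64_t boolVectorSizeBits(std::uint64_t numElements) {
  return roundUpToPowerOfTwo(std::max<std::uint64_t>(8, numElements));
}

// Hypothetical helper: documented alignment in bits.
// maxVectorAlignBits stands in for the target limit; 0 means "no limit".
inline std::uint64_t boolVectorAlignBits(std::uint64_t numElements,
                                         std::uint64_t maxVectorAlignBits) {
  std::uint64_t align = boolVectorSizeBits(numElements);
  if (maxVectorAlignBits != 0 && maxVectorAlignBits < align)
    align = maxVectorAlignBits;
  return align;
}
```

Under these assumptions a four-lane vector occupies 8 bits, which agrees with the `sizeof(bool4) == 1` remark in the documentation hunk, and an 11-lane vector occupies 16 bits.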

clang/lib/CodeGen/CGDebugInfo.cpp

Lines changed: 17 additions & 0 deletions
@@ -3036,6 +3036,23 @@ llvm::DIType *CGDebugInfo::CreateTypeDefinition(const ObjCInterfaceType *Ty,

 llvm::DIType *CGDebugInfo::CreateType(const VectorType *Ty,
                                       llvm::DIFile *Unit) {
+  if (Ty->isExtVectorBoolType()) {
+    // Boolean ext_vector_type(N) are special because their real element type
+    // (bits of bit size) is not their Clang element type (_Bool of size byte).
+    // For now, we pretend the boolean vector were actually a vector of bytes
+    // (where each byte represents 8 bits of the actual vector).
+    // FIXME Debug info should actually represent this proper as a vector mask
+    // type.
+    auto &Ctx = CGM.getContext();
+    uint64_t Size = CGM.getContext().getTypeSize(Ty);
+    uint64_t NumVectorBytes = Size / Ctx.getCharWidth();
+
+    // Construct the vector of 'char' type.
+    QualType CharVecTy = Ctx.getVectorType(Ctx.CharTy, NumVectorBytes,
+                                           VectorType::GenericVector);
+    return CreateType(CharVecTy->getAs<VectorType>(), Unit);
+  }
+
   llvm::DIType *ElementTy = getOrCreateType(Ty->getElementType(), Unit);
   int64_t Count = Ty->getNumElements();

clang/lib/CodeGen/CGExpr.cpp

Lines changed: 71 additions & 26 deletions
@@ -1707,27 +1707,42 @@ llvm::Value *CodeGenFunction::EmitLoadOfScalar(Address Addr, bool Volatile,
                                                LValueBaseInfo BaseInfo,
                                                TBAAAccessInfo TBAAInfo,
                                                bool isNontemporal) {
-  if (!CGM.getCodeGenOpts().PreserveVec3Type) {
-    // For better performance, handle vector loads differently.
-    if (Ty->isVectorType()) {
-      const llvm::Type *EltTy = Addr.getElementType();
-
-      const auto *VTy = cast<llvm::FixedVectorType>(EltTy);
-
-      // Handle vectors of size 3 like size 4 for better performance.
-      if (VTy->getNumElements() == 3) {
-
-        // Bitcast to vec4 type.
-        auto *vec4Ty = llvm::FixedVectorType::get(VTy->getElementType(), 4);
-        Address Cast = Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
-        // Now load value.
-        llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");
-
-        // Shuffle vector to get vec3.
-        V = Builder.CreateShuffleVector(V, ArrayRef<int>{0, 1, 2},
-                                        "extractVec");
-        return EmitFromMemory(V, Ty);
-      }
+  if (const auto *ClangVecTy = Ty->getAs<VectorType>()) {
+    // Boolean vectors use `iN` as storage type.
+    if (ClangVecTy->isExtVectorBoolType()) {
+      llvm::Type *ValTy = ConvertType(Ty);
+      unsigned ValNumElems =
+          cast<llvm::FixedVectorType>(ValTy)->getNumElements();
+      // Load the `iP` storage object (P is the padded vector size).
+      auto *RawIntV = Builder.CreateLoad(Addr, Volatile, "load_bits");
+      const auto *RawIntTy = RawIntV->getType();
+      assert(RawIntTy->isIntegerTy() && "compressed iN storage for bitvectors");
+      // Bitcast iP --> <P x i1>.
+      auto *PaddedVecTy = llvm::FixedVectorType::get(
+          Builder.getInt1Ty(), RawIntTy->getPrimitiveSizeInBits());
+      llvm::Value *V = Builder.CreateBitCast(RawIntV, PaddedVecTy);
+      // Shuffle <P x i1> --> <N x i1> (N is the actual bit size).
+      V = emitBoolVecConversion(V, ValNumElems, "extractvec");
+
+      return EmitFromMemory(V, Ty);
+    }
+
+    // Handle vectors of size 3 like size 4 for better performance.
+    const llvm::Type *EltTy = Addr.getElementType();
+    const auto *VTy = cast<llvm::FixedVectorType>(EltTy);
+
+    if (!CGM.getCodeGenOpts().PreserveVec3Type && VTy->getNumElements() == 3) {
+
+      // Bitcast to vec4 type.
+      llvm::VectorType *vec4Ty =
+          llvm::FixedVectorType::get(VTy->getElementType(), 4);
+      Address Cast = Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
+      // Now load value.
+      llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");
+
+      // Shuffle vector to get vec3.
+      V = Builder.CreateShuffleVector(V, ArrayRef<int>{0, 1, 2}, "extractVec");
+      return EmitFromMemory(V, Ty);
     }
   }

@@ -1778,6 +1793,17 @@ llvm::Value *CodeGenFunction::EmitFromMemory(llvm::Value *Value, QualType Ty) {
            "wrong value rep of bool");
     return Builder.CreateTrunc(Value, Builder.getInt1Ty(), "tobool");
   }
+  if (Ty->isExtVectorBoolType()) {
+    const auto *RawIntTy = Value->getType();
+    // Bitcast iP --> <P x i1>.
+    auto *PaddedVecTy = llvm::FixedVectorType::get(
+        Builder.getInt1Ty(), RawIntTy->getPrimitiveSizeInBits());
+    auto *V = Builder.CreateBitCast(Value, PaddedVecTy);
+    // Shuffle <P x i1> --> <N x i1> (N is the actual bit size).
+    llvm::Type *ValTy = ConvertType(Ty);
+    unsigned ValNumElems = cast<llvm::FixedVectorType>(ValTy)->getNumElements();
+    return emitBoolVecConversion(V, ValNumElems, "extractvec");
+  }

   return Value;
 }
@@ -1822,11 +1848,19 @@ void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
                                         LValueBaseInfo BaseInfo,
                                         TBAAAccessInfo TBAAInfo,
                                         bool isInit, bool isNontemporal) {
-  if (!CGM.getCodeGenOpts().PreserveVec3Type) {
-    // Handle vectors differently to get better performance.
-    if (Ty->isVectorType()) {
-      llvm::Type *SrcTy = Value->getType();
-      auto *VecTy = dyn_cast<llvm::VectorType>(SrcTy);
+  llvm::Type *SrcTy = Value->getType();
+  if (const auto *ClangVecTy = Ty->getAs<VectorType>()) {
+    auto *VecTy = dyn_cast<llvm::FixedVectorType>(SrcTy);
+    if (VecTy && ClangVecTy->isExtVectorBoolType()) {
+      auto *MemIntTy =
+          cast<llvm::IntegerType>(Addr.getType()->getPointerElementType());
+      // Expand to the memory bit width.
+      unsigned MemNumElems = MemIntTy->getPrimitiveSizeInBits();
+      // <N x i1> --> <P x i1>.
+      Value = emitBoolVecConversion(Value, MemNumElems, "insertvec");
+      // <P x i1> --> iP.
+      Value = Builder.CreateBitCast(Value, MemIntTy);
+    } else if (!CGM.getCodeGenOpts().PreserveVec3Type) {
       // Handle vec3 special.
       if (VecTy && cast<llvm::FixedVectorType>(VecTy)->getNumElements() == 3) {
         // Our source is a vec3, do a shuffle vector to make it a vec4.
@@ -2063,8 +2097,19 @@ void CodeGenFunction::EmitStoreThroughLValue(RValue Src, LValue Dst,
       // Read/modify/write the vector, inserting the new element.
       llvm::Value *Vec = Builder.CreateLoad(Dst.getVectorAddress(),
                                             Dst.isVolatileQualified());
+      auto *IRStoreTy = dyn_cast<llvm::IntegerType>(Vec->getType());
+      if (IRStoreTy) {
+        auto *IRVecTy = llvm::FixedVectorType::get(
+            Builder.getInt1Ty(), IRStoreTy->getPrimitiveSizeInBits());
+        Vec = Builder.CreateBitCast(Vec, IRVecTy);
+        // iN --> <N x i1>.
+      }
       Vec = Builder.CreateInsertElement(Vec, Src.getScalarVal(),
                                         Dst.getVectorIdx(), "vecins");
+      if (IRStoreTy) {
+        // <N x i1> --> <iN>.
+        Vec = Builder.CreateBitCast(Vec, IRStoreTy);
+      }
       Builder.CreateStore(Vec, Dst.getVectorAddress(),
                           Dst.isVolatileQualified());
       return;
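
The load and store paths in this file convert between the `<N x i1>` value and a padded `iP` integer in memory. A minimal model of that round trip in plain C++ is sketched here. It assumes that lane 0 maps to the least significant bit (which is what an LLVM bitcast yields on a little-endian target) and that padding lanes are zero, whereas the IR leaves them undefined. `packLanes` and `unpackLanes` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Store direction: <N x i1> --> <P x i1> --> iP.
// Assumption: lane i lands in bit i; padding bits are set to zero here.
inline std::uint64_t packLanes(const std::vector<bool> &lanes) {
  assert(lanes.size() <= 64 && "this model keeps storage in one 64-bit word");
  std::uint64_t bits = 0;
  for (std::size_t i = 0; i < lanes.size(); ++i)
    if (lanes[i])
      bits |= std::uint64_t{1} << i;
  return bits;
}

// Load direction: iP --> <P x i1> --> <N x i1>.
// Only the low numLanes bits are kept; the padding bits are dropped.
inline std::vector<bool> unpackLanes(std::uint64_t bits, std::size_t numLanes) {
  assert(numLanes <= 64 && "this model keeps storage in one 64-bit word");
  std::vector<bool> lanes(numLanes);
  for (std::size_t i = 0; i < numLanes; ++i)
    lanes[i] = ((bits >> i) & 1u) != 0;
  return lanes;
}
```

In this model a store followed by a load returns the original lanes, and any bits above lane N-1 in the stored integer never reach the loaded value.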

clang/lib/CodeGen/CGExprScalar.cpp

Lines changed: 4 additions & 1 deletion
@@ -2147,7 +2147,6 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
       DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
       return EmitLoadOfLValue(DestLV, CE->getExprLoc());
     }
-
     return Builder.CreateBitCast(Src, DstTy);
   }
   case CK_AddressSpaceConversion: {
@@ -4818,6 +4817,10 @@ Value *ScalarExprEmitter::VisitAsTypeExpr(AsTypeExpr *E) {
           ? cast<llvm::FixedVectorType>(DstTy)->getNumElements()
           : 0;

+  // Use bit vector expansion for ext_vector_type boolean vectors.
+  if (E->getType()->isExtVectorBoolType())
+    return CGF.emitBoolVecConversion(Src, NumElementsDst, "astype");
+
   // Going from vec3 to non-vec3 is a special case and requires a shuffle
   // vector to get a vec4, then a bitcast if the target type is different.
   if (NumElementsSrc == 3 && NumElementsDst != 3) {

clang/lib/CodeGen/CodeGenFunction.cpp

Lines changed: 16 additions & 0 deletions
@@ -2765,3 +2765,19 @@ CodeGenFunction::emitCondLikelihoodViaExpectIntrinsic(llvm::Value *Cond,
   }
   llvm_unreachable("Unknown Likelihood");
 }
+
+llvm::Value *CodeGenFunction::emitBoolVecConversion(llvm::Value *SrcVec,
+                                                    unsigned NumElementsDst,
+                                                    const llvm::Twine &Name) {
+  auto *SrcTy = cast<llvm::FixedVectorType>(SrcVec->getType());
+  unsigned NumElementsSrc = SrcTy->getNumElements();
+  if (NumElementsSrc == NumElementsDst)
+    return SrcVec;
+
+  std::vector<int> ShuffleMask(NumElementsDst, -1);
+  for (unsigned MaskIdx = 0;
+       MaskIdx < std::min<>(NumElementsDst, NumElementsSrc); ++MaskIdx)
+    ShuffleMask[MaskIdx] = MaskIdx;
+
+  return Builder.CreateShuffleVector(SrcVec, ShuffleMask, Name);
+}
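
The mask construction in `emitBoolVecConversion` can be checked in isolation. The sketch below reproduces only that loop in plain C++, without any LLVM types; `buildBoolVecShuffleMask` is a hypothetical name. An entry of -1 in an LLVM shuffle mask marks a lane whose value is undefined, so widening pads with undefined lanes and narrowing simply drops the tail.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-alone version of the mask loop: the first
// min(src, dst) lanes are copied through, remaining lanes stay at -1.
inline std::vector<int> buildBoolVecShuffleMask(unsigned numElementsSrc,
                                                unsigned numElementsDst) {
  assert(numElementsDst > 0 && "a fixed vector has at least one lane");
  std::vector<int> mask(numElementsDst, -1);
  const unsigned common = std::min(numElementsSrc, numElementsDst);
  for (unsigned i = 0; i < common; ++i)
    mask[i] = static_cast<int>(i);
  return mask;
}
```

The patched function returns the source vector unchanged when both lengths match; the sketch would produce an identity mask in that case, which selects the same lanes.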

clang/lib/CodeGen/CodeGenFunction.h

Lines changed: 5 additions & 0 deletions
@@ -4649,6 +4649,11 @@ class CodeGenFunction : public CodeGenTypeCache {
   /// Set the codegen fast-math flags.
   void SetFastMathFlags(FPOptions FPFeatures);

+  // Truncate or extend a boolean vector to the requested number of elements.
+  llvm::Value *emitBoolVecConversion(llvm::Value *SrcVec,
+                                     unsigned NumElementsDst,
+                                     const llvm::Twine &Name = "");
+
 private:
   llvm::MDNode *getRangeForLoadFromType(QualType Ty);
   void EmitReturnOfRValue(RValue RV, QualType Ty);

clang/lib/CodeGen/CodeGenTypes.cpp

Lines changed: 14 additions & 3 deletions
@@ -98,6 +98,14 @@ llvm::Type *CodeGenTypes::ConvertTypeForMem(QualType T, bool ForBitField) {

   llvm::Type *R = ConvertType(T);

+  // Check for the boolean vector case.
+  if (T->isExtVectorBoolType()) {
+    auto *FixedVT = cast<llvm::FixedVectorType>(R);
+    // Pad to at least one byte.
+    uint64_t BytePadded = std::max<uint64_t>(FixedVT->getNumElements(), 8);
+    return llvm::IntegerType::get(FixedVT->getContext(), BytePadded);
+  }
+
   // If this is a bool type, or a bit-precise integer type in a bitfield
   // representation, map this integer to the target-specified size.
   if ((ForBitField && T->isBitIntType()) ||
@@ -701,9 +709,12 @@ llvm::Type *CodeGenTypes::ConvertType(QualType T) {
   }
   case Type::ExtVector:
   case Type::Vector: {
-    const VectorType *VT = cast<VectorType>(Ty);
-    ResultType = llvm::FixedVectorType::get(ConvertType(VT->getElementType()),
-                                            VT->getNumElements());
+    const auto *VT = cast<VectorType>(Ty);
+    // An ext_vector_type of Bool is really a vector of bits.
+    llvm::Type *IRElemTy = VT->isExtVectorBoolType()
+                               ? llvm::Type::getInt1Ty(getLLVMContext())
+                               : ConvertType(VT->getElementType());
+    ResultType = llvm::FixedVectorType::get(IRElemTy, VT->getNumElements());
     break;
   }
   case Type::ConstantMatrix: {
