-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[FMV][AArch64] Unify aes with pmull and sve2-aes with sve2-pmull128. #111673
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
According to the Arm Architecture Reference Manual for A-profile architecture you can't have one feature without having the other: ID_AA64ZFR0_EL1.AES, bits [7:4] > FEAT_SVE_AES implements the functionality identified by the value 0b0001. > FEAT_SVE_PMULL128 implements the functionality identified by the value 0b0010. > The permitted values are 0b0000 and 0b0010. (The following was removed from the latest release of the specification, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported) ID_AA64ISAR0_EL1.AES, bits [7:4] > FEAT_AES implements the functionality identified by the value 0b0001. > FEAT_PMULL implements the functionality identified by the value 0b0010. > From Armv8, the permitted values are 0b0000 and 0b0010. Reviewed in ACLE as ARM-software/acle#352
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-clang Author: Alexandros Lamprineas (labrinea) ChangesAccording to the Arm Architecture Reference Manual for A-profile architecture you can't have one feature without having the other: ID_AA64ZFR0_EL1.AES, bits [7:4] > FEAT_SVE_AES implements the functionality identified by the value 0b0001. (The following was removed from the latest release of the specification, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported) ID_AA64ISAR0_EL1.AES, bits [7:4] > FEAT_AES implements the functionality identified by the value 0b0001. Reviewed in ACLE as ARM-software/acle#352 Patch is 31.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111673.diff 12 Files Affected:
diff --git a/clang/test/CodeGen/aarch64-cpu-supports.c b/clang/test/CodeGen/aarch64-cpu-supports.c
index 823bf369df6fcc..016048d9c8fe2d 100644
--- a/clang/test/CodeGen/aarch64-cpu-supports.c
+++ b/clang/test/CodeGen/aarch64-cpu-supports.c
@@ -1,7 +1,24 @@
-// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals --version 2
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals --include-generated-funcs --global-value-regex ".*" --version 2
// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -emit-llvm -o - %s | FileCheck %s
+int main(void) {
+ if (__builtin_cpu_supports("sb"))
+ return 1;
+
+ if (__builtin_cpu_supports("sve2-aes+memtag"))
+ return 2;
+
+ if (__builtin_cpu_supports("sme2+ls64+wfxt"))
+ return 3;
+
+ if (__builtin_cpu_supports("avx2"))
+ return 4;
+
+ return 0;
+}
+//.
// CHECK: @__aarch64_cpu_features = external dso_local global { i64 }
+//.
// CHECK-LABEL: define dso_local i32 @main
// CHECK-SAME: () #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: entry:
@@ -45,18 +62,7 @@
// CHECK-NEXT: [[TMP12:%.*]] = load i32, ptr [[RETVAL]], align 4
// CHECK-NEXT: ret i32 [[TMP12]]
//
-int main(void) {
- if (__builtin_cpu_supports("sb"))
- return 1;
-
- if (__builtin_cpu_supports("sve2-pmull128+memtag"))
- return 2;
-
- if (__builtin_cpu_supports("sme2+ls64+wfxt"))
- return 3;
-
- if (__builtin_cpu_supports("avx2"))
- return 4;
-
- return 0;
-}
+//.
+// CHECK: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
+// CHECK: [[META1:![0-9]+]] = !{!"{{.*}}clang version {{.*}}"}
+//.
diff --git a/clang/test/CodeGen/aarch64-fmv-dependencies.c b/clang/test/CodeGen/aarch64-fmv-dependencies.c
index f4229a5d233970..5ceb7c7ff7bc03 100644
--- a/clang/test/CodeGen/aarch64-fmv-dependencies.c
+++ b/clang/test/CodeGen/aarch64-fmv-dependencies.c
@@ -3,7 +3,7 @@
// RUN: %clang --target=aarch64-linux-gnu --rtlib=compiler-rt -emit-llvm -S -o - %s | FileCheck %s
-// CHECK: define dso_local i32 @fmv._Maes() #[[ATTR0:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Maes() #[[aes:[0-9]+]] {
__attribute__((target_version("aes"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Mbf16() #[[bf16_ebf16:[0-9]+]] {
@@ -84,9 +84,6 @@ __attribute__((target_version("memtag3"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Mmops() #[[mops:[0-9]+]] {
__attribute__((target_version("mops"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Mpmull() #[[pmull:[0-9]+]] {
-__attribute__((target_version("pmull"))) int fmv(void) { return 0; }
-
// CHECK: define dso_local i32 @fmv._Mpredres() #[[predres:[0-9]+]] {
__attribute__((target_version("predres"))) int fmv(void) { return 0; }
@@ -153,15 +150,12 @@ __attribute__((target_version("sve-i8mm"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msve2() #[[sve2:[0-9]+]] {
__attribute__((target_version("sve2"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Msve2-aes() #[[sve2_aes_sve2_pmull128:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Msve2-aes() #[[sve2_aes:[0-9]+]] {
__attribute__((target_version("sve2-aes"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msve2-bitperm() #[[sve2_bitperm:[0-9]+]] {
__attribute__((target_version("sve2-bitperm"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Msve2-pmull128() #[[sve2_aes_sve2_pmull128:[0-9]+]] {
-__attribute__((target_version("sve2-pmull128"))) int fmv(void) { return 0; }
-
// CHECK: define dso_local i32 @fmv._Msve2-sha3() #[[sve2_sha3:[0-9]+]] {
__attribute__((target_version("sve2-sha3"))) int fmv(void) { return 0; }
@@ -180,10 +174,11 @@ int caller() {
return fmv();
}
-// CHECK: attributes #[[ATTR0]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[bf16_ebf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[bti]] = { {{.*}} "target-features"="+bti,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[crc]] = { {{.*}} "target-features"="+crc,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[ATTR0]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[dit]] = { {{.*}} "target-features"="+dit,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[dotprod]] = { {{.*}} "target-features"="+dotprod,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[dpb]] = { {{.*}} "target-features"="+ccpp,+fp-armv8,+neon,+outline-atomics,+v8a"
@@ -202,7 +197,6 @@ int caller() {
// CHECK: attributes #[[lse]] = { {{.*}} "target-features"="+fp-armv8,+lse,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[memtag2]] = { {{.*}} "target-features"="+fp-armv8,+mte,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[mops]] = { {{.*}} "target-features"="+fp-armv8,+mops,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[pmull]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[predres]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+predres,+v8a"
// CHECK: attributes #[[rcpc]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+v8a"
// CHECK: attributes #[[rcpc3]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+rcpc3,+v8a"
@@ -221,7 +215,7 @@ int caller() {
// CHECK: attributes #[[sve_bf16_ebf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve_i8mm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+i8mm,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
-// CHECK: attributes #[[sve2_aes_sve2_pmull128]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-aes,+v8a"
+// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-aes,+v8a"
// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-sha3,+v8a"
// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-sm4,+v8a"
diff --git a/clang/test/CodeGen/attr-target-clones-aarch64.c b/clang/test/CodeGen/attr-target-clones-aarch64.c
index 292e544139e3ff..cda1bbc48da197 100644
--- a/clang/test/CodeGen/attr-target-clones-aarch64.c
+++ b/clang/test/CodeGen/attr-target-clones-aarch64.c
@@ -64,8 +64,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 16512
-// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 16512
+// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 32896
+// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 32896
// CHECK-NEXT: [[TMP3:%.*]] = and i1 true, [[TMP2]]
// CHECK-NEXT: br i1 [[TMP3]], label [[RESOLVER_RETURN:%.*]], label [[RESOLVER_ELSE:%.*]]
// CHECK: resolver_return:
@@ -360,8 +360,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-NEXT: resolver_entry:
// CHECK-NEXT: call void @__init_cpu_features_resolver()
// CHECK-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 18014535948435456
-// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 18014535948435456
+// CHECK-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 18014673387388928
+// CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 18014673387388928
// CHECK-NEXT: [[TMP3:%.*]] = and i1 true, [[TMP2]]
// CHECK-NEXT: br i1 [[TMP3]], label [[RESOLVER_RETURN:%.*]], label [[RESOLVER_ELSE:%.*]]
// CHECK: resolver_return:
@@ -521,8 +521,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-MTE-BTI-NEXT: resolver_entry:
// CHECK-MTE-BTI-NEXT: call void @__init_cpu_features_resolver()
// CHECK-MTE-BTI-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-MTE-BTI-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 16512
-// CHECK-MTE-BTI-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 16512
+// CHECK-MTE-BTI-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 32896
+// CHECK-MTE-BTI-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 32896
// CHECK-MTE-BTI-NEXT: [[TMP3:%.*]] = and i1 true, [[TMP2]]
// CHECK-MTE-BTI-NEXT: br i1 [[TMP3]], label [[RESOLVER_RETURN:%.*]], label [[RESOLVER_ELSE:%.*]]
// CHECK-MTE-BTI: resolver_return:
@@ -817,8 +817,8 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-MTE-BTI-NEXT: resolver_entry:
// CHECK-MTE-BTI-NEXT: call void @__init_cpu_features_resolver()
// CHECK-MTE-BTI-NEXT: [[TMP0:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-MTE-BTI-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 18014535948435456
-// CHECK-MTE-BTI-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 18014535948435456
+// CHECK-MTE-BTI-NEXT: [[TMP1:%.*]] = and i64 [[TMP0]], 18014673387388928
+// CHECK-MTE-BTI-NEXT: [[TMP2:%.*]] = icmp eq i64 [[TMP1]], 18014673387388928
// CHECK-MTE-BTI-NEXT: [[TMP3:%.*]] = and i1 true, [[TMP2]]
// CHECK-MTE-BTI-NEXT: br i1 [[TMP3]], label [[RESOLVER_RETURN:%.*]], label [[RESOLVER_ELSE:%.*]]
// CHECK-MTE-BTI: resolver_return:
diff --git a/clang/test/CodeGen/attr-target-version.c b/clang/test/CodeGen/attr-target-version.c
index 22a53c82bfbf9f..ce4b6d3c71cef0 100644
--- a/clang/test/CodeGen/attr-target-version.c
+++ b/clang/test/CodeGen/attr-target-version.c
@@ -24,7 +24,7 @@ int foo() {
return fmv()+fmv_one()+fmv_two();
}
-inline int __attribute__((target_version("sha2+pmull+f64mm"))) fmv_inline(void) { return 1; }
+inline int __attribute__((target_version("sha2+aes+f64mm"))) fmv_inline(void) { return 1; }
inline int __attribute__((target_version("fp16+fcma+rdma+sme+ fp16 "))) fmv_inline(void) { return 2; }
inline int __attribute__((target_version("sha3+i8mm+f32mm"))) fmv_inline(void) { return 12; }
inline int __attribute__((target_version("dit+sve-ebf16"))) fmv_inline(void) { return 8; }
@@ -33,7 +33,7 @@ inline int __attribute__((target_version(" dpb2 + jscvt"))) fmv_inline(void) { r
inline int __attribute__((target_version("rcpc+frintts"))) fmv_inline(void) { return 3; }
inline int __attribute__((target_version("sve+sve-bf16"))) fmv_inline(void) { return 4; }
inline int __attribute__((target_version("sve2-aes+sve2-sha3"))) fmv_inline(void) { return 5; }
-inline int __attribute__((target_version("sve2+sve2-pmull128+sve2-bitperm"))) fmv_inline(void) { return 9; }
+inline int __attribute__((target_version("sve2+sve2-aes+sve2-bitperm"))) fmv_inline(void) { return 9; }
inline int __attribute__((target_version("sve2-sm4+memtag2"))) fmv_inline(void) { return 10; }
inline int __attribute__((target_version("memtag3+rcpc3+mops"))) fmv_inline(void) { return 11; }
inline int __attribute__((target_version("aes+dotprod"))) fmv_inline(void) { return 13; }
@@ -242,14 +242,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_two._Mfp
-// CHECK-SAME: () #[[ATTR5]] {
+// CHECK-SAME: () #[[ATTR12:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_two._Msimd
-// CHECK-SAME: () #[[ATTR5]] {
+// CHECK-SAME: () #[[ATTR12]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
//
@@ -263,7 +263,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_two._Mfp16Msimd
-// CHECK-SAME: () #[[ATTR12:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR13:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 4
//
@@ -296,7 +296,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_c._Mssbs
-// CHECK-SAME: () #[[ATTR13:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR14:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret void
//
@@ -354,14 +354,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_forward_default_decl._Mmops
-// CHECK-SAME: () #[[ATTR15:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR16:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_implicit_extern_forward_default_decl._Mdotprod
-// CHECK-SAME: () #[[ATTR16:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR17:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -375,7 +375,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_default_def._Msve
-// CHECK-SAME: () #[[ATTR17:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR18:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -389,7 +389,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_implicit_default_def._Mfp16
-// CHECK-SAME: () #[[ATTR12]] {
+// CHECK-SAME: () #[[ATTR13]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -410,14 +410,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_implicit_forward_default_def._Mlse
-// CHECK-SAME: () #[[ATTR18:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR19:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_without_default._Mrdm
-// CHECK-SAME: () #[[ATTR19:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR20:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -431,14 +431,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@used_def_without_default_decl._Mjscvt
-// CHECK-SAME: () #[[ATTR21:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR22:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@used_def_without_default_decl._Mrdm
-// CHECK-SAME: () #[[ATTR19]] {
+// CHECK-SAME: () #[[ATTR20]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
//
@@ -508,8 +508,8 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
// CHECK-NEXT: ret ptr @fmv._Mfp16fmlMmemtag
// CHECK: resolver_else8:
// CHECK-NEXT: [[TMP20:%.*]] = load i64, ptr @__aarch64_cpu_features, align 8
-// CHECK-NEXT: [[TMP21:%.*]] = and i64 [[TMP20]], 16640
-// CHECK-NEXT: [[TMP22:%.*]] = icmp eq i64 [[TMP21]], 16640
+// CHECK-NEXT: [[TMP21:%.*]] = and i64 [[TMP20]], 33024
+// CHECK-NEXT: [[TMP22:%.*]] = icmp eq i64 [[TMP21]], 33024
// CHECK-NEXT: [[TMP23:%.*]] = and i1 true, [[TMP22]]
// CHECK-NEXT: br i1 [[TMP23]], label [[RESOLVER_RETURN9:%.*]], label [[RESOLVER_ELSE10:%.*]]
// CHECK: resolver_return9:
@@ -618,7 +618,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_d._Msb
-// CHECK-SAME: () #[[ATTR23:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR24:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -659,92 +659,92 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mf64mmMpmullMsha2
-// CHECK-SAME: () #[[ATTR24:[0-9]+]] {
+// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MaesMf64mmMsha2
+// CHECK-SAME: () #[[ATTR25:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MfcmaMfp16MrdmMsme
-// CHECK-SAME: () #[[ATTR25:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR26:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mf32mmMi8mmMsha3
-// CHECK-SAME: () #[[ATTR26:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR27:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 12
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MditMsve-ebf16
-// CHECK-SAME: () #[[ATTR27:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR28:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 8
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MdpbMrcpc2
-// CHECK-SAME: () #[[ATTR28:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR29:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 6
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mdpb2Mjscvt
-// CHECK-SAME: () #[[ATTR29:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR30:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 7
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MfrinttsMrcpc
-// CHECK-SAME: () #[[ATTR30:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR31:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 3
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MsveMsve-bf16
-// CHECK-SAME: () #[[ATTR31:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR32:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 4
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Msve2-aesMsve2-sha3
-// CHECK-SAME: () #[[ATTR32:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR33:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 5
//
//
// CHECK: Function Attrs: noinline nounwind optnone
-// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Msve2Msve2-bitpermMsve2-pmull128
-// CHECK-SAME: () #[[ATTR33:[0-9]+]] {
+// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Msve2Msve2-aesMsve2-bitperm
+// CHECK-SAME: () #[[ATTR34:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 9
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mmemtag2Msve2-sm4
-// CHECK-SAME: () #[[ATTR34:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR35:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 10
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._Mmemtag3MmopsMrcpc3
-// CHECK-SAME: () #[[ATTR35:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR36:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 11
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MaesMdotprod
-// CHECK-SAME: () #[[ATTR16]] {
+// CHECK-SAME: () #[[ATTR37:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 13
//
@@ -758,14 +758,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_inline._MfpMsm4
-// CHECK-SAME: () #[[ATTR36:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR38:[0-9]+]] {
// CHECK-NEXT: e...
[truncated]
|
@@ -87,9 +86,8 @@ def : FMVExtension<"sve-bf16", "FEAT_SVE_BF16", "+sve,+bf16,+fullfp16,+fp-armv8, | |||
def : FMVExtension<"sve-ebf16", "FEAT_SVE_EBF16", "+sve,+bf16,+fullfp16,+fp-armv8,+neon", 330>; | |||
def : FMVExtension<"sve-i8mm", "FEAT_SVE_I8MM", "+sve,+i8mm,+fullfp16,+fp-armv8,+neon", 340>; | |||
def : FMVExtension<"sve2", "FEAT_SVE2", "+sve2,+sve,+fullfp16,+fp-armv8,+neon", 370>; | |||
def : FMVExtension<"sve2-aes", "FEAT_SVE_AES", "+sve2,+sve,+sve2-aes,+fullfp16,+fp-armv8,+neon", 380>; | |||
def : FMVExtension<"sve2-aes", "FEAT_SVE_PMULL128", "+sve2,+sve,+aes,+sve2-aes,+fullfp16,+fp-armv8,+neon", 380>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like +sve2-aes
should not have been here before. It is correct that it is here now though.
@@ -59,7 +59,7 @@ enum CPUFeatures { | |||
FEAT_SVE_F32MM, | |||
FEAT_SVE_F64MM, | |||
FEAT_SVE2, | |||
FEAT_SVE_AES, | |||
RESERVED_FEAT_SVE_AES, // previously used and now ABI legacy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a more complete explanation of these reserved values somewhere in these files.
…lvm#111673) According to the Arm Architecture Reference Manual for A-profile architecture you can't have one feature without having the other: ID_AA64ZFR0_EL1.AES, bits [7:4] > FEAT_SVE_AES implements the functionality identified by the value 0b0001. > FEAT_SVE_PMULL128 implements the functionality identified by the value 0b0010. > The permitted values are 0b0000 and 0b0010. (The following was removed from the latest release of the specification, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported) ID_AA64ISAR0_EL1.AES, bits [7:4] > FEAT_AES implements the functionality identified by the value 0b0001. > FEAT_PMULL implements the functionality identified by the value 0b0010. > From Armv8, the permitted values are 0b0000 and 0b0010. Approved in ACLE as ARM-software/acle#352
According to the Arm Architecture Reference Manual for A-profile architecture you can't have one feature without having the other:
ID_AA64ZFR0_EL1.AES, bits [7:4]
(The following was removed from the latest release of the specification, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported)
ID_AA64ISAR0_EL1.AES, bits [7:4]
Approved in ACLE as ARM-software/acle#352