Description
Problem
Compiling the following program with function multi-versioning (FMV) and -rtlib=compiler-rt, then running it on a core that supports SVE2, executes the default version of the function.
This indicates that the target_version("sve") feature is not detected correctly on SVE2 cores, even though SVE2 implies SVE. Detection works as expected on SVE-only cores.
#include <stdio.h>
#include <stdlib.h>

__attribute__ ((target_version("sve")))
void func1(void){
    printf("SVE\n");
}

__attribute__ ((target_version("default")))
void func1(void){
    printf("not SVE\n");
}

int main(int argc, char **argv){
    func1();
    return 0;
}
Expected behavior
On both SVE2 cores and SVE-only cores:
The func1 with target_version("sve") should be executed, and it should print SVE.
Actual behavior
On SVE2 cores:
The func1 with target_version("default") is executed instead, and it prints not SVE.
On SVE-only cores:
The func1 with target_version("sve") is executed correctly, and it prints SVE.
Environment
Reproduced with at least Clang 18.1.2 and the latest main branch (commit b80d982).
Cause
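The most likely cause is that the condition used to detect SVE support is incorrect.
The check is in the __init_cpu_features_constructor function, which is generated when FMV features are used.
FEAT_SVE is set only when ID_AA64ZFR0_EL1.SVEver == 0b0000. On SVE2 cores SVEver is 0b0001, so FEAT_SVE is never set there.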
llvm-project/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/mrs.inc
// ID_AA64PFR0_EL1.SVE != 0b0000
if (extractBits(ftr, 32, 4) != 0x0) {
  // get ID_AA64ZFR0_EL1, that name supported
  // if sve enabled only
  getCPUFeature(S3_0_C0_C4_4, ftr);
  // ID_AA64ZFR0_EL1.SVEver == 0b0000
  if (extractBits(ftr, 0, 4) == 0x0)
    setCPUFeature(FEAT_SVE);
  // ID_AA64ZFR0_EL1.SVEver == 0b0001
  if (extractBits(ftr, 0, 4) == 0x1)
    setCPUFeature(FEAT_SVE2);
  // ID_AA64ZFR0_EL1.BF16 != 0b0000
  if (extractBits(ftr, 20, 4) != 0x0)
    setCPUFeature(FEAT_SVE_BF16);
}
Possible Fix
Cores that support SVE2 also support SVE, so FEAT_SVE should be set whenever ID_AA64PFR0_EL1.SVE is nonzero, regardless of SVEver. The code should be modified as follows.
  // ID_AA64PFR0_EL1.SVE != 0b0000
  if (extractBits(ftr, 32, 4) != 0x0) {
    // get ID_AA64ZFR0_EL1, that name supported
    // if sve enabled only
    getCPUFeature(S3_0_C0_C4_4, ftr);
-   // ID_AA64ZFR0_EL1.SVEver == 0b0000
-   if (extractBits(ftr, 0, 4) == 0x0)
    setCPUFeature(FEAT_SVE);
    // ID_AA64ZFR0_EL1.SVEver == 0b0001
    if (extractBits(ftr, 0, 4) == 0x1)
      setCPUFeature(FEAT_SVE2);
    // ID_AA64ZFR0_EL1.BF16 != 0b0000
    if (extractBits(ftr, 20, 4) != 0x0)
      setCPUFeature(FEAT_SVE_BF16);
  }
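To make the difference easy to check without SVE hardware, here is a minimal host-side sketch that models the two variants of the check as pure functions of the register values. The names extract_bits, detect_current, detect_proposed and the SKETCH_FEAT_* bits are hypothetical and exist only for this illustration; they are not the compiler-rt identifiers. The bit positions are taken from the snippet above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical feature bits for this sketch only; compiler-rt has its own enum.
enum {
    SKETCH_FEAT_SVE  = 1u << 0,
    SKETCH_FEAT_SVE2 = 1u << 1,
};

// Return `width` bits of `value`, starting at bit `lsb`.
static uint64_t extract_bits(uint64_t value, unsigned lsb, unsigned width) {
    assert(width > 0 && width < 64 && lsb + width <= 64);
    return (value >> lsb) & ((UINT64_C(1) << width) - 1);
}

// Models the current logic: SVE is reported only when SVEver == 0b0000.
static unsigned detect_current(uint64_t pfr0, uint64_t zfr0) {
    unsigned features = 0;
    if (extract_bits(pfr0, 32, 4) != 0x0) {      // ID_AA64PFR0_EL1.SVE
        if (extract_bits(zfr0, 0, 4) == 0x0)     // ID_AA64ZFR0_EL1.SVEver
            features |= SKETCH_FEAT_SVE;
        if (extract_bits(zfr0, 0, 4) == 0x1)
            features |= SKETCH_FEAT_SVE2;
    }
    return features;
}

// Models the proposed logic: SVE is reported whenever the SVE field is nonzero.
static unsigned detect_proposed(uint64_t pfr0, uint64_t zfr0) {
    unsigned features = 0;
    if (extract_bits(pfr0, 32, 4) != 0x0) {      // ID_AA64PFR0_EL1.SVE
        features |= SKETCH_FEAT_SVE;
        if (extract_bits(zfr0, 0, 4) == 0x1)     // ID_AA64ZFR0_EL1.SVEver
            features |= SKETCH_FEAT_SVE2;
    }
    return features;
}
```

With ID_AA64PFR0_EL1.SVE set and SVEver == 0b0001, detect_current reports only the SVE2 bit, which matches the observed fallback to the default version, while detect_proposed reports both SVE and SVE2. For SVEver == 0b0000 both functions report SVE only.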
A similar issue has been reported against the ACLE, but it has not been fixed yet.