[AArch64][FMV] Incorrect SVE detection on SVE2 cores with FMV #93651

@kinoshita-fj

Problem

Compiling the following program with FMV (function multi-versioning) and -rtlib=compiler-rt, then running it on a core that supports SVE2, results in the default version of the function being executed.
This indicates that the feature required by target_version("sve") is not detected on SVE2 cores, even though SVE2 implies SVE. Detection works correctly on SVE-only cores.

    #include <stdio.h>
    #include <stdlib.h>

    __attribute__ ((target_version("sve")))
    void func1(void){
        printf("SVE\n");
    }

    __attribute__ ((target_version("default")))
    void func1(void){
        printf("not SVE\n");
    }

    int main(int argc, char **argv){
        func1();
        return 0;
    }

Expected behavior

On SVE2 cores and SVE cores:

The func1 version with target_version("sve") should be executed and print SVE.

Actual behavior

On SVE2 cores:

The func1 version with target_version("default") is executed instead and prints not SVE.

On SVE cores:

The func1 version with target_version("sve") is executed correctly and prints SVE.

Environment

Reproduced with at least Clang 18.1.2 and the latest main branch (commit b80d982).

Cause

The most likely cause is that the conditions used to determine SVE support are incorrect.

They are evaluated in the __init_cpu_features_constructor function, which populates the CPU feature set used by FMV.

llvm-project/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/mrs.inc

    // ID_AA64PFR0_EL1.SVE != 0b0000
    if (extractBits(ftr, 32, 4) != 0x0) {
      // get ID_AA64ZFR0_EL1, that name supported
      // if sve enabled only
      getCPUFeature(S3_0_C0_C4_4, ftr);
      // ID_AA64ZFR0_EL1.SVEver == 0b0000
      if (extractBits(ftr, 0, 4) == 0x0)
        setCPUFeature(FEAT_SVE);
      // ID_AA64ZFR0_EL1.SVEver == 0b0001
      if (extractBits(ftr, 0, 4) == 0x1)
        setCPUFeature(FEAT_SVE2);
      // ID_AA64ZFR0_EL1.BF16 != 0b0000
      if (extractBits(ftr, 20, 4) != 0x0)
        setCPUFeature(FEAT_SVE_BF16);
    }

Possible Fix

Cores that support SVE2 also support SVE, so FEAT_SVE should be set whenever ID_AA64PFR0_EL1.SVE is nonzero, regardless of the SVEver value. The code should be modified as follows.

    // ID_AA64PFR0_EL1.SVE != 0b0000
    if (extractBits(ftr, 32, 4) != 0x0) {
      // get ID_AA64ZFR0_EL1, that name supported
      // if sve enabled only
      getCPUFeature(S3_0_C0_C4_4, ftr);
-     // ID_AA64ZFR0_EL1.SVEver == 0b0000
-     if (extractBits(ftr, 0, 4) == 0x0)
        setCPUFeature(FEAT_SVE);
      // ID_AA64ZFR0_EL1.SVEver == 0b0001
      if (extractBits(ftr, 0, 4) == 0x1)
        setCPUFeature(FEAT_SVE2);
      // ID_AA64ZFR0_EL1.BF16 != 0b0000
      if (extractBits(ftr, 20, 4) != 0x0)
        setCPUFeature(FEAT_SVE_BF16);
    }
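
The difference between the current and the proposed checks can be traced off-target with a small standalone sketch. It is not the compiler-rt code: `extract_bits`, `detect_current`, `detect_proposed` and the `SIM_FEAT_*` bits are hypothetical stand-ins, and the ID_AA64ZFR0_EL1 value is passed in as a plain integer instead of being read with `mrs`. Both functions assume the outer ID_AA64PFR0_EL1.SVE != 0b0000 check has already passed.

```c
#include <stdint.h>

/* Hypothetical feature bits for this sketch only; the real FEAT_*
   values are defined in compiler-rt and are not reproduced here. */
enum { SIM_FEAT_SVE = 1u << 0, SIM_FEAT_SVE2 = 1u << 1 };

/* Return `len` bits of `val` starting at bit `start` (valid for len < 64). */
static uint64_t extract_bits(uint64_t val, unsigned start, unsigned len) {
  return (val >> start) & ((UINT64_C(1) << len) - 1);
}

/* Mirrors the current checks: SVE is reported only when SVEver == 0b0000. */
static unsigned detect_current(uint64_t zfr0) {
  unsigned feats = 0;
  if (extract_bits(zfr0, 0, 4) == 0x0)
    feats |= SIM_FEAT_SVE;
  if (extract_bits(zfr0, 0, 4) == 0x1)
    feats |= SIM_FEAT_SVE2;
  return feats;
}

/* Mirrors the proposed checks: SVE is reported unconditionally. */
static unsigned detect_proposed(uint64_t zfr0) {
  unsigned feats = SIM_FEAT_SVE;
  if (extract_bits(zfr0, 0, 4) == 0x1)
    feats |= SIM_FEAT_SVE2;
  return feats;
}
```

With SVEver == 0b0001, `detect_current` returns only the SVE2 bit, which matches the observed fallback to the default version, while `detect_proposed` returns both the SVE and SVE2 bits. With SVEver == 0b0000, both functions return only the SVE bit.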

A similar issue has been reported against the ACLE, but it has not been fixed yet.
