Skip to content

Remove __attribute__((target_clones)) requirement in function prototype #61219

@HFTrader

Description

@HFTrader

Let's say I have a library that I want to share with multiple targets

#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

__attribute__((target_clones("default","arch=core2","arch=znver2")))
Vector multiply(const Matrix& m, const Vector& v) {
    Vector r;
    r[0] = v[0] * m[0] + v[1] * m[2];
    r[1] = v[0] * m[1] + v[1] * m[3];
    return r;
}

and I want to use that as below

#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

Vector multiply(const Matrix& m, const Vector& v);

int main() {
    Matrix m{1,2,3,4};
    Vector v{1,2};
    Vector r = multiply(m,v);
    printf( "%f %f\n", r[0], r[1] );
}

Godbolt project: https://godbolt.org/z/3hd4MrzsG

GCC will be happy to compile and link the above two files together but Clang++ will complain about undefined references.

ld: CMakeFiles/example.dir/example.cpp.o: in function `main':
example.cpp:(.text+0x2a): undefined reference to `multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&)'

To make this work, I have to add all the targets in the prototype as well as in

#include <array>
#include <cstdio>

using Vector = std::array<float, 2>;
using Matrix = std::array<float, 4>;

__attribute__((target_clones("default","arch=core2","arch=znver2")))
Vector multiply(const Matrix& m, const Vector& v);

int main() {
    Matrix m{1,2,3,4};
    Vector v{1,2};
    Vector r = multiply(m,v);
    printf( "%f %f\n", r[0], r[1] );
}

Looking at the object files generated, it seeems that the main difference is that GCC generates an indirect link for the naked prototype while Clang generate an indirect link to .ifunc and that requires knowledge of the target attributes.

$ diff nm.gcc nm.clang
3,4d2
<                 U _GLOBAL_OFFSET_TABLE_
<                 U __stack_chk_fail
6,13c4,11
< i multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&)
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_cascadelake.3]
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_core2.0
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_haswell.2]
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_sandybridge.1]
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver1.4
< t _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver2.5
< t multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .default.6]
---
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_cascadelake.3]
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_core2.0
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_haswell.2]
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .arch_sandybridge.1]
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver1.4
> T _Z8multiplyRKSt5arrayIfLm4EERKS_IfLm2EE.arch_znver2.5
> T multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .default.6]
> i multiply(std::array<float, 4ul> const&, std::array<float, 2ul> const&) [clone .ifunc]
20d17
< r std::piecewise_construct

The request is that Clang emits targets in a way that the target attributes won't be necessary in the function declaration/prototype.

Metadata

Metadata

Assignees

No one assigned

    Labels

    c++clang:codegenIR generation bugs: mangling, exceptions, etc.enhancementImproving things as opposed to bug fixing, e.g. new or missing feature

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions