powerpc: failure to optimize manual vec_nmsub implementation #129432

Open
folkertdev opened this issue Mar 2, 2025 · 1 comment

Given this code:

https://godbolt.org/z/3WxTM4Yao

#include <altivec.h>

vector float old(vector float a, vector float b, vector float c) {
    return vec_nmsub(a, b, c);
}

vector float new(vector float a, vector float b, vector float c) {
    return vec_neg(vec_madd(a, b, vec_neg(c)));
}
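For reference, both functions are meant to compute -(a*b - c) in every lane. The following is a minimal scalar sketch of that equivalence, not the actual intrinsic implementation: the `*_lane` helpers are hypothetical names, and the single rounding of the fused multiply-add is only approximated by evaluating in double, which is exact for the small inputs used here.

```c
#include <assert.h>

/* One lane of vec_madd(a, b, c): a*b + c.
 * Assumption: the fused single rounding is approximated by computing in double. */
static float madd_lane(float a, float b, float c) {
    return (float)((double)a * (double)b + (double)c);
}

/* One lane of vec_neg(x). */
static float neg_lane(float x) { return -x; }

/* The manual form from the issue: vec_neg(vec_madd(a, b, vec_neg(c))). */
static float nmsub_manual_lane(float a, float b, float c) {
    return neg_lane(madd_lane(a, b, neg_lane(c)));
}

/* The documented meaning of vec_nmsub(a, b, c): -(a*b - c). */
static float nmsub_reference_lane(float a, float b, float c) {
    return (float)-((double)a * (double)b - (double)c);
}

/* Hypothetical self-check on exactly representable inputs. */
static void check_nmsub_lanes(void) {
    assert(nmsub_manual_lane(2.0f, 3.0f, 10.0f) == 4.0f);   /* -(6 - 10)  */
    assert(nmsub_manual_lane(1.5f, 2.0f, 0.5f) == -2.5f);   /* -(3 - 0.5) */
    assert(nmsub_manual_lane(2.0f, 3.0f, 10.0f) ==
           nmsub_reference_lane(2.0f, 3.0f, 10.0f));
}
```

Since the two forms agree lane by lane, a single negative multiply-subtract instruction would be a valid lowering for both.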

On newer PowerPC CPUs, both functions compile to exactly the same assembly, as expected:

        xvnmsubasp 36, 34, 35
        vmr     2, 4
        blr

However, for older CPUs, the non-intrinsic implementation is not optimized into a single vnmsubfp:

old:
        vnmsubfp 2, 2, 3, 4
        blr

new:
        vspltisb 5, -1
        vslw 5, 5, 5
        vsubfp 4, 5, 4
        vmaddfp 2, 2, 3, 4
        vsubfp 2, 5, 2
        blr
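The unoptimized sequence appears to negate by subtracting from -0.0: `vspltisb` followed by `vslw` builds the sign-bit constant, and each `vsubfp` then computes -0.0 - x. The sketch below models one 32-bit lane of that trick in plain C. The helper names are hypothetical, and it assumes `float` is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 32-bit lane after `vspltisb 5, -1`: every byte is 0xFF. */
static uint32_t splat_minus_one_lane(void) { return 0xFFFFFFFFu; }

/* One lane of vslw: shift left by the low 5 bits of the shift operand. */
static uint32_t vslw_lane(uint32_t value, uint32_t shift) {
    return value << (shift & 31u);
}

/* Reinterpret a 32-bit pattern as a float without aliasing problems. */
static float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* One lane of `vsubfp vD, vA, vB`: vA - vB. */
static float vsubfp_lane(float a, float b) { return a - b; }

/* Hypothetical self-check of the negate-by-subtraction sequence. */
static void check_negate_sequence(void) {
    uint32_t ones = splat_minus_one_lane();
    uint32_t sign = vslw_lane(ones, ones);      /* 0xFFFFFFFF << 31 */
    assert(sign == 0x80000000u);                /* bit pattern of -0.0f */

    float neg_zero = bits_to_float(sign);
    assert(vsubfp_lane(neg_zero, 3.0f) == -3.0f);
    assert(vsubfp_lane(neg_zero, -1.5f) == 1.5f);
}
```

Read this way, the `new` sequence spends two instructions on the constant and two subtractions on the negations around `vmaddfp`, where `old` needs only the single `vnmsubfp`.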

This came up in rust-lang/stdarch#1734.

llvmbot commented Mar 2, 2025

@llvm/issue-subscribers-backend-powerpc

Author: Folkert de Vries (folkertdev)

