LLVM ERROR: Cannot select: intrinsic %llvm.x86.aesni.aesenc #94326

Closed
smsxgli opened this issue Feb 24, 2022 · 24 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. C-bug Category: This is a bug. O-x86_64 Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64)

Comments

@smsxgli

smsxgli commented Feb 24, 2022

Related issue in LLVM

Compiling Firefox with rustc and clang failed during the PGO build, while linking libxul.so with ThinLTO:

32:33.70 LLVM ERROR: Cannot select: intrinsic %llvm.x86.aesni.aesenc

I expected to see this happen: compilation succeeds

Instead, this happened: compilation failed

Meta

Built with stable Rust, rustc version 1.58.1.

The same error on FreeBSD can be found here

OS: Arch Linux
llvm, clang and lld version: 13.0.1
CPU: Intel(R) Core(TM) i7-4790 CPU, haswell
CFLAGS: -march=native -mtune=native -O2 -pipe -fno-plt -fexceptions -fasynchronous-unwind-tables -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-protector-strong -fstack-clash-protection -fcf-protection
CXXFLAGS: $CFLAGS -Wp,-D_GLIBCXX_ASSERTIONS
LDFLAGS: -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now,-z,noexecstack
RUSTFLAGS: -C opt-level=2 -C target-cpu=native

Building with the same configuration succeeds on an Intel(R) Core(TM) i7-11800H CPU (tigerlake).

The LLVM developer's opinion can be found here, please take a look.

@smsxgli smsxgli added the C-bug Category: This is a bug. label Feb 24, 2022
@workingjubilee
Member

I believe this is happening due to the get_mut function on hashbrown depending on a hasher which is inlined, specifically it appears to be using
https://searchfox.org/mozilla-central/source/third_party/rust/ahash/src/aes_hash.rs
But the code appears to cfg on a feature test correctly, so I am not sure what is going on.
https://searchfox.org/mozilla-central/source/third_party/rust/ahash/src/operations.rs
We can see the IR for this instruction is in fact linked in stdarch:
https://github.com/rust-lang/stdarch/blob/6495bb0e33578443c21764655a4dd8b55399c008/crates/core_arch/src/x86/aes.rs#L21-L22

@smsxgli
Author

smsxgli commented Feb 24, 2022

Well, in fact I am not very familiar with Rust. So this may not be a bug in rustc? Just a bug in a crate?

@workingjubilee
Member

Possibly. Can't say for sure at the moment, it might be an error in how we steer inlining and feature annotations in general. Also, what's weird is that an i7-4790 CPU has AESNI support, so there's no clear reason why building it natively shouldn't work.

@smsxgli
Author

smsxgli commented Feb 24, 2022

Got it. If more information is needed, please let me know.

@workingjubilee workingjubilee added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. O-x86_64 Target: x86-64 processors (like x86_64-*) (also known as amd64 and x64) labels Feb 24, 2022
@eddyb
Member

eddyb commented Feb 24, 2022

Does this happen during the first build or the post-profiling one that does PGO?
-march=native -mtune=native/-C target-cpu=native seem suspicious with PGO, is every step run locally on the same CPU? It seems like it would be very easy for those to get out of sync.

If PGO is not relevant, that's more worrying - but also, I'm surprised the LLVM devs didn't bring up PGO.

@smsxgli
Author

smsxgli commented Feb 24, 2022

This occurs in the second build, i.e. the post-profiling one that does PGO, with ThinLTO.

@smsxgli
Author

smsxgli commented Feb 24, 2022

And yes, every step ran on the same CPU. I build Firefox locally on my own PC.

@workingjubilee
Member

Oh. I don't believe the Rust toolchain has anything to do with PGO or ThinLTO except sometimes turning it on via flags, am I wrong?

@smsxgli
Author

smsxgli commented Feb 24, 2022

I think ThinLTO is handled by lld, and PGO is handled by LLVM. Firefox is built with PGO: the first build is done without LTO, and the post-profiling one uses ThinLTO across clang and rustc. As I said, I am using Arch Linux, so if needed I can post my build script (PKGBUILD), which contains the build steps.

@smsxgli
Author

smsxgli commented Feb 24, 2022

@eddyb
Member

eddyb commented Feb 24, 2022

> And yes, every step ran on the same CPU.

What happens if you replace all occurrences of "native" with the appropriate value? (haswell, I assume?)
So -C target-cpu=haswell in RUSTFLAGS and -mcpu=haswell in CFLAGS.

Right now I'm expecting either native detection to silently fail (is any containerization done as part of the build, that could hide some CPU details somehow?) or something to be mismatched between Clang and rustc.

@smsxgli
Author

smsxgli commented Feb 24, 2022

OK, I will give it a try.
During the build, I used what Arch Linux calls a "clean build", which uses systemd-nspawn; I am afraid it's not containerization.
FreeBSD has the same problem according to this

@smsxgli
Author

smsxgli commented Feb 24, 2022

As I mentioned, with the same environment (clang, rustc and so on at the same versions, and using systemd-nspawn) and the same configuration (same CFLAGS, CXXFLAGS, RUSTFLAGS and LDFLAGS), this error does not occur on an Intel(R) Core(TM) i7-11800H CPU (tigerlake).

@smsxgli
Author

smsxgli commented Feb 25, 2022

Well, with stable Rust version 1.59, changing RUSTFLAGS and CFLAGS as @eddyb suggested fixes this problem. Now I will test again with Rust 1.58.1.

@smsxgli
Author

smsxgli commented Feb 25, 2022

Well, with Rust 1.58.1 and -C target-cpu=haswell in RUSTFLAGS and -mcpu=haswell in CFLAGS, this problem disappeared and the build succeeded. Very strange indeed.

@smsxgli
Author

smsxgli commented Feb 25, 2022

I tried again and can now confirm that -mcpu=native in CFLAGS together with -C target-cpu=native in RUSTFLAGS leads to this LLVM error, but -mcpu=haswell in CFLAGS with -C target-cpu=haswell in RUSTFLAGS does not.

@smsxgli
Author

smsxgli commented Feb 25, 2022

Finally, I am sure that -C target-cpu=native in RUSTFLAGS leads to this LLVM error. With -mcpu=native in CFLAGS and -C target-cpu=haswell in RUSTFLAGS, the build succeeds.

@eddyb
Member

eddyb commented Feb 25, 2022

Thanks! It would be useful to get an "expansion" of what -C target-cpu=native is considered to be, for both the original build and the PGO one - my best guess is that something about the PGO build confuses the autodetection?

Sadly, I'm not sure how to do this, as both builds are probably deeply nested within Firefox's build system.
Perhaps a RUSTC_WRAPPER script that runs rustc -C target-cpu=native --print=cfg >> rustc-cfg-log before proxying the original invocation to rustc?
(It would create a lot of repeats but I assume only the first and last copies really matter?)
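
A rough sketch of the wrapper idea above, written in Rust rather than as a shell script. It assumes Cargo's RUSTC_WRAPPER convention, where the wrapper receives the path to the real rustc as its first argument. The `RUSTC_CFG_LOG` environment variable and the helper names `probe_args` and `split_invocation` are hypothetical, chosen only for this illustration; the default log name follows the comment above.

```rust
use std::env;
use std::fs::OpenOptions;
use std::io::Write;
use std::process::{exit, Command};

/// Arguments for the extra probe run: ask rustc which cfg values
/// `-C target-cpu=native` expands to on this machine.
fn probe_args() -> Vec<String> {
    ["-C", "target-cpu=native", "--print=cfg"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Split the wrapper's arguments (without argv[0]) into the real rustc
/// path and the arguments that should be forwarded to it unchanged.
fn split_invocation(args: &[String]) -> Option<(&String, &[String])> {
    args.split_first()
}

fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    let Some((rustc, rest)) = split_invocation(&args) else {
        eprintln!("usage: wrapper <path-to-rustc> [rustc args...]");
        return;
    };

    // Hypothetical override for the log location (an assumption of this sketch).
    let log_path = env::var("RUSTC_CFG_LOG").unwrap_or_else(|_| "rustc-cfg-log".to_string());

    // Append the probe output to the log; failures here are ignored so the
    // wrapper never breaks the build it is observing.
    if let Ok(out) = Command::new(rustc).args(probe_args()).output() {
        if let Ok(mut log) = OpenOptions::new().create(true).append(true).open(&log_path) {
            let _ = log.write_all(&out.stdout);
            let _ = log.write_all(b"----\n");
        }
    }

    // Proxy the original invocation and propagate its exit status.
    let status = Command::new(rustc)
        .args(rest)
        .status()
        .expect("failed to run rustc");
    exit(status.code().unwrap_or(1));
}
```

Compiled to a binary and exported through RUSTC_WRAPPER for both the instrumented build and the post-profiling build, this would append one cfg dump per rustc invocation, so the first and last entries of the log can be compared.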

@tmiasko
Contributor

tmiasko commented Feb 26, 2022

The global attributes obtained from -Ctarget-cpu=native and -Ctarget-feature= are applied only to functions that have some target feature attributes of their own:

if !function_features.is_empty() {
    let mut global_features = llvm_util::llvm_global_features(cx.tcx.sess);
    global_features.extend(function_features.into_iter());
    let features = global_features.join(",");
    let val = CString::new(features).unwrap();
    llvm::AddFunctionAttrStringValue(
        llfn,
        llvm::AttributePlace::Function,
        cstr!("target-features"),
        &val,
    );
}

The global attributes are still in effect, since target machine configuration includes them. At the same time the global attributes are not persisted with LLVM bitcode, so if loaded with a different set of target features the bitcode will be misinterpreted.

For example, I can reproduce this issue by:

  1. Enabling aes target feature during LLVM bitcode generation.
  2. Organizing code so that function using aes intrinsic is inlined into a function without any target features (attributes are compatible with aes being enabled globally)
  3. Not enabling aes target feature during machine code generation.
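
The code shape described in step 2 can be sketched roughly as follows. This is only an illustration and does not reproduce the bug by itself: the function names `aesenc_round` and `aes_round` are made up for this example, and the caller guards the call with a run-time check, whereas the scenario above relies on aes being enabled globally at compile time.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_aesenc_si128, _mm_loadu_si128, _mm_storeu_si128};

/// Callee: performs one AES encryption round via the AES-NI intrinsic.
/// It has a target feature attribute of its own, so it is the kind of
/// function that gets an explicit "target-features" string.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "aes")]
#[inline]
unsafe fn aesenc_round(state: [u8; 16], key: [u8; 16]) -> [u8; 16] {
    let s = _mm_loadu_si128(state.as_ptr() as *const __m128i);
    let k = _mm_loadu_si128(key.as_ptr() as *const __m128i);
    let r = _mm_aesenc_si128(s, k);
    let mut out = [0u8; 16];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    out
}

/// Caller: has no target feature attribute of its own. If the callee is
/// inlined here, the intrinsic ends up in a function whose ability to use
/// AES depends entirely on the target machine configuration.
pub fn aes_round(state: [u8; 16], key: [u8; 16]) -> Option<[u8; 16]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("aes") {
            // SAFETY: AES-NI support was just confirmed at run time.
            return Some(unsafe { aesenc_round(state, key) });
        }
    }
    // Reached on other architectures or on CPUs without AES-NI.
    let _ = (&state, &key);
    None
}

fn main() {
    match aes_round([0u8; 16], [0u8; 16]) {
        Some(out) => println!("aesenc output: {:02x?}", out),
        None => println!("AES-NI not available on this machine"),
    }
}
```

In the failing configuration, step 3 then corresponds to generating machine code for the caller with a target machine that does not have aes enabled, which leaves the backend unable to select an instruction for the intrinsic.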

cc @nagisa

@nagisa
Member

nagisa commented Feb 26, 2022

-Ctarget-cpu=native causes the features to be set based on the cpuid-reported features, but we also encode a guessed specific target-cpu name (in this case haswell) in the IR at

pub fn apply_target_cpu_attr<'ll>(cx: &CodegenCx<'ll, '_>, llfn: &'ll Value) {
    let target_cpu = SmallCStr::new(llvm_util::target_cpu(cx.tcx.sess));
    llvm::AddFunctionAttrStringValue(
        llfn,
        llvm::AttributePlace::Function,
        cstr!("target-cpu"),
        target_cpu.as_c_str(),
    );
}

and

// Always annotate functions with the target-cpu they are compiled for.
// Without this, ThinLTO won't inline Rust functions into Clang generated
// functions (because Clang annotates functions this way too).
apply_target_cpu_attr(cx, llfn);

I initially thought that perhaps the target-cpu per-function metadata may be overriding the target machine defaults implied by -Ctarget-cpu=native (in which case us setting the target-cpu function metadata would have been invalid, since target-cpu=haswell is not exactly target-cpu=native.) But that does not seem to be the case at all. Adding -mcpu=native to llc will cause the compilation to succeed.

In which case the only plausible explanation I have is that there's at least 1 specific point in the compilation process where the target machine isn't created with the native target-cpu, like it ought to. Given that the error occurs while linking I'm inclined to blame the linker invocation…? If you're using gcc or clang as a linker driver, maybe try passing -march=native to the linker invocation as well?

@tmiasko
Contributor

tmiasko commented Feb 26, 2022

While the detected CPU is encoded as a function attribute, the target machine configuration additionally includes the features returned by LLVMGetHostCPUFeatures. Those features are neither implied by CPU (in general) nor encoded.
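
One way to see which features come only from host detection would be to compare the `target_feature="..."` lines printed by `rustc --print=cfg` under `-C target-cpu=native` with those printed under the named CPU. The sketch below does that comparison on two captured outputs. The helper names are hypothetical, and the sample strings in `main` are made-up excerpts, not real rustc output for any particular CPU.

```rust
use std::collections::BTreeSet;

/// Collect the feature names from `target_feature="..."` lines of
/// `rustc --print=cfg` output, ignoring all other cfg lines.
fn target_features(cfg_output: &str) -> BTreeSet<String> {
    cfg_output
        .lines()
        .filter_map(|line| {
            line.trim()
                .strip_prefix("target_feature=\"")?
                .strip_suffix('"')
                .map(str::to_string)
        })
        .collect()
}

/// Features listed in `first` but missing from `second`, in sorted order.
fn only_in_first(first: &str, second: &str) -> Vec<String> {
    let a = target_features(first);
    let b = target_features(second);
    a.difference(&b).cloned().collect()
}

fn main() {
    // Hypothetical excerpts for illustration only.
    let native = "target_arch=\"x86_64\"\ntarget_feature=\"aes\"\ntarget_feature=\"avx2\"\n";
    let named = "target_arch=\"x86_64\"\ntarget_feature=\"avx2\"\n";
    for feature in only_in_first(native, named) {
        println!("only enabled with target-cpu=native: {feature}");
    }
}
```

Any feature reported this way would be one that is in effect during bitcode generation but is not recoverable from the "target-cpu" function attribute alone.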

@tmiasko
Contributor

tmiasko commented Mar 4, 2022

This should be fixed by #94579. @smsxgli it would be great if you could confirm if this fixes the issue for you as well. The toolchain containing the fix can be installed with rustup-toolchain-install-master tool:

$ rustup-toolchain-install-master cdc3d6a5be595fadb021df096cfc2149de5b686e 
$ cargo +cdc3d6a5be595fadb021df096cfc2149de5b686e build ...
$ rustc +cdc3d6a5be595fadb021df096cfc2149de5b686e ...

@smsxgli
Author

smsxgli commented Mar 4, 2022

Thanks! I will try it later.

@workingjubilee workingjubilee added the A-target-feature Area: Enabling/disabling target features like AVX, Neon, etc. label Mar 4, 2023
@workingjubilee
Member

I am closing this issue on the assumption that it is resolved.
