-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[compiler-rt] Add cpu model init for Windows. #111961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[compiler-rt] Add cpu model init for Windows. #111961
Conversation
You can test this locally with the following command:git-clang-format --diff 984bca9d1faaa1fa5c694f8f2a5524b2374d204a 90ed73d7073c2d2d68b2ec38e7bb10a7fa6a83c7 --extensions c,inc,h -- compiler-rt/lib/builtins/cpu_model/aarch64/fmv/windows.inc compiler-rt/lib/builtins/cpu_model/aarch64.c compiler-rt/lib/builtins/cpu_model/cpu_model.h View the diff from clang-format here.diff --git a/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/windows.inc b/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/windows.inc
index 2ca18242fb..cdaf39d3a0 100644
--- a/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/windows.inc
+++ b/compiler-rt/lib/builtins/cpu_model/aarch64/fmv/windows.inc
@@ -1,10 +1,10 @@
#define WIN32_LEAN_AND_MEAN
-#include <windows.h>
#include <processthreadsapi.h>
#include <stdint.h>
+#include <windows.h>
#ifndef PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE
-#define PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE 43
+#define PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE 43
#endif
#ifndef PF_ARM_V83_JSCVT_INSTRUCTIONS_AVAILABLE
#define PF_ARM_V83_JSCVT_INSTRUCTIONS_AVAILABLE 44
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly reasonable to me. It would be good to spell out aarch64 in the PR subject somewhere.
@@ -31,7 +31,15 @@ | |||
// We're choosing init priority 90 to force our constructors to run before any | |||
// constructors in the end user application (starting at priority 101). This | |||
// value matches the libgcc choice for the same functions. | |||
#define CONSTRUCTOR_ATTRIBUTE __attribute__((constructor(90))) | |||
#ifdef _WIN64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be more correct and generic to make this _WIN32
(which applies to both 32 and 64 bit windows)? I get that the current change is for aarch64 only, but this does seem to be a quite architecture independent file, and unless the distinction between 32 and 64 is needed (which it seldom is) I prefer using _WIN32
as define to look for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment still seems to be unresolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah sorry, fixed in eda3898.
@@ -0,0 +1,42 @@ | |||
#ifndef _ARM64_ | |||
#define _ARM64_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this define here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
winnt.h complains about it if not defined and clang doesn't define it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's interesting - I would generally expect it to work to include those headers as is, with the compiler default defines.
Can you boil that down to a standalone repro, with a file that just includes <windows.h>
or similar (I guess the relevant one here is <processthreadsapi.h>
)? Or is the case that <processthreadsapi.h>
is one of the Windows headers that you can't include directly unless you've included <windows.h>
first? In that case, I'd prefer including that first, rather than adding our own extra defines. (If you want to, you can define WIN32_LEAN_AND_MEAN
to reduce the overhead of including the whole <windows.h>
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
windows.h
defines it so let's do that.
setCPUFeature(FEAT_JSCVT); | ||
|
||
if (IsProcessorFeaturePresent(PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE)) | ||
setCPUFeature(FEAT_RCPC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do this unconditionally like this, we require a relatively new SDK for building. PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE
is only visible in WinSDK since 10.0.22621.0, and in mingw-w64 headers since mingw-w64/mingw-w64@4dd72e2 (since a year). If we enclose it, at least the last one, in an #ifdef PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE
, we wouldn't break people building with slightly older headers.
Looking forward, there's a bunch of more aarch64 feature constants that you can look for since WinSDK 10.0.26100, and quite recently in mingw-w64. They aren't documented in the public MS documentation yet, but they're mostly quite self-documenting - see mingw-w64/mingw-w64@f97814b for a list of them. If we look for those features, we definitely should wrap each of them in an #ifdef
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I didn't notice the headers change recently :D
Now we have a bunch of features, I refactored the code a bit.
#define CONSTRUCTOR_ATTRIBUTE __attribute__((constructor(90))) | ||
#ifdef _WIN64 | ||
// Contructor that replaces the ifunc runs currently with prio 10, see | ||
// the LowerIFuncPass. The resolver of FMV depends on the cpu features so set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wondered what this was about, if it was some feature in the compiler I was unaware of - but then I saw the other PR filed at the same time :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
#include <stdint.h> | ||
|
||
#ifndef PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE | ||
#define PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some reason not to just use the actual numbers from the SDK, instead of -1? The new numbers should work even if you're using an old SDK. And then we don't need to worry about accidentally reporting features are missing if someone builds with an old SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't really sure if it is a good idea or not. Thanks for convincing me.
No further comments from me, I'll let the others finish up the review in case they're ok with it too. |
I can see a few unused macro definitions, they don't seem to cause any harm. Other than that it looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments!
On Windows there is no platform support for ifunc but we could lower them to global function pointers. This also enables FMV for Windows with Clang and Compiler-rt. Depends on #111961
No description provided.