internal/cpu: VEX prefixed instructions require OSXSAVE #41022
Comments
Any useful code that uses FMA with ymm registers also needs to check for AVX (e.g. for mov instructions to ymm registers). HasAVX is only true if HasOSXSAVE is true. Example from standard library: Line 11 in 03ef105
I think FMA can also be used with xmm registers on Windows Vista, which does not require HasOSXSAVE (but I don't have a machine to check). In any case, a check for SSE/SSE2 would also be needed here. |
Another example from standard library: Line 520 in 758ac37
go/src/cmd/compile/internal/gc/ssa.go Lines 3584 to 3591 in 758ac37
I managed to run the latest Go releases (1.14/1.15) on unsupported Windows versions (XP/Vista) and found that math.FMA crashed with STATUS_ILLEGAL_INSTRUCTION. |
The above code uses FMA with an xmm register (which does not require OSXSAVE), not with a ymm (AVX) register, and is only used by amd64, which has SSE2 and xmm register support as a requirement. It should therefore be fine unless Vista doesn't support SSE2, which would cause other problems unrelated to FMA. go/src/cmd/compile/internal/gc/ssa.go Line 3617 in 758ac37
Note that Windows XP/Vista is not supported by Go 1.14 and 1.15: To understand the actual problem, rather than the proposed solution, please give the following information:
|
Environment
CPU-Z reported that FMA is available to the guest, but old Windows versions do not support it. See also https://support.sisoftware.co.uk/knowledgebase.php?article=70. I understand that my use case is not supported. I found that Windows 7 and newer do not have any problem. stack trace
go env
|
FMA can be used without AVX registers, which as far as I understand does not need OSXSAVE. So I think requiring OSXSAVE here is not the right fix, as it would prevent code from using FMA with xmm (SSE) registers when the OS does not support saving ymm (AVX) registers. To understand what's happening, I think we need to disassemble go1.14.7 math_test.TestFMA and see which part of the instruction stream is the problem. Maybe VMware pretends FMA is supported while it isn't, but it doesn't seem to be an issue with OSXSAVE if this reproducibly faults at the same PC. Please run these two commands in /go1.14.7/src/math/ and post the output: |
disasm
|
According to Intel's manual, |
Thanks for all the information. I think we are on to something, but it seems related to general SSE/SSE2 support and checking that it works at all (not just FMA). What is the state of CR4.OSFXSR on Vista? (Note that OSFXSR is not OSXSAVE.) The minimum requirement for Go on amd64 is SSE2, so if SSE/SSE2 is not supported then yes, Go does not work (with or without FMA). MOVUPS The action I would then see here is to check CR4.OSFXSR as well as the SSE/SSE2 CPUID bits at Go runtime start, and if they are not set, to stop right there with a warning message (similar to MMX on 386), independent of whether FMA is used or not. |
It seems reading CR4 is privileged, so Go likely won't be able to read it. There seems to be a Windows API for this: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getenabledxstatefeatures If there is no easy way to check from Go code whether Windows supports SSE, and Go works on all supported Windows versions, I don't think there will be a fix to prevent Go from starting. The issue is not related to FMA and could trigger in other Go code too. SSE/SSE2 itself is supported by all amd64-compatible CPUs, so on amd64 it can only be the OS that doesn't support them. |
Windows Server 2008 x64
Windows Server 2008 R2 x64
|
But what is CR4 on Vista? In any case, Go on amd64 is not supported when SSE and SSE2 are not supported. Adding more checks to FMA won't change that, and the FMA instruction wasn't at fault here. |
AFAIK, Vista and Server 2008 (without R2 suffix) share the same(?) kernel. I can check CR4 on Vista if you insist. I think the behavior of MOVUPS (0F 10) on my hardware does not match Intel's manual. Somehow it generated #UD when OSXSAVE=0. |
I don't think there is anything that Go could do better here. OSXSAVE is not related to MOVUPS. OSXSAVE is only required for AVX. MOVUPS needs to be supported for Go to work, because Go on amd64 requires SSE/SSE2 support from both the CPU and the OS. |
I think I misread something. I tried a simple C program with
I need to minimize the reproduction of TestFMA's failure. |
The instruction from the dump above that is faulting is: This uses SSE registers, so SSE (required) + FMA (HasFMA) should be enough. There are no AVX registers involved that would require OSXSAVE. |
I reproduced this issue with
Debugged with x64dbg, stopped at
While go tool objdump shows
FMA fast path should not be used when OSXSAVE=0. |
But why should it not be used without OSXSAVE? The FMA instruction above uses xmm registers, which must be supported, otherwise other SSE/SSE2 instructions wouldn't work either, and the CPU says it supports FMA. Requiring OSXSAVE will just mask the fact that Vista doesn't support FMA even if the FMA CPUID bit is set to 1 and SSE is supported. I wasn't able to find any documentation that requires OSXSAVE for FMA with xmm registers. I think it is perfectly fine for the OS to store/restore xmm registers with FXSAVE and FXRSTOR, which needs to be done for other SSE instructions anyway. |
This behavior is documented in Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2. See This instruction is using VEX-prefix, if I didn't misread, and requires CR4.OSXSAVE=1. |
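To illustrate the encoding difference under discussion, here is a minimal Go sketch that checks whether an instruction's bytes start with a VEX prefix (0xC4 or 0xC5 in 64-bit mode). The helper name `hasVEXPrefix` is hypothetical, and the byte sequences are hand-assembled examples, not the actual faulting bytes from the dump, which are not preserved in this thread. The check is deliberately simplified: it ignores any legacy prefixes that may precede the VEX prefix and is only meaningful for 64-bit code.

```go
package main

import "fmt"

// hasVEXPrefix reports whether the encoding starts with a 2-byte (0xC5)
// or 3-byte (0xC4) VEX prefix. Simplified: assumes 64-bit mode and no
// preceding legacy prefixes. Hypothetical helper for illustration only.
func hasVEXPrefix(code []byte) bool {
	if len(code) == 0 {
		return false
	}
	return code[0] == 0xC4 || code[0] == 0xC5
}

func main() {
	// Assumed example: vfmadd231sd xmm0, xmm1, xmm2 (VEX-encoded FMA on xmm registers).
	fma := []byte{0xC4, 0xE2, 0xF1, 0xB9, 0xC2}
	// Assumed example: movups xmm0, xmm1 (legacy SSE encoding, opcode 0F 10).
	movups := []byte{0x0F, 0x10, 0xC1}

	fmt.Println(hasVEXPrefix(fma))    // true
	fmt.Println(hasVEXPrefix(movups)) // false
}
```

The point of the sketch is that an FMA instruction operating only on xmm registers is still VEX-encoded, whereas MOVUPS in its legacy SSE form is not.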
You are right! (Thanks for indulging my questions.) The issue is the VEX prefix in 64-bit and protected mode. So I guess we need to guard all VEX-prefixed instruction set extensions with an OSXSAVE check, so that they are not reported as available when OSXSAVE is unset. I think this will apply to other instruction set additions like BMI too. The specific issue here is in internal/cpu, and it needs to be fixed in x/sys/cpu too. |
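The guard proposed here could be sketched roughly as follows. This is an illustrative sketch, not the actual internal/cpu code: the `features` type, the `detect` function, and the constant names are hypothetical, and the CPUID leaf 1 ECX value and the low 32 bits of XCR0 are passed in as plain integers so the logic can run anywhere. The bit positions (FMA = bit 12, OSXSAVE = bit 27, AVX = bit 28 of CPUID.1:ECX; XCR0 bits 1 and 2 for SSE and AVX state) follow Intel's manual.

```go
package main

import "fmt"

// CPUID leaf 1, ECX feature bits.
const (
	cpuidFMA     = 1 << 12
	cpuidOSXSAVE = 1 << 27
	cpuidAVX     = 1 << 28
)

// features is a hypothetical stand-in for the flags in internal/cpu.
type features struct {
	HasOSXSAVE bool
	HasAVX     bool
	HasFMA     bool
}

// detect derives feature flags from CPUID.1:ECX and the low half of XCR0.
// xcr0 is only meaningful when OSXSAVE is set (XGETBV faults otherwise).
func detect(ecx1, xcr0 uint32) features {
	var f features
	f.HasOSXSAVE = ecx1&cpuidOSXSAVE != 0

	// AVX additionally needs the OS to have enabled XMM and YMM state.
	osSupportsAVX := f.HasOSXSAVE && xcr0&0x6 == 0x6
	f.HasAVX = ecx1&cpuidAVX != 0 && osSupportsAVX

	// FMA is VEX-encoded, so gate it on OSXSAVE even for xmm-only use.
	f.HasFMA = ecx1&cpuidFMA != 0 && f.HasOSXSAVE
	return f
}

func main() {
	// CPU advertises FMA and AVX, but the OS never set CR4.OSXSAVE
	// (the Vista-like case from this issue).
	fmt.Printf("%+v\n", detect(cpuidFMA|cpuidAVX, 0))
	// {HasOSXSAVE:false HasAVX:false HasFMA:false}

	// Same CPU on an OS that enables XSAVE and YMM state.
	fmt.Printf("%+v\n", detect(cpuidFMA|cpuidAVX|cpuidOSXSAVE, 0x7))
	// {HasOSXSAVE:true HasAVX:true HasFMA:true}
}
```

With this gating, the FMA fast path would be skipped on a system that reports the FMA CPUID bit but leaves OSXSAVE unset, which is the expected behavior stated in this issue.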
FYI, I created another issue for the incorrect objdump of FMA instruction. #41043 |
Change https://golang.org/cl/274479 mentions this issue: |
What did you expect to see?
HasFMA should report false on operating systems that do not support XSAVE (HasOSXSAVE=false).
What did you see instead?
HasFMA=true on Windows Vista.