Skip to content

Python 3.13 beta 2 build fails on Windows when using both --experimental-jit and --disable-gil #120326

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
cdgriffith opened this issue Jun 10, 2024 · 9 comments
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes build The build process and cross-build OS-windows topic-free-threading topic-JIT type-bug An unexpected behavior, bug, or error

Comments

@cdgriffith
Copy link
Contributor

cdgriffith commented Jun 10, 2024

Bug report

Bug description:

PCbuild\build.bat --experimental-jit --disable-gil

commit: 14ff4c9

Repeating a lot of the same error:

C:\Users\redacted\Projects\cpython\Include\object.h(249,11): error G10448251: call to undeclared library function '__readgsqword' with type 'unsigned long long (unsigned long)'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] [C:\Users\redacted\Projects\cpython\PCbuild\pythoncore.vcxproj]

build_output.txt

Both --experimental-jit and --disable-gil work on their own.

CPython versions tested on:

3.13

Operating systems tested on:

Windows

Linked PRs

@cdgriffith cdgriffith added the type-bug An unexpected behavior, bug, or error label Jun 10, 2024
@Eclips4 Eclips4 added build The build process and cross-build topic-free-threading topic-JIT labels Jun 10, 2024
@Eclips4
Copy link
Member

Eclips4 commented Jun 10, 2024

Thanks for the report!
Confirmed on current main and 3.13 branch.

@Eclips4 Eclips4 added OS-windows 3.13 bugs and security fixes 3.14 bugs and security fixes labels Jun 10, 2024
@brandtbucher
Copy link
Member

brandtbucher commented Jun 10, 2024

This can probably be fixed by including the header where that function is declared in Tools/jit/template.c (and possibly Python/jit.c).

If it's some sort of compiler intrinsic that isn't supported by Clang, then we might need to update _Py_ThreadId (or whatever it is that's using these intrinsics) to check for _PyJIT_ACTIVE, which is only defined in jitted templates.

Separately, I'm wondering if we should expand the JIT test matrix to include more free-threaded jobs. It seems like we have a lot of missed coverage here, unfortunately.

@brandtbucher
Copy link
Member

brandtbucher commented Jun 10, 2024

It seems like intrin.h could be what we're looking for on Windows. Anyone want to give it a shot?

@mdboom
Copy link
Contributor

mdboom commented Jun 11, 2024

Separately, I'm wondering if we should expand the JIT test matrix to include more free-threaded jobs. It seems like we have a lot of missed coverage here, unfortunately.

Alternatively, we could consider this combination unsupported for now (and explicitly raise a build-time error). Obviously eventually we want them to play nicely together, but maybe that would be a better experience in the short term until we have all of the CI, testing and fixes in place.

@zooba
Copy link
Member

zooba commented Jun 11, 2024

(and explicitly raise a build-time error)

I'd just print a warning. If we add a hard block, someone has to remember to remove it before anyone can test.

@brandtbucher
Copy link
Member

We already print a warning telling people to report errors like this. The JIT is thread-safe, and I’d like to keep it that way. Breaking the build would make it much harder to bisect where new errors were introduced down the line.

We already do a debug free-threaded JIT build for Linux in CI. Maybe we should just add others for Windows and macOS (they don’t run on most PRs).

@brandtbucher
Copy link
Member

But right now we have the next best thing, which is using beta testers as buildbots. ;)

@vstinner
Copy link
Member

On x86 (32-bit), the clang error message is: call to undeclared library function __readfsdword

  In file included from C:\victor\python\main\Tools\jit\template.c:1:
  In file included from C:\victor\python\main\Include\Python.h:63:
C:\victor\python\main\Include\object.h(184,11): error G10448251: call to undeclared library function '__readf 
sdword' with type 'unsigned long (unsigned long)'; ISO C99 and later do not support implicit function declara 
tions [-Wimplicit-function-declaration] [C:\victor\python\main\PCbuild\pythoncore.vcxproj]
    184 |     tid = __readfsdword(24);
        |           ^
  C:\victor\python\main\Include\object.h:184:11: note: include the header <intrin.h> or explicitly provide a  
  declaration for '__readfsdword'

On x64 (64-bit), the clang error message is: call to undeclared library function __readgsqword.

  In file included from C:\victor\python\main\Tools\jit\template.c:1:
  In file included from C:\victor\python\main\Include\Python.h:63:
C:\victor\python\main\Include\object.h(182,11): error G10448251: call to undeclared library function '__readg 
sqword' with type 'unsigned long long (unsigned long)'; ISO C99 and later do not support implicit function de 
clarations [-Wimplicit-function-declaration] [C:\victor\python\main\PCbuild\pythoncore.vcxproj]
    182 |     tid = __readgsqword(48);
        |           ^
  C:\victor\python\main\Include\object.h:182:11: note: include the header <intrin.h> or explicitly provide a  
  declaration for '__readgsqword'

clang was modified recently to fail to build on implicit function declaration (-Wimplicit-function-declaration). GCC will do the same change soon, context: https://fedoraproject.org/wiki/Changes/PortingToModernC#Summary

@vstinner
Copy link
Member

It seems like intrin.h could be what we're looking for on Windows. Anyone want to give it a shot?

clang is more strict than MSC when it comes to implicit function declaration.

The missing header was added by the change 939c201.

Since this change was merged in the main branch, the "Free-threading Windows (x86)" job failed 3 times on .\python.bat -m test.pythoninfo: the command fails with an exit code 1 without any output, whereas it runs for around 22 seconds. The test failure rate is around 3/18 (16%) which is low.

I spent two days trying to reproduce the issue locally (on my Windows 11 VM) and on the CI (using a test PR), but I failed to trigger the bug!?

I decided to backport the change on the 3.13 branch. The fix is needed, implicit function declaration is a bad issue. Maybe the 3.13 backport will make the issue more likely and so ease its debug. Or the bug will go ahead. I don't know.

For now, I close the issue. If someone sees the bug again, please reopen the issue.

On my test PR, I ran the job between 15 and 20 times in a row and all jobs succeeded!?

vstinner pushed a commit that referenced this issue Jun 12, 2024
…H-120329) (#120414)

gh-120326: Include <intrin.h> on Windows with Free Threading (GH-120329)
(cherry picked from commit 939c201)

Co-authored-by: Kirill Podoprigora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes build The build process and cross-build OS-windows topic-free-threading topic-JIT type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

6 participants