From 3a9c89075a0080343a75829aea7f07ecf1444871 Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Thu, 20 Jun 2024 20:39:44 -0700 Subject: [PATCH 1/9] Add link to slides and clarification about LLVM dep --- peps/pep-0744.rst | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index da5d89cdfc8..6ae3ed93764 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -1,6 +1,7 @@ PEP: 744 Title: JIT Compilation -Author: Brandt Bucher +Author: Brandt Bucher , + Savannah Ostrowski , Discussions-To: https://discuss.python.org/t/pep-744-jit-compilation/50756 Status: Draft Type: Informational @@ -33,7 +34,8 @@ the following resources: JIT at the 2023 CPython Core Developer Sprint. It includes relevant background, a light technical introduction to the "copy-and-patch" technique used, and an open discussion of its design amongst the core developers - present. + present. Slides for this talk can be found on `GitHub + `__. - The `open access paper `__ originally describing copy-and-patch. @@ -534,6 +536,12 @@ executable. These issues are no longer present in the current design. Dependencies ------------ +At the time of writing, the JIT has a build-time dependency on LLVM. LLVM +is used to compile individual micro-op instructions into blobs of machine code, +which are then linked together to form the JIT's templates. These templates are +used to build CPython itself. The JIT has no runtime dependency on LLVM and is +therefore not at all exposed as a dependency to end users. + Building the JIT adds between 3 and 60 seconds to the build process, depending on platform. It is only rebuilt whenever the generated files become out-of-date, so only those who are actively developing the main interpreter loop will be From 9986b55f93f8cf64e7d4780bf6d4b0be584b246a Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Thu, 20 Jun 2024 20:50:58 -0700 Subject: [PATCH 2/9] Fix line break --- peps/pep-0744.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index 6ae3ed93764..413ea8fcd0b 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -34,8 +34,7 @@ the following resources: JIT at the 2023 CPython Core Developer Sprint. It includes relevant background, a light technical introduction to the "copy-and-patch" technique used, and an open discussion of its design amongst the core developers - present. Slides for this talk can be found on `GitHub - `__. + present. Slides for this talk can be found on `GitHub `__. - The `open access paper `__ originally describing copy-and-patch. From 18f8c76183720e69a66c13aadb9b02c18ed744df Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Thu, 11 Jul 2024 14:47:19 -0700 Subject: [PATCH 3/9] Address comments from Brandt --- .github/CODEOWNERS | 2 +- peps/pep-0744.rst | 38 ++++++++++++++++++++++++++++++++------ 2 files changed, 33 insertions(+), 7 deletions(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 06a7a38b565..36422a2e4f8 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -622,7 +622,7 @@ peps/pep-0740.rst @dstufft peps/pep-0741.rst @vstinner peps/pep-0742.rst @JelleZijlstra peps/pep-0743.rst @vstinner -peps/pep-0744.rst @brandtbucher +peps/pep-0744.rst @brandtbucher @savannahostrowski peps/pep-0745.rst @hugovk peps/pep-0746.rst @JelleZijlstra peps/pep-0749.rst @JelleZijlstra diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index 413ea8fcd0b..d4526d48715 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -133,12 +133,20 @@ Like the rest of the interpreter, the JIT compiler is generated at build time, and has no runtime dependencies. It supports a wide range of platforms (see the `Support`_ section below), and has comparatively low maintenance burden. In all, the current implementation is made up of about 900 lines of build-time Python -code and 500 lines of runtime C code. +code and 500 lines of runtime C code. The current approach is optimized for +runtime compilation and would likely not work well for ahead-of-time compilation. + This is not a direction currently being explored or pursued. Specification ============= -The JIT will become non-experimental once all of the following conditions are +Before discussing the requirements that need to be met before the JIT can be +considered non-experimental, it is important to clarify that it will always be +possible to build CPython without the JIT. The JIT is currently not part of the +default build configuration, and it is likely to remain that way for the +forseeable future (though official binaries may include it). + +That said, the JIT will become non-experimental once all of the following conditions are met: #. It provides a meaningful performance improvement for at least one popular @@ -501,9 +509,22 @@ Currently, the JIT is `about as fast as the existing specializing interpreter `__ on most platforms. Improving this is obviously a top priority at this point, since providing a significant performance gain is the entire motivation for -having a JIT at all. A number of proposed improvements are already underway, and -this ongoing work is being tracked in `GH-115802 -`__. +having a JIT at all. + +Presently, there are a both higher-level and lower-level optimizations being +explored which may improve overall speed. At a higher-level (i.e. at the Python +level), optimizations, like removing type checks, propagating constants and +inlining functions, are considered when it can be proven that it is safe to do +so. At this level, there is also opportunity for reasoning across micro-ops and +identifying opportunities for replacing micro-ops with more efficient equivalents. +Examples of this include smaller optimization like skipping reference counting +for known immortal values, as well as larger optimizations like replacing global +loads with constants if it can be proven that they have not been modified. At a +lower-level (i.e. machine code stage), it's possible to optimize across micro-ops +when LLVM is building JIT stencils by creating "superinstructions" from common +sequences from common pairs or triples of micro-op instructions. + +Ongoing work is being tracked in `GH-115802 `__. Memory ------ @@ -525,7 +546,12 @@ likely to be a real concern. Not much effort has been put into optimizing the JIT's memory usage yet, so these numbers likely represent a maximum that will be reduced over time. Improving this is a medium priority, and is being tracked in `GH-116017 -`__. +`__. However, we are currently +exploring how garbage collection of cold traces (i.e. traces that are no longer +executed) could be implemented, which could reduce the JIT's memory usage. We may +also consider exposing configurable parameters for limiting memory consumption in the +future. No official APIs will be exposed until the JIT meets the requirements to be +considering non-experimental. Earlier versions of the JIT had a more complicated memory allocation scheme which imposed a number of fragile limitations on the size and layout of the From cc2fd831170bca4492aa9c85766c75a8a201f8a8 Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Thu, 11 Jul 2024 14:51:13 -0700 Subject: [PATCH 4/9] fix indentation --- peps/pep-0744.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index d4526d48715..c634b3d32b7 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -134,8 +134,8 @@ and has no runtime dependencies. It supports a wide range of platforms (see the `Support`_ section below), and has comparatively low maintenance burden. In all, the current implementation is made up of about 900 lines of build-time Python code and 500 lines of runtime C code. The current approach is optimized for -runtime compilation and would likely not work well for ahead-of-time compilation. - This is not a direction currently being explored or pursued. +runtime compilation and would likely not work well for ahead-of-time compilation. +This is not a direction currently being explored or pursued. Specification ============= From 0712d868f64d63b5aa7a3f8b5db090b45e0f27c0 Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Thu, 11 Jul 2024 16:43:58 -0700 Subject: [PATCH 5/9] Update pep-0744.rst Co-authored-by: Jelle Zijlstra --- peps/pep-0744.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index c634b3d32b7..41ec0dc7522 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -144,7 +144,7 @@ Before discussing the requirements that need to be met before the JIT can be considered non-experimental, it is important to clarify that it will always be possible to build CPython without the JIT. The JIT is currently not part of the default build configuration, and it is likely to remain that way for the -forseeable future (though official binaries may include it). +foreseeable future (though official binaries may include it). That said, the JIT will become non-experimental once all of the following conditions are met: From 4cad5c4bcddf8efa720b802ec66a7e7081db4348 Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Thu, 11 Jul 2024 16:44:04 -0700 Subject: [PATCH 6/9] Update pep-0744.rst Co-authored-by: Jelle Zijlstra --- peps/pep-0744.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index 41ec0dc7522..5d50fafa929 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -517,7 +517,7 @@ level), optimizations, like removing type checks, propagating constants and inlining functions, are considered when it can be proven that it is safe to do so. At this level, there is also opportunity for reasoning across micro-ops and identifying opportunities for replacing micro-ops with more efficient equivalents. -Examples of this include smaller optimization like skipping reference counting +Examples of this include smaller optimizations like skipping reference counting for known immortal values, as well as larger optimizations like replacing global loads with constants if it can be proven that they have not been modified. At a lower-level (i.e. machine code stage), it's possible to optimize across micro-ops From 22ccda97a4b190432ec488628e0833b024ed1814 Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Fri, 9 Aug 2024 04:09:58 +0000 Subject: [PATCH 7/9] address comments --- peps/pep-0744.rst | 50 +++++++++++++---------------------------------- 1 file changed, 14 insertions(+), 36 deletions(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index 5d50fafa929..67b047e809a 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -99,10 +99,11 @@ physical hardware registers). Since much of this data varies even between identical runs of a program and the existing optimization pipeline makes heavy use of runtime profiling information, -it doesn't make much sense to compile these traces ahead of time. As has been -demonstrated for many other dynamic languages (`and even Python itself -`__), the most promising approach is to compile the -optimized micro-ops "just in time" for execution. +it doesn't make much sense to compile these traces ahead of time and would be a +substantial redesign of the existing specification and micro-op tracing infrastructure +that has already been implemented. As has been demonstrated for many other dynamic +languages (`and even Python itself `__), the most promising +approach is to compile the optimized micro-ops "just in time" for execution. Rationale ========= @@ -133,21 +134,15 @@ Like the rest of the interpreter, the JIT compiler is generated at build time, and has no runtime dependencies. It supports a wide range of platforms (see the `Support`_ section below), and has comparatively low maintenance burden. In all, the current implementation is made up of about 900 lines of build-time Python -code and 500 lines of runtime C code. The current approach is optimized for -runtime compilation and would likely not work well for ahead-of-time compilation. -This is not a direction currently being explored or pursued. +code and 500 lines of runtime C code. Specification ============= -Before discussing the requirements that need to be met before the JIT can be -considered non-experimental, it is important to clarify that it will always be -possible to build CPython without the JIT. The JIT is currently not part of the -default build configuration, and it is likely to remain that way for the -foreseeable future (though official binaries may include it). - -That said, the JIT will become non-experimental once all of the following conditions are -met: +The JIT is currently not part of the default build configuration, and it is +likely to remain that way for the foreseeable future (though official binaries +may include it). That said, the JIT will become non-experimental once all of +the following conditions are met: #. It provides a meaningful performance improvement for at least one popular platform (realistically, on the order of 5%). @@ -511,21 +506,6 @@ on most platforms. Improving this is obviously a top priority at this point, since providing a significant performance gain is the entire motivation for having a JIT at all. -Presently, there are a both higher-level and lower-level optimizations being -explored which may improve overall speed. At a higher-level (i.e. at the Python -level), optimizations, like removing type checks, propagating constants and -inlining functions, are considered when it can be proven that it is safe to do -so. At this level, there is also opportunity for reasoning across micro-ops and -identifying opportunities for replacing micro-ops with more efficient equivalents. -Examples of this include smaller optimizations like skipping reference counting -for known immortal values, as well as larger optimizations like replacing global -loads with constants if it can be proven that they have not been modified. At a -lower-level (i.e. machine code stage), it's possible to optimize across micro-ops -when LLVM is building JIT stencils by creating "superinstructions" from common -sequences from common pairs or triples of micro-op instructions. - -Ongoing work is being tracked in `GH-115802 `__. - Memory ------ @@ -546,12 +526,10 @@ likely to be a real concern. Not much effort has been put into optimizing the JIT's memory usage yet, so these numbers likely represent a maximum that will be reduced over time. Improving this is a medium priority, and is being tracked in `GH-116017 -`__. However, we are currently -exploring how garbage collection of cold traces (i.e. traces that are no longer -executed) could be implemented, which could reduce the JIT's memory usage. We may -also consider exposing configurable parameters for limiting memory consumption in the -future. No official APIs will be exposed until the JIT meets the requirements to be -considering non-experimental. +`__. We may consider +exposing configurable parameters for limiting memory consumption in the +future, but no official APIs will be exposed until the JIT meets the +requirements to be considered non-experimental. Earlier versions of the JIT had a more complicated memory allocation scheme which imposed a number of fragile limitations on the size and layout of the From f304c9f38d90f9cf75bc20c2eddc6fadca5c7d87 Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Fri, 9 Aug 2024 04:11:07 +0000 Subject: [PATCH 8/9] add back accidental deleted line --- peps/pep-0744.rst | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index 67b047e809a..5c773701fa4 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -504,7 +504,9 @@ Currently, the JIT is `about as fast as the existing specializing interpreter `__ on most platforms. Improving this is obviously a top priority at this point, since providing a significant performance gain is the entire motivation for -having a JIT at all. +having a JIT at all. A number of proposed improvements are already underway, and +this ongoing work is being tracked in `GH-115802 +` Memory ------ From d9e759301bf9b95a531cb575aed9080a43952fef Mon Sep 17 00:00:00 2001 From: Savannah Ostrowski Date: Fri, 9 Aug 2024 04:11:42 +0000 Subject: [PATCH 9/9] add back underscores --- peps/pep-0744.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0744.rst b/peps/pep-0744.rst index 5c773701fa4..3825b61d874 100644 --- a/peps/pep-0744.rst +++ b/peps/pep-0744.rst @@ -506,7 +506,7 @@ on most platforms. Improving this is obviously a top priority at this point, since providing a significant performance gain is the entire motivation for having a JIT at all. A number of proposed improvements are already underway, and this ongoing work is being tracked in `GH-115802 -` +`__. Memory ------