-
-
Notifications
You must be signed in to change notification settings - Fork 32.9k
bpo-27129: Use instruction offsets, not byte offsets, in bytecode and internally. #25069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-27129: Use instruction offsets, not byte offsets, in bytecode and internally. #25069
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing: I'd swap the two PyCode_Addr2Line
's in Python/ceval.c
with PyFrame_GetLineNumber
for improved readability. (Unless the extra function call impacts performance.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh... I didn't notice that when the wordcode work has been done. It's unfortunate that we didn't switch to instruction offset. I vaguely recall discussion about sizeof(_Py_CODEUNIT).
I support this change, here are some comments. My main worry is about PyCode_Addr2Line().
Python/traceback.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a strange comment :-D I read it as "oh, Python 2.7 is too young, let's wait a little bit longer before we can use this shiny new function" !?
Objects/codeobject.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to treat the second argument as an instruction offset, rather than a byte offset. Rather than changing PyCode_Addr2Line() calls. It sounds unfortunate that the API allows to ask the address in the middle of a word (2 bytes).
It sounds common to pass f_lasti to PyCode_Addr2Line(): PyCode_Addr2Line(code, f_lasti) would still work, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on how you think the API is defined.
If you want the the line number from an offset derived from code.co_linetable
, then the API should be unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyCode_Addr2Line(code, offset)
must take a byte offset.
To getf_lasti
from the frame, the only API to do so is something like PyObject_GetAttrString(frame, "f_lasti")
, which would return a byte offset.
Also see:
https://www.python.org/dev/peps/pep-0626/#c-api
https://www.python.org/dev/peps/pep-0626/#out-of-process-debuggers-and-profilers
I wanted to ask to document properly that PyCode_Addr2Line() uses a byte offset, but the function is not documented! Can you please document the function at: Please document f_lasti unit and the change at: https://docs.python.org/dev/library/inspect.html |
The comment on There are no changes to the inspect module. |
It's part of the public C API, so people uses it (for good or bad reasons). A search on the top 4000 PyPI projects give me:
Hopefully, the list is short. The documentation should describes the behavior, and it's ok to strongly advice to use another function instead ;-) |
Lib/dis.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not go a step further and divide the labels in dis output by 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of reasons.
- I wanted to keep changes to a minimum.
- There is no guarantee that instructions will remain the same size, so using byte offsets keeps things more consistent across versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. However now we have a confusing discrepancy between what's displayed and how we're encouraged to think about the bytecode in the C code. I guess there's no perfect answer, so status quo wins.
@vstinner I'm fine with documenting |
…rpreter dispatch a bit, and removes most EXTENDED_ARGs for jumps.
79b3b3b
to
325d135
Compare
You should be allowed to click on the CI to "Re-run all jobs" to avoid closing/reopening the PR. Sadly, it's not possible to only re-run a single job. |
) * Update magic numbers and bootstrapping for GH-25069 * add blurb Co-authored-by: Terry Jan Reedy <[email protected]>
Fixup Numba to address the changes introduced in cpython for Python 3.10 at python/cpython#25069 . But for Numba, we have to alter the patch slightly as Numba rewrites the bytecode to use a `_FIXED_OFFSET` with a value of `2`.
Fixup Numba to address the changes introduced in cpython for Python 3.10 at python/cpython#25069 . But for Numba, we have to alter the patch slightly as Numba rewrites the bytecode to use a `_FIXED_OFFSET` with a value of `2`.
Fixup Numba to address the changes introduced in cpython for Python 3.10 at python/cpython#25069 . But for Numba, we have to alter the patch slightly as Numba rewrites the bytecode to use a `_FIXED_OFFSET` with a value of `2`.
Fixup Numba to address the changes introduced in cpython for Python 3.10 at python/cpython#25069 . But for Numba, we have to alter the patch slightly as Numba rewrites the bytecode to use a `_FIXED_OFFSET` with a value of `2`.
Fixup Numba to address the changes introduced in cpython for Python 3.10 at python/cpython#25069 . But for Numba, we have to alter the patch slightly as Numba rewrites the bytecode to use a `_FIXED_OFFSET` with a value of `2`.
f_lasti
to match, which saves a bit of shifting in the interpreter.https://bugs.python.org/issue27129