Commit 6876168

gh-111140: PyLong_From/AsNativeBytes: Take *flags* rather than just *endianness* (GH-116053)
1 parent abfa16b commit 6876168

6 files changed

+319
-88
lines changed

Doc/c-api/long.rst

+97 -45
@@ -113,24 +113,28 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
    retrieved from the resulting value using :c:func:`PyLong_AsVoidPtr`.
 
 
-.. c:function:: PyObject* PyLong_FromNativeBytes(const void* buffer, size_t n_bytes, int endianness)
+.. c:function:: PyObject* PyLong_FromNativeBytes(const void* buffer, size_t n_bytes, int flags)
 
    Create a Python integer from the value contained in the first *n_bytes* of
    *buffer*, interpreted as a two's-complement signed number.
 
-   *endianness* may be passed ``-1`` for the native endian that CPython was
-   compiled with, or else ``0`` for big endian and ``1`` for little.
+   *flags* are as for :c:func:`PyLong_AsNativeBytes`. Passing ``-1`` will select
+   the native endian that CPython was compiled with and assume that the
+   most-significant bit is a sign bit. Passing
+   ``Py_ASNATIVEBYTES_UNSIGNED_BUFFER`` will produce the same result as calling
+   :c:func:`PyLong_FromUnsignedNativeBytes`. Other flags are ignored.
 
    .. versionadded:: 3.13
 
 
-.. c:function:: PyObject* PyLong_FromUnsignedNativeBytes(const void* buffer, size_t n_bytes, int endianness)
+.. c:function:: PyObject* PyLong_FromUnsignedNativeBytes(const void* buffer, size_t n_bytes, int flags)
 
    Create a Python integer from the value contained in the first *n_bytes* of
    *buffer*, interpreted as an unsigned number.
 
-   *endianness* may be passed ``-1`` for the native endian that CPython was
-   compiled with, or else ``0`` for big endian and ``1`` for little.
+   *flags* are as for :c:func:`PyLong_AsNativeBytes`. Passing ``-1`` will select
+   the native endian that CPython was compiled with and assume that the
+   most-significant bit is not a sign bit. Flags other than endian are ignored.
 
    .. versionadded:: 3.13
 
@@ -354,14 +358,41 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
    Returns ``NULL`` on error. Use :c:func:`PyErr_Occurred` to disambiguate.
 
 
-.. c:function:: Py_ssize_t PyLong_AsNativeBytes(PyObject *pylong, void* buffer, Py_ssize_t n_bytes, int endianness)
+.. c:function:: Py_ssize_t PyLong_AsNativeBytes(PyObject *pylong, void* buffer, Py_ssize_t n_bytes, int flags)
 
-   Copy the Python integer value to a native *buffer* of size *n_bytes*::
+   Copy the Python integer value *pylong* to a native *buffer* of size
+   *n_bytes*. The *flags* can be set to ``-1`` to behave similarly to a C cast,
+   or to values documented below to control the behavior.
+
+   Returns ``-1`` with an exception raised on error. This may happen if
+   *pylong* cannot be interpreted as an integer, or if *pylong* was negative
+   and the ``Py_ASNATIVEBYTES_REJECT_NEGATIVE`` flag was set.
+
+   Otherwise, returns the number of bytes required to store the value.
+   If this is equal to or less than *n_bytes*, the entire value was copied.
+   All *n_bytes* of the buffer are written: large buffers are padded with
+   zeroes.
+
+   If the returned value is greater than *n_bytes*, the value was
+   truncated: as many of the lowest bits of the value as could fit are written,
+   and the higher bits are ignored. This matches the typical behavior
+   of a C-style downcast.
+
+   .. note::
+
+      Overflow is not considered an error. If the returned value
+      is larger than *n_bytes*, most significant bits were discarded.
+
+   ``0`` will never be returned.
+
+   Values are always copied as two's-complement.
+
+   Usage example::
 
       int32_t value;
       Py_ssize_t bytes = PyLong_AsNativeBytes(pylong, &value, sizeof(value), -1);
       if (bytes < 0) {
-          // A Python exception was set with the reason.
+          // Failed. A Python exception was set with the reason.
           return NULL;
       }
       else if (bytes <= (Py_ssize_t)sizeof(value)) {
@@ -372,19 +403,24 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
           // lowest bits of pylong.
       }
 
-   The above example may look *similar* to
-   :c:func:`PyLong_As* <PyLong_AsSize_t>`
-   but instead fills in a specific caller defined type and never raises an
-   error about of the :class:`int` *pylong*'s value regardless of *n_bytes*
-   or the returned byte count.
+   Passing zero to *n_bytes* will return the size of a buffer that would
+   be large enough to hold the value. This may be larger than technically
+   necessary, but not unreasonably so.
 
-   To get at the entire potentially big Python value, this can be used to
-   reserve enough space and copy it::
+   .. note::
+
+      Passing *n_bytes=0* to this function is not an accurate way to determine
+      the bit length of a value.
+
+   If *n_bytes=0*, *buffer* may be ``NULL``.
+
+   To get at the entire Python value of an unknown size, the function can be
+   called twice: first to determine the buffer size, then to fill it::
 
       // Ask how much space we need.
       Py_ssize_t expected = PyLong_AsNativeBytes(pylong, NULL, 0, -1);
       if (expected < 0) {
-          // A Python exception was set with the reason.
+          // Failed. A Python exception was set with the reason.
           return NULL;
       }
       assert(expected != 0);  // Impossible per the API definition.
@@ -395,11 +431,11 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
       }
       // Safely get the entire value.
       Py_ssize_t bytes = PyLong_AsNativeBytes(pylong, bignum, expected, -1);
-      if (bytes < 0) {  // Exception set.
+      if (bytes < 0) {  // Exception has been set.
           free(bignum);
           return NULL;
       }
-      else if (bytes > expected) {  // Be safe, should not be possible.
+      else if (bytes > expected) {  // This should not be possible.
           PyErr_SetString(PyExc_RuntimeError,
               "Unexpected bignum truncation after a size check.");
           free(bignum);
@@ -409,35 +445,51 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
       // ... use bignum ...
       free(bignum);
 
-   *endianness* may be passed ``-1`` for the native endian that CPython was
-   compiled with, or ``0`` for big endian and ``1`` for little.
-
-   Returns ``-1`` with an exception raised if *pylong* cannot be interpreted as
-   an integer. Otherwise, return the size of the buffer required to store the
-   value. If this is equal to or less than *n_bytes*, the entire value was
-   copied. ``0`` will never be returned.
-
-   Unless an exception is raised, all *n_bytes* of the buffer will always be
-   written. In the case of truncation, as many of the lowest bits of the value
-   as could fit are written. This allows the caller to ignore all non-negative
-   results if the intent is to match the typical behavior of a C-style
-   downcast. No exception is set on truncation.
-
-   Values are always copied as two's-complement and sufficient buffer will be
-   requested to include a sign bit. For example, this may cause an value that
-   fits into 8 bytes when treated as unsigned to request 9 bytes, even though
-   all eight bytes were copied into the buffer. What has been omitted is the
-   zero sign bit -- redundant if the caller's intention is to treat the value
-   as unsigned.
-
-   Passing zero to *n_bytes* will return the size of a buffer that would
-   be large enough to hold the value. This may be larger than technically
-   necessary, but not unreasonably so.
+   *flags* is either ``-1`` (``Py_ASNATIVEBYTES_DEFAULTS``) to select defaults
+   that behave most like a C cast, or a combination of the other flags in
+   the table below.
+   Note that ``-1`` cannot be combined with other flags.
+
+   Currently, ``-1`` corresponds to
+   ``Py_ASNATIVEBYTES_NATIVE_ENDIAN | Py_ASNATIVEBYTES_UNSIGNED_BUFFER``.
+
+   ============================================= ======
+   Flag                                          Value
+   ============================================= ======
+   .. c:macro:: Py_ASNATIVEBYTES_DEFAULTS        ``-1``
+   .. c:macro:: Py_ASNATIVEBYTES_BIG_ENDIAN      ``0``
+   .. c:macro:: Py_ASNATIVEBYTES_LITTLE_ENDIAN   ``1``
+   .. c:macro:: Py_ASNATIVEBYTES_NATIVE_ENDIAN   ``3``
+   .. c:macro:: Py_ASNATIVEBYTES_UNSIGNED_BUFFER ``4``
+   .. c:macro:: Py_ASNATIVEBYTES_REJECT_NEGATIVE ``8``
+   ============================================= ======
+
+   Specifying ``Py_ASNATIVEBYTES_NATIVE_ENDIAN`` will override any other endian
+   flags. Passing ``2`` is reserved.
+
+   By default, sufficient buffer will be requested to include a sign bit.
+   For example, when converting 128 with *n_bytes=1*, the function will return
+   2 (or more) in order to store a zero sign bit.
+
+   If ``Py_ASNATIVEBYTES_UNSIGNED_BUFFER`` is specified, a zero sign bit
+   will be omitted from size calculations. This allows, for example, 128 to fit
+   in a single-byte buffer. If the destination buffer is later treated as
+   signed, a positive input value may become negative.
+   Note that the flag does not affect handling of negative values: for those,
+   space for a sign bit is always requested.
+
+   Specifying ``Py_ASNATIVEBYTES_REJECT_NEGATIVE`` causes an exception to be set
+   if *pylong* is negative. Without this flag, negative values will be copied
+   provided there is enough space for at least one sign bit, regardless of
+   whether ``Py_ASNATIVEBYTES_UNSIGNED_BUFFER`` was specified.
 
    .. note::
 
-      Passing *n_bytes=0* to this function is not an accurate way to determine
-      the bit length of a value.
+      With the default *flags* (``-1``, or *UNSIGNED_BUFFER* without
+      *REJECT_NEGATIVE*), multiple Python integers can map to a single value
+      without overflow. For example, both ``255`` and ``-1`` fit a single-byte
+      buffer and set all its bits.
+      This matches typical C cast behavior.
 
    .. versionadded:: 3.13
 
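The closing note in this file says that ``255`` and ``-1`` both fit a single-byte buffer and set all its bits, and that truncation keeps the lowest bits like a C-style downcast. The sketch below shows only the plain C behavior the documentation compares against. It does not call the CPython API, and the names `low_byte_of`, `as_signed_byte` and `demo` are hypothetical helpers introduced here for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: keep only the lowest 8 bits of a wider value.
   Conversion to an unsigned type is defined as reduction modulo 256. */
static uint8_t low_byte_of(int64_t value)
{
    return (uint8_t)value;
}

/* Hypothetical helper: read the same byte back as a signed
   two's-complement number without changing its bits. */
static int8_t as_signed_byte(uint8_t byte)
{
    int8_t out;
    memcpy(&out, &byte, sizeof(out));
    return out;
}

/* Hypothetical demonstration of the cases named in the documentation. */
static void demo(void)
{
    /* 255 and -1 produce the same single byte, with every bit set. */
    assert(low_byte_of(255) == 0xFF);
    assert(low_byte_of(-1) == 0xFF);

    /* 128 fits an unsigned byte but reads back as negative if the
       destination is later treated as signed. */
    assert(as_signed_byte(low_byte_of(128)) == -128);

    /* A value that is too wide keeps only its lowest bits. */
    assert(low_byte_of(0x1234) == 0x34);
}
```

Because the two inputs are indistinguishable after the copy, a caller who must tell them apart would presumably combine ``Py_ASNATIVEBYTES_REJECT_NEGATIVE`` with the other flags, as the documentation above describes, rather than rely on the buffer contents.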

Include/cpython/longobject.h

+19 -5
@@ -4,11 +4,24 @@
 
 PyAPI_FUNC(PyObject*) PyLong_FromUnicodeObject(PyObject *u, int base);
 
+#define Py_ASNATIVEBYTES_DEFAULTS -1
+#define Py_ASNATIVEBYTES_BIG_ENDIAN 0
+#define Py_ASNATIVEBYTES_LITTLE_ENDIAN 1
+#define Py_ASNATIVEBYTES_NATIVE_ENDIAN 3
+#define Py_ASNATIVEBYTES_UNSIGNED_BUFFER 4
+#define Py_ASNATIVEBYTES_REJECT_NEGATIVE 8
+
 /* PyLong_AsNativeBytes: Copy the integer value to a native variable.
    buffer points to the first byte of the variable.
    n_bytes is the number of bytes available in the buffer. Pass 0 to request
    the required size for the value.
-   endianness is -1 for native endian, 0 for big endian or 1 for little.
+   flags is a bitfield of the following flags:
+   * 1 - little endian
+   * 2 - native endian
+   * 4 - unsigned destination (e.g. don't reject copying 255 into one byte)
+   * 8 - raise an exception for negative inputs
+   If flags is -1 (all bits set), native endian is used and value truncation
+   behaves most like C (allows negative inputs and allows MSB set).
    Big endian mode will write the most significant byte into the address
    directly referenced by buffer; little endian will write the least significant
    byte into that address.
@@ -24,19 +37,20 @@ PyAPI_FUNC(PyObject*) PyLong_FromUnicodeObject(PyObject *u, int base);
    calculate the bit length of an integer object.
    */
 PyAPI_FUNC(Py_ssize_t) PyLong_AsNativeBytes(PyObject* v, void* buffer,
-    Py_ssize_t n_bytes, int endianness);
+    Py_ssize_t n_bytes, int flags);
 
 /* PyLong_FromNativeBytes: Create an int value from a native integer
    n_bytes is the number of bytes to read from the buffer. Passing 0 will
    always produce the zero int.
    PyLong_FromUnsignedNativeBytes always produces a non-negative int.
-   endianness is -1 for native endian, 0 for big endian or 1 for little.
+   flags is the same as for PyLong_AsNativeBytes, but only supports selecting
+   the endianness or forcing an unsigned buffer.
 
    Returns the int object, or NULL with an exception set. */
 PyAPI_FUNC(PyObject*) PyLong_FromNativeBytes(const void* buffer, size_t n_bytes,
-    int endianness);
+    int flags);
 PyAPI_FUNC(PyObject*) PyLong_FromUnsignedNativeBytes(const void* buffer,
-    size_t n_bytes, int endianness);
+    size_t n_bytes, int flags);
 
 PyAPI_FUNC(int) PyUnstable_Long_IsCompact(const PyLongObject* op);
 PyAPI_FUNC(Py_ssize_t) PyUnstable_Long_CompactValue(const PyLongObject* op);
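
The header comment describes *flags* as a bitfield, yet ``-1`` has every bit set (including the "reject negative" bit) while being documented as allowing negative inputs. The sketch below models one way to read that: ``-1`` is treated as a special value and expanded before any bit is tested. It is an illustration under that assumption, not CPython's implementation. The macro values are copied from the header diff above, and the functions `expand_default_flags`, `uses_native_endian`, `uses_unsigned_buffer` and `rejects_negative` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the #define lines in the diff above. The guard
   avoids redefinition if the real CPython header is already included. */
#ifndef Py_ASNATIVEBYTES_DEFAULTS
#define Py_ASNATIVEBYTES_DEFAULTS -1
#define Py_ASNATIVEBYTES_BIG_ENDIAN 0
#define Py_ASNATIVEBYTES_LITTLE_ENDIAN 1
#define Py_ASNATIVEBYTES_NATIVE_ENDIAN 3
#define Py_ASNATIVEBYTES_UNSIGNED_BUFFER 4
#define Py_ASNATIVEBYTES_REJECT_NEGATIVE 8
#endif

/* Hypothetical helper: replace the special value -1 with the explicit
   combination the documentation says it currently corresponds to. */
static int expand_default_flags(int flags)
{
    /* -1 cannot be combined with other flags, so any other negative
       value is treated as a caller error in this model. */
    assert(flags == Py_ASNATIVEBYTES_DEFAULTS || flags >= 0);
    if (flags == Py_ASNATIVEBYTES_DEFAULTS) {
        return Py_ASNATIVEBYTES_NATIVE_ENDIAN | Py_ASNATIVEBYTES_UNSIGNED_BUFFER;
    }
    return flags;
}

/* True when both endian bits are set, i.e. the NATIVE_ENDIAN value 3.
   The reserved value 2 on its own is deliberately not interpreted. */
static bool uses_native_endian(int flags)
{
    flags = expand_default_flags(flags);
    return (flags & Py_ASNATIVEBYTES_NATIVE_ENDIAN) == Py_ASNATIVEBYTES_NATIVE_ENDIAN;
}

static bool uses_unsigned_buffer(int flags)
{
    return (expand_default_flags(flags) & Py_ASNATIVEBYTES_UNSIGNED_BUFFER) != 0;
}

/* Testing the raw bit would wrongly report true for -1, which is why
   the default value is expanded first. */
static bool rejects_negative(int flags)
{
    return (expand_default_flags(flags) & Py_ASNATIVEBYTES_REJECT_NEGATIVE) != 0;
}
```

Under this model, a caller filling a native unsigned variable while refusing negative inputs would pass `Py_ASNATIVEBYTES_NATIVE_ENDIAN | Py_ASNATIVEBYTES_UNSIGNED_BUFFER | Py_ASNATIVEBYTES_REJECT_NEGATIVE` explicitly, since ``-1`` cannot be combined with other flags.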
