struct (un)packing of half-precision nan floats is non-invertible #130317
Comments
It seems you are on an IEEE platform; otherwise unpacking special values would fail for the float and double formats. For those formats, the pack/unpack functions work by copying bits. PyFloat_Pack2() and PyFloat_Unpack2() do not. E.g. the latter ignores the whole payload of the nan value and maps it to Py_NAN with the matching sign (lines 2402 to 2405 in 12e1d30).
PyFloat_Pack2() also ignores the whole payload of a double nan (lines 2050 to 2059 in 12e1d30).
This looks like a bug to me. CC @mdickinson
Edit: assuming doubles are binary64, the following patch fixes your tests:
diff --git a/Objects/floatobject.c b/Objects/floatobject.c
index 3b72a1e7c3..e473fb72fe 100644
--- a/Objects/floatobject.c
+++ b/Objects/floatobject.c
@@ -2048,14 +2048,16 @@ PyFloat_Pack2(double x, char *data, int le)
bits = 0;
}
else if (isnan(x)) {
- /* There are 2046 distinct half-precision NaNs (1022 signaling and
- 1024 quiet), but there are only two quiet NaNs that don't arise by
- quieting a signaling NaN; we get those by setting the topmost bit
- of the fraction field and clearing all other fraction bits. We
- choose the one with the appropriate sign. */
sign = (copysign(1.0, x) == -1.0);
e = 0x1f;
- bits = 512;
+
+ uint64_t v;
+
+ memcpy(&v, &x, sizeof(v));
+ bits = v & 0x1ff;
+ if (v & 0x800000000000) {
+ bits += 0x200;
+ }
}
else {
sign = (x < 0.0);
@@ -2401,7 +2403,16 @@ PyFloat_Unpack2(const char *data, int le)
}
else {
/* NaN */
- return sign ? -fabs(Py_NAN) : fabs(Py_NAN);
+ uint64_t v = ((sign? 0xff00000000000000 : 0x7f00000000000000)
+ + 0xf0000000000000);
+
+ if (f & 0x200) {
+ v += 0x800000000000;
+ f -= 0x200;
+ }
+ v += f;
+ memcpy(&x, &v, sizeof(v));
+ return x;
}
}
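A minimal sketch of a payload-preserving mapping between binary16 and binary64 NaN bit patterns, written with plain integers. It assumes the 10-bit half fraction is aligned with the top 10 bits of the 52-bit double fraction (a shift of 42). That placement, the function names and the fallback for an empty fraction are assumptions made for illustration; they are not taken from the patch above or from the CPython sources.

```python
HALF_FRAC_BITS = 10
DOUBLE_FRAC_BITS = 52
SHIFT = DOUBLE_FRAC_BITS - HALF_FRAC_BITS  # 42 (assumed alignment)


def half_nan_to_double_bits(h: int) -> int:
    """Widen a binary16 NaN bit pattern to binary64, keeping sign and fraction."""
    sign = (h >> 15) & 1
    exponent = (h >> HALF_FRAC_BITS) & 0x1F
    frac = h & 0x3FF
    if exponent != 0x1F or frac == 0:
        raise ValueError("not a half-precision NaN")
    return (sign << 63) | (0x7FF << DOUBLE_FRAC_BITS) | (frac << SHIFT)


def double_nan_to_half_bits(d: int) -> int:
    """Narrow a binary64 NaN bit pattern to binary16, keeping the top fraction bits."""
    sign = (d >> 63) & 1
    exponent = (d >> DOUBLE_FRAC_BITS) & 0x7FF
    if exponent != 0x7FF or d & ((1 << DOUBLE_FRAC_BITS) - 1) == 0:
        raise ValueError("not a double-precision NaN")
    frac = (d >> SHIFT) & 0x3FF
    if frac == 0:
        # Hypothetical fallback: the payload lived only in the low 42 bits,
        # so set the quiet bit to keep the result a NaN rather than an infinity.
        frac = 0x200
    return (sign << 15) | (0x1F << HALF_FRAC_BITS) | frac


# Usage: a quiet half NaN with payload 1 survives the round trip.
h = 0x7E01
d = half_nan_to_double_bits(h)
print(hex(h), "->", hex(d), "->", hex(double_nan_to_half_bits(d)))
```

With this alignment every half-precision NaN maps to a distinct double NaN, so half -> double -> half is lossless; the opposite direction necessarily drops the low 42 fraction bits.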
FYI: #55943. The payload was probably ignored because the patch was adapted from the NumPy sources.
@tim-one, does this look like an issue to you?
PR is ready for review: #130452 |
Co-authored-by: Victor Stinner <[email protected]>
Fixed in the main branch (future Python 3.14) by change 6157135. The change is not backported to the 3.13 branch since it's a minor issue.
I am reopening the issue: there are failures on x86 (32-bit): #130452 (comment)
Hmm, I think the underlying reason is the same as for 32-bit Windows. Here is a reproducer. I create a qNaN (1), add a payload (2), change it to an sNaN (3), assign it to a temporary variable (4) and, finally, try to return the sNaN from a function (5). In 64-bit:
In 32-bit:
It seems that in 32-bit mode the double is loaded into the FPU:
#include <math.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <inttypes.h>
const unsigned long long float_layout = (1 << 30) + (1 << 22);
const unsigned long long double_layout = (1ULL << 62) + (1ULL << 51);
void print_bits(const size_t size, void const * const ptr,
const unsigned long long layout)
{
unsigned char *b = (unsigned char*) ptr;
unsigned char byte;
long long i, j;
for (i = size - 1; i >= 0; i--) {
for (j = 7; j >= 0; j--) {
if (layout & 1ULL<<(8*i+j)) {
printf(" ");
}
byte = (b[i] >> j) & 1;
printf("%u", byte);
}
}
printf("\n");
}
void print_double(const double *x)
{
uint64_t v = *((uint64_t *)x);
uint64_t man = (v - (v & (0xffffffffULL << 52)));
int sign = (v >> 63) & 1, bexp = ((v & (0x7ffULL << 52)) >> 52);
int exp = bexp;
if (exp && exp != 2047) {
exp -= 1023;
}
printf("%+a = ", *x);
print_bits(8, &v, double_layout);
if (bexp != 2047) {
printf(("%s0x%d.%" PRIx64 " p%+d\n"), sign ? "-" : "+",
(bexp != 0), man, bexp ? exp : -1023);
}
else {
if (man) {
printf("%s %s nan with payload: %llu\n",
sign ? "negative" : "positive",
man & (1ULL << 51) ? "quiet" : "signaling",
man & ~(1ULL << 51));
}
else {
printf("%sinf\n", sign ? "-" : "+");
}
}
}
double test(){
double snan = NAN;
print_double(&snan);
uint64_t *v = ((uint64_t *)&snan);
*v |= 12311111111;
print_double(&snan);
*v -= (1ULL << 51);
print_double(&snan);
double v2 = snan;
print_double(&v2);
v = ((uint64_t *)&v2);
if (*v & (1ULL << 51)) {
*v -= (1ULL << 51);
}
return v2;
}
int main(int argc, char* argv[])
{
double v3 = test();
print_double(&v3);
return 0;
}
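The quiet/signaling classification that print_double() reports can also be sketched in Python on raw bit patterns, which sidesteps the FPU entirely. The helper name is hypothetical; the quiet bit at position 51 and the payload value follow the reproducer above.

```python
QUIET_BIT = 1 << 51            # most significant fraction bit of binary64
FRACTION_MASK = (1 << 52) - 1  # low 52 bits


def describe_double_bits(bits: int) -> str:
    """Classify a 64-bit pattern the way the reproducer's print_double() does."""
    sign = "negative" if bits >> 63 else "positive"
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & FRACTION_MASK
    if exponent != 0x7FF:
        return "finite"
    if fraction == 0:
        return f"{sign} infinity"
    kind = "quiet" if fraction & QUIET_BIT else "signaling"
    return f"{sign} {kind} nan with payload: {fraction & ~QUIET_BIT}"


# Steps (1)-(3) of the reproducer, done on integers instead of doubles.
qnan = 0x7FF8000000000000 | 12311111111  # quiet NaN plus payload
snan = qnan - QUIET_BIT                  # clear the quiet bit
print(describe_double_bits(qnan))
print(describe_double_bits(snan))
```

Because the value never passes through a floating-point register here, the signaling bit cannot be silently set, which is the effect the 32-bit run appears to show at step (5).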
* Fix strict aliasing in PyFloat_Pack8() and PyFloat_Pack4().
* Fix _testcapi.float_set_snan() on x86 (32-bit).
* Only test 64-bit double on x86.
* Fix _testcapi.float_pack(): use memcpy() to preserve the sNaN bit.
Also reduce the number of iterations from 1000 to 10 to ease debugging failures and prevent the "command line too long" error when tests are re-run.
No, please don't close this yet. I think the best option is to skip testing only the sNaN round-trip in 32-bit mode. Unfortunately, we can't do anything else here with the current PyFloat_Pack/Unpack API. (Passing by reference works.) Maybe it is worth documenting?
Documenting the issue sounds like a good option.
I don't think that it's worth it to add a new API just for sNaN.
* Skip sNaN testing in 32-bit mode.
* Drop the float_set_snan() helper.
* Use a memcpy() workaround for sNaNs in PyFloat_Unpack4().
* Document that sNaNs may not be preserved by the PyFloat_Pack/Unpack API.
PR is ready: #133204
Sure. Though maybe it's something we could keep in mind. Using:
unsigned char* PyFloat_Pack(PyObject *x, size_t size, int le);
PyObject* PyFloat_Unpack(const unsigned char *p, size_t size, int le);
Two functions instead of six, and we could also support IEEE sizes >= 128 someday. Currently, we can work around this problem in struct.pack/unpack(), at the cost of code complexity. But I doubt it's worth it.
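To make the "two functions instead of six" idea concrete, here is a rough Python analogue that dispatches on size rather than having one function per width. The names pack_float() and unpack_float() are hypothetical and exist only for this sketch; it relies on the struct format codes e, f and d and is not a proposal for the actual C API.

```python
import struct

# Map byte size to the struct format code of the matching IEEE 754 format.
_FORMATS = {2: "e", 4: "f", 8: "d"}


def pack_float(x: float, size: int, le: bool) -> bytes:
    """Pack x into `size` bytes, little-endian if `le` is true."""
    code = _FORMATS.get(size)
    if code is None:
        raise ValueError(f"unsupported float size: {size}")
    return struct.pack(("<" if le else ">") + code, x)


def unpack_float(data: bytes, le: bool) -> float:
    """Unpack a float whose width is given by len(data)."""
    code = _FORMATS.get(len(data))
    if code is None:
        raise ValueError(f"unsupported float size: {len(data)}")
    (value,) = struct.unpack(("<" if le else ">") + code, data)
    return value


# Usage: one entry point covers all three widths.
for size in (2, 4, 8):
    raw = pack_float(1.5, size, le=True)
    print(size, raw.hex(), unpack_float(raw, le=True))
```

Supporting a wider format would then mean adding one table entry rather than another pair of functions, which is the appeal of the size-based signature.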
Most platforms are now 64-bit and don't seem to be affected by the issue. I don't think that it's worth it to invest time on fixing struct.pack/unpack(). And I would prefer that |
Bug report
Bug description:
I noticed that chaining struct.unpack() and struct.pack() for IEEE 754 Half Precision floats (e) is non-invertible for nan. E.g.:
IEEE nans aren't unique, so this isn't that surprising... However I found it curious that the same behavior is not exhibited for float (f) or double (d) format, where every original bit pattern I tested could be recovered from the unpacked nan object.
Is this by design?
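A minimal sketch of the round trip in question, assuming little-endian format strings and two arbitrarily chosen quiet NaN bit patterns. It is an illustration only, not the pytest script mentioned in this report. Whether the half-precision pattern survives depends on the Python version, since the thread states the fix landed in main for 3.14.

```python
import struct


def roundtrip(fmt: str, raw: bytes) -> bytes:
    """Unpack raw bytes as one float of format fmt, then pack it again."""
    (value,) = struct.unpack(fmt, raw)
    return struct.pack(fmt, value)


# Quiet NaNs with a non-zero payload (bit patterns chosen for illustration).
half_nan = (0x7E01).to_bytes(2, "little")                # binary16
double_nan = (0x7FF8000000000001).to_bytes(8, "little")  # binary64

for fmt, raw in (("<e", half_nan), ("<d", double_nan)):
    out = roundtrip(fmt, raw)
    status = "preserved" if out == raw else "changed"
    print(fmt, raw.hex(), "->", out.hex(), status)
```

On an affected version the e line is expected to report a changed pattern while the d line reports a preserved one, which is the asymmetry described above.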
Here's a quick pytest script that tests over a broad range of nan/inf/-inf cases for each encoding format.
CPython versions tested on:
3.13, 3.11, 3.12
Operating systems tested on:
Linux, Windows