Skip to content

Commit 87d3bd0

Browse files
authored
gh-100833: Remove 'volatile' qualifiers in fsum algorithm (#100845)
This PR removes the `volatile` qualifier on various intermediate quantities in the `math.fsum` implementation, and updates the notes preceding the algorithm accordingly (as well as fixing some of the exsting notes). See the linked issue #100833 for discussion.
1 parent b139bcd commit 87d3bd0

File tree

2 files changed

+23
-22
lines changed

2 files changed

+23
-22
lines changed
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Speed up :func:`math.fsum` by removing defensive ``volatile`` qualifiers.

Modules/mathmodule.c

Lines changed: 22 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -1358,30 +1358,30 @@ FUNC1(tanh, tanh, 0,
13581358
Dickinson's post at <http://bugs.python.org/file10357/msum4.py>.
13591359
See those links for more details, proofs and other references.
13601360
1361-
Note 1: IEEE 754R floating point semantics are assumed,
1362-
but the current implementation does not re-establish special
1363-
value semantics across iterations (i.e. handling -Inf + Inf).
1361+
Note 1: IEEE 754 floating-point semantics with a rounding mode of
1362+
roundTiesToEven are assumed.
13641363
1365-
Note 2: No provision is made for intermediate overflow handling;
1366-
therefore, sum([1e+308, 1e-308, 1e+308]) returns 1e+308 while
1367-
sum([1e+308, 1e+308, 1e-308]) raises an OverflowError due to the
1364+
Note 2: No provision is made for intermediate overflow handling;
1365+
therefore, fsum([1e+308, -1e+308, 1e+308]) returns 1e+308 while
1366+
fsum([1e+308, 1e+308, -1e+308]) raises an OverflowError due to the
13681367
overflow of the first partial sum.
13691368
1370-
Note 3: The intermediate values lo, yr, and hi are declared volatile so
1371-
aggressive compilers won't algebraically reduce lo to always be exactly 0.0.
1372-
Also, the volatile declaration forces the values to be stored in memory as
1373-
regular doubles instead of extended long precision (80-bit) values. This
1374-
prevents double rounding because any addition or subtraction of two doubles
1375-
can be resolved exactly into double-sized hi and lo values. As long as the
1376-
hi value gets forced into a double before yr and lo are computed, the extra
1377-
bits in downstream extended precision operations (x87 for example) will be
1378-
exactly zero and therefore can be losslessly stored back into a double,
1379-
thereby preventing double rounding.
1380-
1381-
Note 4: A similar implementation is in Modules/cmathmodule.c.
1382-
Be sure to update both when making changes.
1383-
1384-
Note 5: The signature of math.fsum() differs from builtins.sum()
1369+
Note 3: The algorithm has two potential sources of fragility. First, C
1370+
permits arithmetic operations on `double`s to be performed in an
1371+
intermediate format whose range and precision may be greater than those of
1372+
`double` (see for example C99 §5.2.4.2.2, paragraph 8). This can happen for
1373+
example on machines using the now largely historical x87 FPUs. In this case,
1374+
`fsum` can produce incorrect results. If `FLT_EVAL_METHOD` is `0` or `1`, or
1375+
`FLT_EVAL_METHOD` is `2` and `long double` is identical to `double`, then we
1376+
should be safe from this source of errors. Second, an aggressively
1377+
optimizing compiler can re-associate operations so that (for example) the
1378+
statement `yr = hi - x;` is treated as `yr = (x + y) - x` and then
1379+
re-associated as `yr = y + (x - x)`, giving `y = yr` and `lo = 0.0`. That
1380+
re-association would be in violation of the C standard, and should not occur
1381+
except possibly in the presence of unsafe optimizations (e.g., -ffast-math,
1382+
-fassociative-math). Such optimizations should be avoided for this module.
1383+
1384+
Note 4: The signature of math.fsum() differs from builtins.sum()
13851385
because the start argument doesn't make sense in the context of
13861386
accurate summation. Since the partials table is collapsed before
13871387
returning a result, sum(seq2, start=sum(seq1)) may not equal the
@@ -1467,7 +1467,7 @@ math_fsum(PyObject *module, PyObject *seq)
14671467
Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
14681468
double x, y, t, ps[NUM_PARTIALS], *p = ps;
14691469
double xsave, special_sum = 0.0, inf_sum = 0.0;
1470-
volatile double hi, yr, lo;
1470+
double hi, yr, lo;
14711471

14721472
iter = PyObject_GetIter(seq);
14731473
if (iter == NULL)

0 commit comments

Comments
 (0)