Skip to content

Commit 5b24d15

Browse files
authored
Improve hypot() accuracy with three separate accumulators (GH-22032)
1 parent 1d25f5b commit 5b24d15

File tree

1 file changed

+4
-4
lines changed

1 file changed

+4
-4
lines changed

Modules/mathmodule.c

+4-4
Original file line numberDiff line numberDiff line change
@@ -2456,7 +2456,7 @@ Given that csum >= 1.0, we have:
24562456
Since lo**2 is less than 1/2 ulp(csum), we have csum+lo*lo == csum.
24572457
24582458
To minimize loss of information during the accumulation of fractional
2459-
values, the lo**2 term has a separate accumulator.
2459+
values, each term has a separate accumulator.
24602460
24612461
The square root differential correction is needed because a
24622462
correctly rounded square root of a correctly rounded sum of
@@ -2487,7 +2487,7 @@ static inline double
24872487
vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
24882488
{
24892489
const double T27 = 134217729.0; /* ldexp(1.0, 27)+1.0) */
2490-
double x, csum = 1.0, oldcsum, frac = 0.0, frac_lo = 0.0, scale;
2490+
double x, csum = 1.0, oldcsum, scale, frac=0.0, frac_mid=0.0, frac_lo=0.0;
24912491
double t, hi, lo, h;
24922492
int max_e;
24932493
Py_ssize_t i;
@@ -2529,12 +2529,12 @@ vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
25292529
assert(fabs(csum) >= fabs(x));
25302530
oldcsum = csum;
25312531
csum += x;
2532-
frac += (oldcsum - csum) + x;
2532+
frac_mid += (oldcsum - csum) + x;
25332533

25342534
assert(csum + lo * lo == csum);
25352535
frac_lo += lo * lo;
25362536
}
2537-
frac += frac_lo;
2537+
frac += frac_lo + frac_mid;
25382538
h = sqrt(csum - 1.0 + frac);
25392539

25402540
x = h;

0 commit comments

Comments
 (0)