Description
Description
I was experiencing some really strange issues where a fit would only succeed when executed in a process-pool, but fail in single threaded mode. Digging down, it appears the divergence was due to non-identical output from sigma_clipped_stats
between single and multi-process execution.
The reason for that boils down to FITS-images being read as dtype >f4
(big endian, non-native on x86), which when passed directly to bottleneck for mean/median will default to the numpy implementation of nanmean
/nanmedian
. In the process of pickling, the array is converted to <f4
and ends up in the bottleneck implementation.
As far as I can tell, bottleneck performs a straight sum over all elements, while numpy seems to use some form of compensated summation scheme, to decrease numerical error.
Actual behavior
As bottleneck gets automatically selected by astropy if available, the choice between a accuracy and speed is potentially hidden from the user. This may well be a case of "works as intended" but the chance of reading FITS and unintentionally triggering a conversion to a (on most machines) native array format only under specific circumstances seems rather high. This may cause a lot of headscratching for people that see different results for the same data in seemingly equivalent programs.
This could plausibly also be considered a bug or unexpected behaviour in bottleneck as the library gives the impression of being a complete drop-in replacement for certain numpy routines.
Expected behavior
I'm not exactly sure what the best resolution is here. The fact that numpy supports non-native byteorder without explicit conversion/warnings complicates things a bit. Obviously for the user it would be the easiest if all array-operations would have the same results, irrespective of byte-order, but that would possibly cause issues with mmap or performance.
Some kind of warning or note, either when applying sigma_clipped_stats
to a non-native byteorder or when a FITS file is read, or at least a disclaimer in the docs for io.fits
about this possible gotcha would help.
Steps to Reproduce
import numpy as np
import requests
from astropy.stats import sigma_clipped_stats
import warnings
r=requests.get('https://gist.github.com/krachyon/7408a4029fd9cca83f5a8c6ecbeb1173/raw/0ba5cc3e5a0d0eb02a2a4da3d04ec16ee8aa4267/repro_array.npy')
with open('repro_array.npy','wb') as f:
f.write(r.content)
arr = np.load('repro_array.npy')
with warnings.catch_warnings():
warnings.filterwarnings('ignore', '.*Input data contains invalid values.*', append=True)
print('as_is: ', sigma_clipped_stats(arr))
print('native: ', sigma_clipped_stats(arr.astype(np.float32)))
print('higher precision: ', sigma_clipped_stats(arr.astype(np.float64)))
print('')
print('delta:', np.array(sigma_clipped_stats(arr))-np.array(sigma_clipped_stats(arr.astype(np.float32))))
print('dtype resolution:', np.finfo(np.float32).resolution)
prints
as_is: (2523.1194, 2519.48, 50.65029)
native: (2523.126220703125, 2519.47998046875, 50.6502799987793)
higher precision: (2523.1193250563483, 2519.4801025390625, 50.650290905964354)
delta: [-6.83593750e-03 0.00000000e+00 1.14440918e-05]
dtype resolution: 1e-06
System Details
Linux-5.11.11-arch1-1-x86_64-with-glibc2.33
Python 3.9.2 (default, Feb 20 2021, 18:40:11)
[GCC 10.2.0]
astropy 4.2
bottleneck 1.3.2