Skip to content

Surprising differences in stats due to effects of FITS-endianness  #11492

Closed
@krachyon

Description

@krachyon

Description

I was experiencing some really strange issues where a fit would only succeed when executed in a process-pool, but fail in single threaded mode. Digging down, it appears the divergence was due to non-identical output from sigma_clipped_stats between single and multi-process execution.
The reason for that boils down to FITS-images being read as dtype >f4 (big endian, non-native on x86), which when passed directly to bottleneck for mean/median will default to the numpy implementation of nanmean/nanmedian. In the process of pickling, the array is converted to <f4 and ends up in the bottleneck implementation.
As far as I can tell, bottleneck performs a straight sum over all elements, while numpy seems to use some form of compensated summation scheme, to decrease numerical error.

Actual behavior

As bottleneck gets automatically selected by astropy if available, the choice between a accuracy and speed is potentially hidden from the user. This may well be a case of "works as intended" but the chance of reading FITS and unintentionally triggering a conversion to a (on most machines) native array format only under specific circumstances seems rather high. This may cause a lot of headscratching for people that see different results for the same data in seemingly equivalent programs.

This could plausibly also be considered a bug or unexpected behaviour in bottleneck as the library gives the impression of being a complete drop-in replacement for certain numpy routines.

Expected behavior

I'm not exactly sure what the best resolution is here. The fact that numpy supports non-native byteorder without explicit conversion/warnings complicates things a bit. Obviously for the user it would be the easiest if all array-operations would have the same results, irrespective of byte-order, but that would possibly cause issues with mmap or performance.

Some kind of warning or note, either when applying sigma_clipped_stats to a non-native byteorder or when a FITS file is read, or at least a disclaimer in the docs for io.fits about this possible gotcha would help.

Steps to Reproduce

import numpy as np
import requests
from astropy.stats import sigma_clipped_stats
import warnings

r=requests.get('https://gist.github.com/krachyon/7408a4029fd9cca83f5a8c6ecbeb1173/raw/0ba5cc3e5a0d0eb02a2a4da3d04ec16ee8aa4267/repro_array.npy')
with open('repro_array.npy','wb') as f:
    f.write(r.content)

arr = np.load('repro_array.npy')

with warnings.catch_warnings():
    warnings.filterwarnings('ignore', '.*Input data contains invalid values.*', append=True)

    print('as_is: ', sigma_clipped_stats(arr))
    print('native: ', sigma_clipped_stats(arr.astype(np.float32)))
    print('higher precision: ', sigma_clipped_stats(arr.astype(np.float64)))
    print('')
    print('delta:', np.array(sigma_clipped_stats(arr))-np.array(sigma_clipped_stats(arr.astype(np.float32))))
    print('dtype resolution:', np.finfo(np.float32).resolution)

prints

as_is:  (2523.1194, 2519.48, 50.65029)
native:  (2523.126220703125, 2519.47998046875, 50.6502799987793)
higher precision:  (2523.1193250563483, 2519.4801025390625, 50.650290905964354)

delta: [-6.83593750e-03  0.00000000e+00  1.14440918e-05]
dtype resolution: 1e-06

System Details

Linux-5.11.11-arch1-1-x86_64-with-glibc2.33
Python 3.9.2 (default, Feb 20 2021, 18:40:11)
[GCC 10.2.0]
astropy 4.2
bottleneck 1.3.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions