Surprising differences in stats due to effects of FITS-endianness 

### Description
I was experiencing some really strange issues where a fit would only succeed when executed in a process-pool, but fail in single threaded mode. Digging down, it appears the divergence was due to non-identical output from `sigma_clipped_stats` between single and multi-process execution.
The reason for that boils down to FITS-images being read as dtype `>f4` (big endian, non-native on x86), which when passed directly to bottleneck for mean/median will default to the numpy implementation of `nanmean`/`nanmedian`. In the process of pickling, the array is converted to `<f4` and ends up in the bottleneck implementation.
As far as I can tell, bottleneck performs a straight sum over all elements, while numpy seems to use some form of compensated summation scheme, to decrease numerical error.

### Actual behavior
As bottleneck gets automatically selected by astropy if available, the choice between a accuracy and speed is potentially hidden from the user. This may well be a case of "works as intended" but the chance of reading FITS and unintentionally triggering a conversion to a (on most machines) native array format only under specific circumstances seems rather high. This may cause a lot of headscratching for people that see different results for the same data in seemingly equivalent programs. 

This could plausibly also be considered a bug or unexpected behaviour in bottleneck as the library gives the impression of being a complete drop-in replacement for certain numpy routines.

### Expected behavior
I'm not exactly sure what the best resolution is here. The fact that numpy supports non-native byteorder without explicit conversion/warnings complicates things a bit. Obviously for the user it would be the easiest if all array-operations would have the same results, irrespective of byte-order, but that would possibly cause issues with mmap or performance.

Some kind of warning or note, either when applying `sigma_clipped_stats` to a non-native byteorder or when a FITS file is read, or at least a disclaimer in the docs for `io.fits` about this possible gotcha would help.

### Steps to Reproduce

```python
import numpy as np
import requests
from astropy.stats import sigma_clipped_stats
import warnings

r=requests.get('https://gist.github.com/krachyon/7408a4029fd9cca83f5a8c6ecbeb1173/raw/0ba5cc3e5a0d0eb02a2a4da3d04ec16ee8aa4267/repro_array.npy')
with open('repro_array.npy','wb') as f:
    f.write(r.content)

arr = np.load('repro_array.npy')

with warnings.catch_warnings():
    warnings.filterwarnings('ignore', '.*Input data contains invalid values.*', append=True)

    print('as_is: ', sigma_clipped_stats(arr))
    print('native: ', sigma_clipped_stats(arr.astype(np.float32)))
    print('higher precision: ', sigma_clipped_stats(arr.astype(np.float64)))
    print('')
    print('delta:', np.array(sigma_clipped_stats(arr))-np.array(sigma_clipped_stats(arr.astype(np.float32))))
    print('dtype resolution:', np.finfo(np.float32).resolution)
```

prints

```
as_is:  (2523.1194, 2519.48, 50.65029)
native:  (2523.126220703125, 2519.47998046875, 50.6502799987793)
higher precision:  (2523.1193250563483, 2519.4801025390625, 50.650290905964354)

delta: [-6.83593750e-03  0.00000000e+00  1.14440918e-05]
dtype resolution: 1e-06
```

### System Details
Linux-5.11.11-arch1-1-x86_64-with-glibc2.33
Python 3.9.2 (default, Feb 20 2021, 18:40:11) 
[GCC 10.2.0]
astropy 4.2
bottleneck 1.3.2


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Surprising differences in stats due to effects of FITS-endianness #11492

Description

Actual behavior

Expected behavior

Steps to Reproduce

System Details

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Surprising differences in stats due to effects of FITS-endianness #11492

Description

Description

Actual behavior

Expected behavior

Steps to Reproduce

System Details

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions