use mean of min/max years as offset in calculation of datetime64 mean #10035

Merged
kmuehlbauer merged 4 commits into pydata:main from fix-mean-datetime64 on Feb 7, 2025

Conversation

kmuehlbauer
Contributor

@kmuehlbauer kmuehlbauer commented Feb 7, 2025

This computes the mean of the minimum and maximum years in the data and uses it as the offset, which places the offset in the middle of the datetime64 range the data actually spans. Compared with using the minimum as the offset, this roughly halves the largest magnitude of the resulting timedelta64 values, so the subtraction no longer overflows.
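The idea described above can be sketched in plain NumPy. This is an illustrative sketch only, not the code merged in this PR: the function name `datetime64_mean_mid_offset` is hypothetical, and it assumes nanosecond-resolution input with no NaT values.

```python
import numpy as np


def datetime64_mean_mid_offset(values):
    """Mean of a datetime64[ns] array, using a mid-range year as the offset.

    Hypothetical helper for illustration; assumes no NaT values.
    """
    values = np.asarray(values, dtype="datetime64[ns]")

    # datetime64[Y] stores whole years since 1970 as an int64.
    year_min = values.min().astype("datetime64[Y]").astype("int64")
    year_max = values.max().astype("datetime64[Y]").astype("int64")

    # Offset in the middle of the spanned years, converted back to ns.
    mid_year = int((year_min + year_max) // 2)
    offset = np.datetime64(mid_year, "Y").astype("datetime64[ns]")

    # Each timedelta now covers at most about half of the total span.
    deltas = (values - offset).astype("int64")

    # np.mean accumulates in float64, so the result is approximate
    # at the nanosecond level for very large deltas.
    mean_delta = int(round(float(deltas.mean())))
    return offset + np.timedelta64(mean_delta, "ns")


# Spans about 582 years, close to the full datetime64[ns] range.
long_span = np.array(["1678-01-01", "2260-01-01"], dtype="datetime64[ns]")
print(datetime64_mean_mid_offset(long_span))
```

With the minimum as the offset, the largest timedelta here would be about 582 years, which does not fit in a signed 64-bit nanosecond count (the limit is roughly 292 years). With the mid-range offset, both timedeltas are about 291 years and stay in range.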

@kmuehlbauer kmuehlbauer changed the title use mean of min/max years as offset in caclulation of datetime64 mean use mean of min/max years as offset in calculation of datetime64 mean Feb 7, 2025
@kmuehlbauer kmuehlbauer marked this pull request as ready for review February 7, 2025 10:04
@kmuehlbauer kmuehlbauer merged commit df2ecf4 into pydata:main Feb 7, 2025
29 checks passed
@kmuehlbauer kmuehlbauer deleted the fix-mean-datetime64 branch February 7, 2025 11:08
@kmuehlbauer
Contributor Author

Merged the wrong PR 😞 in the wrong repo. Before I revert this, would it make sense to check if it's good enough, @spencerkclark?

@spencerkclark
Member

Thanks @kmuehlbauer, no worries. This is pretty clever. With this approach I think you can still end up in the odd situation where computing the offset itself overflows, e.g. if the minimum and maximum years are both 1677 with nanosecond-resolution times, but the resulting timedeltas and mean remain correct (at least to the extent that they would be without overflow). I think I'm on board with keeping this change.

Development

Successfully merging this pull request may close these issues.

Taking the mean of a long-spanning np.datetime64 array produces the wrong value
2 participants