Closed
Labels: Bug, Dtype Conversions, ExtensionArray, NA - MaskedArrays
Description
Constructing a Series with dtype=Int64Dtype() appears to go through an intermediate conversion to double if the input contains NaNs:
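import pandas as pd
n = 2**62+5  # needs more than 53 significant bits, so float64 cannot represent it exactly
print(n, int(float(n)))  # exact value vs. int-float-int round trip
print(pd.Series([n, float('nan')]))  # no dtype given
print(pd.Series([n, float('nan')], dtype=pd.Int64Dtype()))  # n comes back as 4611686018427387904
print(pd.Series([n, float('nan')], dtype=object).astype(pd.Int64Dtype()))  # n stays exact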
In the last two cases we end up with a Series of dtype Int64 containing a missing value. However, in the penultimate case (dtype passed directly to the constructor) we get the int-float-int converted value of n (4611686018427387904). We only get the exact Int64 result (4611686018427387909) if we instantiate the Series with dtype=object and convert afterward.
This doesn't happen if the input doesn't contain any NaNs:
print(pd.Series([n, n+3], dtype=pd.Int64Dtype()))
Expected output is 4611686018427387909 inside the Series constructed with dtype=Int64Dtype().
The issue is present in master.
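For reference, the precision loss itself can be reproduced without pandas. This is only a sketch of the suspected mechanism, assuming the intermediate representation is an IEEE 754 float64. The helper name `survives_float64` is made up for illustration and is not part of pandas.

```python
import sys

n = 2**62 + 5

# float64 has a 53-bit significand, so an integer that needs more
# significant bits than that gets rounded when converted to float.
significand_bits = sys.float_info.mant_dig

# Round trip int -> float -> int, as in the report above.
roundtrip = int(float(n))


def survives_float64(value: int) -> bool:
    """Return True if value is unchanged by an int -> float -> int round trip.

    Hypothetical helper, for illustration only.
    """
    return int(float(value)) == value


print(significand_bits)
print(n, roundtrip, n - roundtrip)
print(survives_float64(n), survives_float64(2**62))
```

Under that assumption, any int64 value that fails `survives_float64` would be at risk when it sits next to a NaN in the input.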
ztane