Right now, the output shape and dtype of DataFrame.describe for object columns depend on whether the DataFrame is empty.
Code Sample, a copy-pastable example if possible
In [75]: x=pd.DataFrame({"A": ['a', np.nan, np.nan]})
In [76]: x.describe()
Out[76]:
        A
count   1
unique  1
top     a
freq    1

In [77]: x.iloc[:0].describe()
Out[77]:
        A
count   0
unique  0
Problem description
This leads to instability in the output dtypes and shape.
Would people prefer that we use np.NaN or None for the top and freq in this case? I believe there's no ambiguity, since we drop missing values before computing.
While the output consistency would be nice, it's not clear to me what's actually best for users here.
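To make the proposal concrete, here is a minimal pure-Python sketch of a summary that always returns the same four fields. It is not how pandas implements describe. The names `describe_object` and `is_missing` are hypothetical, and using None (rather than np.nan) as the placeholder is just one of the two options raised above.

```python
from collections import Counter
import math


def is_missing(value):
    """Treat None and float NaN as missing (the 'drop missing values' step)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def describe_object(values):
    """Hypothetical helper: summarize an object column with a fixed set of fields."""
    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    if counts:
        # Most frequent non-missing value and how often it occurs.
        top, freq = counts.most_common(1)[0]
    else:
        # Assumption: None marks "no top value"; np.nan would be the alternative.
        top, freq = None, None
    return {"count": len(present), "unique": len(counts), "top": top, "freq": freq}


non_empty = describe_object(["a", float("nan"), float("nan")])
empty = describe_object([])

print(non_empty)
print(empty)
```

Because missing values are dropped before counting, a None in top or freq can only mean "there was nothing to count", so the placeholder does not collide with real data, and both results carry the same keys in the same order.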
On Sat, May 18, 2019 at 11:10 AM enisnazif ***@***.***> wrote:
Seems reasonable to me - would probably also need a note in the docs
explaining what None means in this context. Mind if I attempt a fix?