I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
import pandas as pd
import numpy as np
import numpy.ma as ma

x = np.array(['a', 'b'])
mx = ma.masked_array(x, mask=[0, 1])
print(pd.DataFrame(mx))
Creating a DataFrame from a masked NumPy array of strings turns the masked (missing) values into NaN.
For examples like this, where the masked array is of string dtype, I expect pandas to fill missing data with None instead of NaN.
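In the meantime, one possible workaround is to replace the masked entries with None before handing the array to pandas. This is only a minimal sketch: `masked_to_object` is a hypothetical helper name, not part of pandas or NumPy, and it assumes an object-dtype column is acceptable. Note that `mx.filled(None)` does not do this, since passing `None` to `filled` means "use the array's own fill value".

```python
import numpy as np
import numpy.ma as ma
import pandas as pd


def masked_to_object(marr):
    """Hypothetical helper: copy a masked array into an object ndarray,
    putting None wherever the mask is set."""
    # Copy the underlying data as Python objects so None can be stored.
    out = np.array(ma.getdata(marr), dtype=object)
    # getmaskarray always returns a full boolean array, even for nomask.
    out[ma.getmaskarray(marr)] = None
    return out


x = np.array(['a', 'b'])
mx = ma.masked_array(x, mask=[0, 1])

filled = masked_to_object(mx)
# dtype=object is passed explicitly so pandas does not re-infer the dtype.
df = pd.DataFrame(filled, dtype=object)
print(df)
```

The masked slot then holds None rather than NaN, at the cost of one extra copy of the data.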
INSTALLED VERSIONS
------------------
commit : 0f437949513225922d851e9581723d82120684a6
python : 3.8.16.final.0
python-bits : 64
OS : Darwin
OS-release : 22.5.0
Version : Darwin Kernel Version 22.5.0: Mon Apr 24 20:51:50 PDT 2023; root:xnu-8796.121.2~5/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.0.3
numpy : 1.24.4
pytz : 2023.3
dateutil : 2.8.2
setuptools : 67.8.0
pip : 23.1.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 8.12.2
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None
Possibly #51872 and #20427 are relevant.
Agreed, duplicate of #51872.