BUG: index_col in read_csv ignores dtype #35431

Closed
2 of 3 tasks
metazoic opened this issue Jul 28, 2020 · 4 comments
Labels
Bug, Duplicate Report, Index, IO CSV

Comments

@metazoic

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


If index.csv contains:

id,key
00,11
22,33

then

import pandas

df = pandas.read_csv('index.csv', dtype=str, index_col='id')
df.index.dtype

results in

dtype('int64')

Problem description

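With dtype=str, the index should contain strings. Instead, the id column is parsed as integers when it is passed as index_col, so the label 00 becomes 0.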
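To check whether a given pandas installation shows this behavior without creating a file, a minimal sketch along these lines may help. It feeds the same CSV text through io.StringIO; the names CSV_TEXT and index_keeps_strings are illustrative and not part of pandas, and the result depends on the installed pandas version.

```python
import io

import pandas as pd

# Same contents as index.csv in the report, held in memory (illustrative name).
CSV_TEXT = "id,key\n00,11\n22,33\n"


def index_keeps_strings(text):
    """Return True if index_col honors dtype=str on this pandas version."""
    df = pd.read_csv(io.StringIO(text), dtype=str, index_col="id")
    # A string index keeps the labels exactly as written, including "00".
    return list(df.index) == ["00", "22"]


# Prints the pandas version and whether the index labels stayed strings.
print(pd.__version__, index_keeps_strings(CSV_TEXT))
```

A result of False means the index labels were converted, which matches the int64 index reported here.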

Expected Output

By contrast,

df = pandas.read_csv('index.csv', dtype=str)
df.set_index('id', inplace=True)
df.index.dtype

results in

dtype('O')

as intended.
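The two-step approach above can be wrapped in a small helper. This is only a sketch of the workaround, assuming the same CSV text as index.csv; CSV_TEXT and read_with_string_index are hypothetical names introduced for illustration.

```python
import io

import pandas as pd

# Same contents as index.csv in the report, held in memory (illustrative name).
CSV_TEXT = "id,key\n00,11\n22,33\n"


def read_with_string_index(text):
    """Read every column as str first, then promote 'id' to the index."""
    df = pd.read_csv(io.StringIO(text), dtype=str)
    # set_index reuses the already-parsed string column, so no re-inference occurs.
    return df.set_index("id")


df = read_with_string_index(CSV_TEXT)
print(list(df.index))   # index labels, with the leading zero preserved
print(list(df["key"]))  # remaining column, also read as strings
```

Because the column is parsed before it becomes the index, the labels stay exactly as written in the file.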

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.7.7.final.0
python-bits      : 64
OS               : Darwin
OS-release       : 18.7.0
machine          : x86_64
processor        : i386
byteorder        : little
LC_ALL           : None
LANG             : None
LOCALE           : None.UTF-8
 
pandas           : 1.0.5
numpy            : 1.18.5
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.1.1
setuptools       : 49.2.0.post20200714
Cython           : 0.29.21
pytest           : 5.4.3
hypothesis       : None
sphinx           : 3.1.2
blosc            : None
feather          : None
xlsxwriter       : 1.2.9
lxml.etree       : 4.5.2
html5lib         : 1.1
pymysql          : None
psycopg2         : None
jinja2           : 2.11.2
IPython          : 7.16.1
pandas_datareader: None
bs4              : 4.9.1
bottleneck       : 1.3.2
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.5.2
matplotlib       : 3.2.2
numexpr          : 2.7.1
odfpy            : None
openpyxl         : 3.0.4
pandas_gbq       : None
pyarrow          : None
pytables         : None
pytest           : 5.4.3
pyxlsb           : None
s3fs             : None
scipy            : 1.5.0
sqlalchemy       : 1.3.18
tables           : 3.4.4
tabulate         : None
xarray           : None
xlrd             : 1.2.0
xlwt             : 1.3.0
xlsxwriter       : 1.2.9
numba            : 0.50.1
@metazoic added the Bug and Needs Triage labels Jul 28, 2020
@simonjayhawkins
Member

Thanks @metazoic for the report.

Further investigation and PRs welcome.

@simonjayhawkins added the Index and IO CSV labels and removed the Needs Triage label Jul 28, 2020
@simonjayhawkins added this to the Contributions Welcome milestone Jul 28, 2020
@jreback
Contributor

jreback commented Jul 28, 2020

look for a duplicate issue

@simonjayhawkins
Member

duplicate of #32930 so closing

@simonjayhawkins added the Duplicate Report label Jul 28, 2020
@simonjayhawkins removed this from the Contributions Welcome milestone Jul 28, 2020
@metazoic
Author

@simonjayhawkins @jreback I promise I searched open issues, but the other issue didn't pop up with a "read_csv index_col" query (because it was labeled "pd.read_csv ...")!
