Minor inaccuracy in documentation of read_csv's option mangle_dupe_cols #19203

Bernhard10 · 2018-01-12T09:51:08Z

Code example:

File test.csv:

,a,a,b
0,1,2,3
1,4,5,6

Python code:

import pandas as pd
df = pd.read_csv("test.csv", index_col=0)  # the unnamed first column holds the index
df.columns.values

Gives ['a', 'a.1', 'b'] and not, as documented, ['a.0', 'a.1', 'b'].

Problem description

The documentation states that duplicate columns will be specified as 'X.0'...'X.N', but in fact the names become 'X', 'X.1', ..., 'X.N'.

So, in contrast to what the documentation says, the first occurrence of a duplicated column name is left unchanged, and only subsequent occurrences get a number appended.
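To make the difference between the two naming schemes concrete, here is a minimal pure-Python sketch. It is not pandas' actual implementation: the helper names `mangle_observed` and `mangle_documented` are hypothetical, and the sketch ignores edge cases such as a file that already contains a column literally named "a.1".

```python
from collections import Counter


def mangle_observed(names):
    """Rename duplicates as observed in this report: X, X.1, ..., X.N."""
    seen = Counter()
    result = []
    for name in names:
        count = seen[name]
        # The first occurrence keeps its name; later ones get a ".<count>" suffix.
        result.append(name if count == 0 else f"{name}.{count}")
        seen[name] += 1
    return result


def mangle_documented(names):
    """Rename duplicates as the docs described: X.0, X.1, ..., X.N."""
    totals = Counter(names)
    seen = Counter()
    result = []
    for name in names:
        if totals[name] > 1:
            # Every occurrence of a duplicated name gets a suffix, starting at 0.
            result.append(f"{name}.{seen[name]}")
        else:
            result.append(name)
        seen[name] += 1
    return result


print(mangle_observed(["a", "a", "b"]))    # ['a', 'a.1', 'b']
print(mangle_documented(["a", "a", "b"]))  # ['a.0', 'a.1', 'b']
```

The first function matches what read_csv returned for test.csv above; the second shows what a reader of the old docstring would have expected.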

Expected Output

Either change the code to mangle the first duplicate column name, or simply fix the documentation.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.11-200.fc26.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: 3.3.1
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.14.0
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.5
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.1
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0b10
sqlalchemy: 1.1.13
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None


jorisvandenbossche · 2018-01-12T11:05:00Z

That observation seems correct. Do you want to do a PR to update the docs?

Bernhard10 · 2018-01-12T11:39:05Z

Sure, I'll submit a PR today.

bhavybarca · 2018-01-12T20:20:04Z

I would like to contribute to this issue

jorisvandenbossche · 2018-01-12T22:09:01Z

@bhavybarca there is already an open PR for this

jorisvandenbossche added the Docs, IO CSV (read_csv, to_csv), and good first issue labels Jan 12, 2018

jreback added this to the Next Major Release milestone Jan 12, 2018

Bernhard10 mentioned this issue Jan 12, 2018

DOC: Fix documentation for read_csv's mangle_dupe_cols (GH19203) #19208

Merged

jreback modified the milestones: Next Major Release, 0.23.0 Jan 15, 2018

jorisvandenbossche closed this as completed in #19208 Jan 15, 2018
