read_csv: European numbers do not work with dates #14066
Comments
Can you post some example data and the actual
@TomAugspurger thanks for the response. Here goes the example:
A couple of things:
However, even with that parameter fixed, the reason you're seeing the That is a bug, so thank you for pointing it out! The issue actually has nothing to do with European date formats. You can see the bug surfaced here with a much more simplified example:
>>> from pandas import read_csv
>>> from pandas.compat import StringIO
>>>
>>> data = 'a\n04.15.2016'
>>> read_csv(StringIO(data), index_col=0, parse_dates=True, thousands='.')
Empty DataFrame
Columns: []
Index: [4152016] # WRONG
>>>
>>> read_csv(StringIO(data), index_col=0, parse_dates=True)
Empty DataFrame
Columns: []
Index: [2016-04-15 00:00:00] # RIGHT
Note that this bug does not affect non-index columns:
>>> read_csv(StringIO(data), parse_dates=['a'], thousands='.')
a
0 2016-04-15
Similar observations can be made with the Python parser.
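For anyone stuck on an affected version, one possible workaround is to keep the date column as plain text while reading and convert it afterwards. This is only a sketch under assumptions: the sample data, the column names `date` and `value`, the `;` separator, and the `%d.%m.%Y` format are all hypothetical and not taken from the original report. It also uses `io.StringIO`, since `pandas.compat.StringIO` is not available in recent pandas versions.

```python
from io import StringIO

import pandas as pd

# Hypothetical European-style data: '.' separates date parts and thousands,
# ',' is the decimal mark. None of this comes from the original report.
data = "date;value\n15.04.2016;1.234,5\n16.04.2016;2.000,25\n"

df = pd.read_csv(
    StringIO(data),
    sep=";",
    thousands=".",
    decimal=",",
    dtype={"date": str},  # keep the raw text so '.' is not treated as a thousands mark
)

# Convert the text column explicitly, then promote it to the index.
df["date"] = pd.to_datetime(df["date"], format="%d.%m.%Y")
df = df.set_index("date")

print(df)
```

Passing an explicit `format` also avoids any ambiguity between day-first and month-first dates, which matters for values such as `04.05.2016`.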
@gfyoung thanks for confirmation. minor clarification:
I was referring to European data as described under: Quoting, Compression, and File Format. No idea how to go on from here, but it looks like the processing priorities need to be changed in the parser.
@dacoex : Ah, okay. Good to know that my minimal example is capturing the issue you were seeing!
When a thousands parameter is specified, if the index column data contains that thousands value for date purposes (e.g. '.'), do not interpret those characters as the thousands parameter. Closes gh-14066.
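A quick way to check whether an installed pandas version behaves as this fix describes is to rerun the minimal example with and without `thousands` and compare the resulting indexes. This is a rough check rather than the project's actual test; it uses `io.StringIO` in place of `pandas.compat.StringIO`, and the variable names are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Same minimal data as in the example above.
data = "a\n04.15.2016"

# With thousands='.', the index column should still be parsed as a date.
with_thousands = pd.read_csv(
    StringIO(data), index_col=0, parse_dates=True, thousands="."
)

# Reference result: no thousands separator involved.
without_thousands = pd.read_csv(StringIO(data), index_col=0, parse_dates=True)

# True when both reads produce the same date index.
print(with_thousands.index.equals(without_thousands.index))
```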
Big thanks to @jorisvandenbossche & @gfyoung. FOSS is great!
BTW, do we need to add this to:
It takes a while to actually generate those docs.
I tested with the v0.19 RC and it works with the original data. Thanks again!
Code Sample, a copy-pastable example if possible
Using the following reader leads to omission of the dates resulting in no index:
Expected Output
DataFrame with Python numeric data and datecol as index
output of
pd.show_versions()