Updated read_excel docstring to include parse_dates and date_parser #11527

Closed
wants to merge 1 commit into from

litchfield

No description provided.

@chris-b1
Contributor

chris-b1 commented Nov 6, 2015

I'm not sure those keywords actually do anything with the excel parser?

In [12]: dti = pd.date_range('2014-1-1', periods=10)

In [13]: df = pd.DataFrame({'dates':dti, 'strings':dti.strftime('%m/%d/%Y')})

In [14]: df.dtypes
Out[14]: 
dates      datetime64[ns]
strings            object
dtype: object

In [15]: df.to_excel('test.xlsx')

In [16]: pd.read_excel('test.xlsx').dtypes
Out[16]: 
dates      datetime64[ns]
strings            object
dtype: object

In [17]: pd.read_excel('test.xlsx', parse_dates=True).dtypes
Out[17]: 
dates      datetime64[ns]
strings            object
dtype: object

In [18]: pd.read_excel('test.xlsx', parse_dates=False).dtypes
Out[18]: 
dates      datetime64[ns]
strings            object
dtype: object

@jreback
Contributor

jreback commented Nov 7, 2015

Yeah, I don't think these are valid keywords. What we really need here is a check for non-implemented keywords. Closing this; I will create another issue.
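
A check of the kind suggested here could look roughly like the following. This is only a sketch under stated assumptions: `check_keywords`, `SUPPORTED_KEYWORDS`, and the keyword names in it are hypothetical and are not taken from pandas; it does not describe how the Excel reader validates its arguments.

```python
# Hypothetical set of keywords that a reader honours; illustrative only,
# not the actual list accepted by the pandas Excel parser.
SUPPORTED_KEYWORDS = {"header", "skiprows", "index_col"}


def check_keywords(supported, **kwds):
    """Raise NotImplementedError if any passed keyword is not supported."""
    unknown = sorted(set(kwds) - set(supported))
    if unknown:
        raise NotImplementedError(
            "keywords not implemented for this reader: " + ", ".join(unknown)
        )
    return kwds


# Usage: a supported keyword passes through, an unknown one fails loudly
# instead of being silently ignored.
check_keywords(SUPPORTED_KEYWORDS, header=0)
try:
    check_keywords(SUPPORTED_KEYWORDS, header=0, not_a_real_option=True)
except NotImplementedError as exc:
    print(exc)
```

Failing loudly makes it obvious when an argument has no effect, which is the confusion this thread started from.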

@jorisvandenbossche
Member

@chris-b1 It actually raises an error rather than just being ignored. With your example (parse_dates=True applies to the index, so to check whether the string column can be parsed you have to pass its name):

In [37]: pd.read_excel('test.xlsx', parse_dates=['strings'])

....

C:\Anaconda\lib\site-packages\pandas\io\parsers.pyc in _should_parse_dates(self, i)
    812             return self.parse_dates
    813         else:
--> 814             name = self.index_names[i]
    815             j = self.index_col[i]
    816

TypeError: 'NoneType' object has no attribute '__getitem__'

But if you set the strings column as the index, parse_dates=True is indeed ignored.

@jorisvandenbossche
Member

@chris-b1 This error actually only happens if you have an implicit index due to the structure of the excel file. If you don't have this, parse_dates works as expected:

In [46]: df.to_excel('test.xlsx', index=False)

In [47]: pd.read_excel('test.xlsx').dtypes
Out[47]:
dates      datetime64[ns]
strings            object
dtype: object

In [48]: pd.read_excel('test.xlsx', parse_dates=['strings']).dtypes
Out[48]:
dates      datetime64[ns]
strings    datetime64[ns]
dtype: object
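
For files that do carry an implicit index, one way to sidestep the error is to read the file without parse_dates and convert the column afterwards. The sketch below is an assumption-laden illustration rather than a fix for the parser: the frame is built in memory as a stand-in for what pd.read_excel('test.xlsx') returned above, and the explicit format string is taken from the strftime call in the earlier example.

```python
import pandas as pd

# Stand-in for the frame returned by pd.read_excel('test.xlsx') above:
# one real datetime column and one column of date-like strings.
dti = pd.date_range('2014-1-1', periods=10)
df = pd.DataFrame({'dates': dti, 'strings': dti.strftime('%m/%d/%Y')})

# Convert the string column explicitly instead of relying on parse_dates.
# Passing the format avoids month-first vs day-first ambiguity.
df['strings'] = pd.to_datetime(df['strings'], format='%m/%d/%Y')

print(df.dtypes)
```

After the conversion both columns hold datetimes, which matches the result that parse_dates=['strings'] gave for the file written with index=False.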

Labels
IO Excel read_excel, to_excel