DOC: update the to_excel docstring #20185

Merged
merged 8 commits into from
Jul 24, 2018
Changes from 4 commits
79 changes: 52 additions & 27 deletions pandas/core/generic.py
@@ -1591,62 +1591,87 @@ def _repr_latex_(self):
# I/O Methods

_shared_docs['to_excel'] = """
Write %(klass)s to an excel sheet
%(versionadded_to_excel)s
Write %(klass)s to an excel sheet.

To write a %(klass)s to an excel .xlsx file it is necessary to first create
an ExcelWriter object with a target file name, and specify a sheet in the
Contributor

Since ExcelWriter is a class, I believe it should be between backticks so that it renders in italics, like this:

`ExcelWriter`

Contributor Author

ok

file to write to. Multiple sheets may be written to by
specifying unique sheet_name. With all data written to the file it is
Contributor

`sheet_name` should be between backticks since it's a variable name; see http://numpydoc.readthedocs.io/en/latest/format.html#sections

Contributor Author

ok

necessary to save the changes. Note that creating an ExcelWriter object
with a file name that already exists will result in the contents of the
existing file being erased.

Parameters
----------
excel_writer : string or ExcelWriter object
File path or existing ExcelWriter
File path or existing ExcelWriter.
Contributor

Can we link to the class? I believe

:class:`pandas.ExcelWriter`

should work.

Contributor

Also, in the extended description we say that an ExcelWriter object is necessary to be able to use .to_excel, but in the parameter description we say that we also accept a "file path". So the ExcelWriter is not really needed? What happens if we specify a file path? I guess that a new ExcelWriter gets automatically created with default options.

Contributor Author

Yes, this is allowed: df1.to_excel("example.xlsx", sheet_name='Sheet3')

Member

Yes, the ExcelWriter is not required

sheet_name : string, default 'Sheet1'
Name of sheet which will contain DataFrame
Name of sheet which will contain DataFrame.
na_rep : string, default ''
Missing data representation
Missing data representation.
float_format : string, default None
Format string for floating point numbers
columns : sequence, optional
Columns to write
Format string for floating point numbers.
Contributor

Can you mention some examples and/or link to where the format of this parameter is documented?

Contributor Author

Formatting takes place here using the string modulo operator which is described here. I've added an example.
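
For readers following along, here is a minimal sketch of what the string modulo operator does with a `float_format` value. The helper `format_float` is a hypothetical name used only for illustration; it is not part of pandas, and how pandas stores the formatted result in the cell is not shown here.

```python
def format_float(value, float_format):
    """Apply a printf-style format string such as '%.2f' to one float."""
    # The string modulo operator substitutes `value` into the format string.
    return float_format % value


print(format_float(0.1234, '%.2f'))    # 0.12
print(format_float(0.000123, '%.2e'))  # 1.23e-04
```

Any conversion accepted by Python's printf-style formatting should work the same way in this sketch.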

columns : sequence or list of string, optional
Columns to write.
header : boolean or list of string, default True
Write out the column names. If a list of strings is given it is
assumed to be aliases for the column names
assumed to be aliases for the column names.
index : boolean, default True
Write row names (index)
Write row names (index).
index_label : string or sequence, default None
Column label for index column(s) if desired. If None is given, and
`header` and `index` are True, then the index names are used. A
sequence should be given if the DataFrame uses MultiIndex.
startrow :
upper left cell row to dump data frame
startcol :
upper left cell column to dump data frame
startrow : integer, default 0
Upper left cell row to dump data frame.
startcol : integer, default 0
Upper left cell column to dump data frame.
engine : string, default None
write engine to use - you can also set this via the options
``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and
Write engine to use, 'openpyxl' or 'xlsxwriter'. You can also set this
via the options ``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and
``io.excel.xlsm.writer``.
merge_cells : boolean, default True
Write MultiIndex and Hierarchical Rows as merged cells.
encoding: string, default None
encoding of the resulting excel file. Only necessary for xlwt,
encoding : string, default None
Encoding of the resulting excel file. Only necessary for xlwt,
other writers support unicode natively.
inf_rep : string, default 'inf'
Representation for infinity (there is no native representation for
infinity in Excel)
infinity in Excel).
verbose : boolean, default True
Display more information in the error logs.
freeze_panes : tuple of integer (length 2), default None
Specifies the one-based bottommost row and rightmost column that
is to be frozen
is to be frozen.

.. versionadded:: 0.20.0
.. versionadded:: 0.20.0.

See Also
--------
pandas.read_excel

Member

Can you add pandas.ExcelWriter?

Contributor Author

OK

Examples
--------

>>> df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
... index=['row 1', 'row 2'],
... columns=['col 1', 'col 2'])
>>> writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
>>> df1.to_excel(writer, sheet_name='Sheet1')
>>> writer.save()
Contributor

I've observed that writer.save() is only needed if we are creating an ExcelWriter explicitly. If one specifies a file path, like df.to_excel('my_file.xlsx'), it is not needed.

Perhaps it makes sense to show first an example with a file path, which is simpler, and then an example with an ExcelWriter where we can specify more custom options?

Contributor Author

Added
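
The ordering suggested in this thread (file path first, explicit ExcelWriter second) could look roughly like the sketch below. It assumes an Excel engine such as openpyxl or xlsxwriter is installed. The function name `write_examples` and the file names are hypothetical. The explicit writer is used as a context manager here instead of calling `writer.save()`, because newer pandas releases no longer provide `save()`.

```python
import os

import pandas as pd


def write_examples(directory):
    """Write one workbook from a plain path and one through an ExcelWriter."""
    df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
                       index=['row 1', 'row 2'],
                       columns=['col 1', 'col 2'])
    df2 = df1.copy()

    # Simplest form: pass a file path; no explicit writer or save call.
    simple_path = os.path.join(directory, 'simple.xlsx')
    df1.to_excel(simple_path, sheet_name='Sheet1')

    # Explicit writer: useful when several sheets go into one workbook.
    # Leaving the with-block saves and closes the file.
    multi_path = os.path.join(directory, 'multi.xlsx')
    with pd.ExcelWriter(multi_path) as writer:
        df1.to_excel(writer, sheet_name='Sheet1')
        df2.to_excel(writer, sheet_name='Sheet2')

    return simple_path, multi_path
```

Calling `write_examples` with a scratch directory returns the two paths, so the resulting files can be opened and inspected by hand.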


Notes
-----
If passing an existing ExcelWriter object, then the sheet will be added
to the existing workbook. This can be used to save different
DataFrames to one workbook:

>>> writer = pd.ExcelWriter('output.xlsx')
>>> df1.to_excel(writer,'Sheet1')
>>> df2.to_excel(writer,'Sheet2')
>>> writer.save()
>>> writer2 = pd.ExcelWriter('output2.xlsx', engine='xlsxwriter')
>>> df1.to_excel(writer2, sheet_name='Sheet1')
>>> df2 = df1.copy()
>>> df2.to_excel(writer2, sheet_name='Sheet2')
>>> writer2.save()

Limit floats to a fixed precision using float_format. For example
float_format="%.2f" will format 0.1234 to 0.12.
Member

I think this line is causing the problem you are seeing: because of the %, the templating / shared_docs system thinks it has to fill in something here, which it can't.
You can fix it by writing "%%.2f" (in the rendered docstring it will appear as a single %).

Contributor Author

OK
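
The escaping issue can be reproduced outside pandas. The sketch below assumes the shared docstring is filled in with plain %-substitution from a dict, which is what the `%(klass)s` placeholders suggest; the `render` helper and the shortened template strings are hypothetical.

```python
def render(template, klass):
    """Fill a docstring template using %-substitution with a mapping."""
    return template % {'klass': klass}


escaped = 'Write %(klass)s. Use float_format="%%.2f" to get 0.12.'
unescaped = 'Write %(klass)s. Use float_format="%.2f" to get 0.12.'

# The doubled %% collapses to a single % in the rendered text.
print(render(escaped, 'DataFrame'))

# The bare %.2f is treated as a placeholder that cannot be filled.
try:
    render(unescaped, 'DataFrame')
except TypeError as exc:
    print('unescaped template failed:', exc)
```

Under that assumption, only the escaped template renders; the unescaped one raises before any docstring is produced.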


For compatibility with to_csv, to_excel serializes lists and dicts to
Contributor

Can we add a link to the method pandas.DataFrame.to_csv here as well?

Contributor Author

ok

strings before writing.