Skip to content

BUG: cells are missing in the excel file when exporting excel using xlsxwriter with option constant_memory set to True (#15392) #27624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from

Conversation

KenTsui
Copy link

@KenTsui KenTsui commented Jul 27, 2019

@KenTsui KenTsui changed the title BUG: cells are missing in the excel file when exporting excel using xlsxwriter with option constant_memory set to True (#15392) BUG: cells are missing in the excel file when exporting excel using xlsxwriter with option constant_memory set to True (#15392) Jul 27, 2019
@@ -2230,6 +2230,15 @@ def to_excel(
):
df = self if isinstance(self, ABCDataFrame) else self.to_frame()

from pandas.io.excel._xlsxwriter import _XlsxWriter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the appropriate place for this, it needs to be in _XlsxWriter

Copy link
Author

@KenTsui KenTsui Jul 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to disable merge_cell based on 1) if it is using _XlsxWriter and 2) if it has set constant_memory True. In _XlsxWriter, you cannot control merge_cell. I will try to move it this piece of logic to pandas\io\formats\excel.py if you agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think placing it in excel.py is a better place than generic.py for starters. We can then evaluate feasibility of moving _XlsxWriter if necessary, though your points are well-taken.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the testing if using _XlsxWriter and constant_memory==True is moved to excel.py.
And by referencing the solution of an issue of xlsxwriter, there is a perk in the updated code that we can keep merged cells even if the two aforementioned conditions are present.

@jreback jreback added the IO Excel read_excel, to_excel label Jul 27, 2019
@gfyoung gfyoung added Error Reporting Incorrect or improved errors from pandas and removed Error Reporting Incorrect or improved errors from pandas labels Jul 28, 2019
@pep8speaks
Copy link

pep8speaks commented Jul 28, 2019

Hello @KenTsui! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-07-28 19:03:37 UTC

@@ -106,7 +106,7 @@ MultiIndex
I/O
^^^

-
- Bug in :func:`DataFrame.to_excel()` where cells are missing in the excel file when exporting excel using ``xlsxwriter`` with option ``constant_memory`` set to ``True`` ((:issue:`15392`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the parens after to_excel.

Mismatched parens in the :issue:.

@WillAyd
Copy link
Member

WillAyd commented Aug 5, 2019

I think all of this logic should be contained within the write_cells method of pandas.io.excel._xlsxwriter.py - does moving it there make things easier?

@KenTsui
Copy link
Author

KenTsui commented Aug 7, 2019

I think all of this logic should be contained within the write_cells method of pandas.io.excel._xlsxwriter.py - does moving it there make things easier?

It seems not quite possible as the sequence of generating cells, which is the crux of this bug, is not controlled by write_cells.

BTW, I am stuck on how to debug using azure pipelines. Does anyone have some good resources to learn about that?

@WillAyd
Copy link
Member

WillAyd commented Sep 5, 2019

@KenTsui can you merge master? As far as log failures go if you click on next to failing logs you should be able to click through and find test failure output. Should match what you would see locally in a lot of cases

@WillAyd
Copy link
Member

WillAyd commented Sep 13, 2019

@KenTsui is this still active?

@WillAyd
Copy link
Member

WillAyd commented Oct 11, 2019

Closing as stale but ping if you'd like to pick back up

@WillAyd WillAyd closed this Oct 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Excel read_excel, to_excel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DataFrame.to_excel with xlsxwriter and constant_memory makes most of the cells empty
6 participants