Skip to content

BUG: pd.ExcelWriter(.., mode="a", if_sheet_exists="overlay") gives unexpected result #52189

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 of 3 tasks
abokey1 opened this issue Mar 24, 2023 · 2 comments · Fixed by #52222
Closed
2 of 3 tasks

BUG: pd.ExcelWriter(.., mode="a", if_sheet_exists="overlay") gives unexpected result #52189

abokey1 opened this issue Mar 24, 2023 · 2 comments · Fixed by #52222
Labels

Comments

@abokey1
Copy link
Contributor

abokey1 commented Mar 24, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df1 = pd.DataFrame({"col1": [1, 2, 3]})
df1.to_excel("wbook.xlsx", sheet_name = "Sheet1")

df2 = pd.DataFrame({"col2": [4, 5, 6]})

with pd.ExcelWriter("wbook.xlsx",
                    mode = "a",
                    engine = "openpyxl",
                    if_sheet_exists = "overlay") as writer:  
    df2.to_excel(writer, sheet_name = "Sheet1")

Issue Description

pd.ExcelWriter with mode = "a" and if_sheet_exists = "overlay" shows unexpected behaviour (at least for me).

From the documentation, we read :

if_sheet_exists{‘error’, ‘new’, ‘replace’, ‘overlay’}, default ‘error’

  • overlay: Write contents to the existing sheet without removing the old contents.

But when I run the code above, I got this :

image

I feel like this is not a bug tough but counterintuitive to be honest.

Expected Behavior

I would expected the second dataframe to be written right after the first one (like below) :

image

I know that I can pass startrow=len(df1)+1 to df2.to_excel() to get this kind of output but what if I had only one dataframe that needs to be appended to an existing spreadsheet (starting from the first non empty row) by using only pd.ExcelWriter ?

Installed Versions

INSTALLED VERSIONS

python : 3.11.0.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22000
machine : AMD64
pandas : 1.5.3

@abokey1 abokey1 added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 24, 2023
@rhshadrach
Copy link
Member

Thanks for the report. This is the intended behavior. Perhaps we could change the description to something like:

Write contents to the existing sheet without first removing, but possibly over top of, the existing contents.

PRs to clarify the description are welcome!

but what if I had only one dataframe that needs to be appended to an existing spreadsheet (starting from the first non empty row) by using only pd.ExcelWriter ?

You can read the sheet with pd.read_excel with header=None and use the length of the result to determine what line the contents stop on.

@rhshadrach rhshadrach added Docs IO Excel read_excel, to_excel good first issue and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 26, 2023
@abokey1
Copy link
Contributor Author

abokey1 commented Mar 26, 2023

@rhshadrach, thank you so much for the answer!

I created a PR, btw my first every PR, so go easy on me ;)
The commit to consider is 81de560.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants