BUG: iterparse of read_xml not parsing duplicate element and attribute names #47414
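To make the title concrete, here is a minimal sketch of how a row dictionary keyed only by name can lose a value when an element and an attribute share that name. It uses the standard library's `xml.etree.ElementTree.iterparse`, not pandas' parser classes. The sample document and the helper names `naive_rows` and `collected_rows` are assumptions made for illustration and do not come from the PR.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: "name" appears both as an attribute of <shape>
# and as a child element of <shape>.
XML = b"""<shapes>
  <shape name="outer"><name>square</name><sides>4</sides></shape>
  <shape name="inner"><name>circle</name><sides/></shape>
</shapes>"""


def naive_rows(source, row_tag, cols):
    """One value per name: a later match silently overwrites an earlier one."""
    rows, row = [], None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if elem.tag == row_tag:
                row = {}
            continue
        if row is None:
            continue
        for col in cols:
            if elem.tag == col:
                row[col] = elem.text.strip() if elem.text else None
            if col in elem.attrib:
                row[col] = elem.attrib[col]
        if elem.tag == row_tag:
            rows.append(row)
            row = None
            elem.clear()  # release the finished row's subtree
    return rows


def collected_rows(source, row_tag, cols):
    """Keep every match per name in a list, so duplicates survive."""
    rows, row = [], None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if elem.tag == row_tag:
                row = {}
            continue
        if row is None:
            continue
        if elem.tag in cols:
            text = elem.text.strip() if elem.text else None
            row.setdefault(elem.tag, []).append(text)
        for attr, value in elem.attrib.items():
            if attr in cols:
                row.setdefault(attr, []).append(value)
        if elem.tag == row_tag:
            rows.append(row)
            row = None
            elem.clear()
    return rows


naive = naive_rows(io.BytesIO(XML), "shape", ["name", "sides"])
kept = collected_rows(io.BytesIO(XML), "shape", ["name", "sides"])
```

In the naive version the `name` attribute of `<shape>` is read when the row element closes, so it replaces the text already stored from the `<name>` child. Collecting values in a list is only one possible way to keep both. The fix actually merged in this PR may work differently.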
row[col] = elem.text.strip() if elem.text else None
if col in elem.attrib:
    row[col] = elem.attrib[col]
if self.names:
Could you share this?
Good point. I refactored so that this code lives once in the base class of the parsers.
This is a bit confusing now. Could you split the refactoring and the bug fix into two PRs? It is hard to review like this.
Will revert to the bug fix only and, in a subsequent PR, refactor to remove the repetitive code between parsers.
Thanks.
I was not sure if you could share more initially, but if you can share most of the code, then we can separate these.
Thanks @ParfaitG
…e names (pandas-dev#47414)
* BUG: iterparse of read_xml not parsing duplicate element and attribute names
* Refactor duplicative code in each parser to shared base class
* Add lxml preceding-sibling iterparse cleanup
* Revert code refactoring back to bug fix only
* Remove whatsnew bug fix note on unreleased version feature
doc/source/whatsnew/v1.5.0.rst
file if fixing a bug or adding a new feature.