BUG: Check for duplicate names columns and index in crosstab #26717

Closed
wants to merge 4 commits into from

@cuchoi cuchoi commented Jun 7, 2019

Member

@WillAyd WillAyd left a comment

Can you add tests? Ideally that is the first part of any PR.

@WillAyd WillAyd added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Jun 8, 2019
@cuchoi
Author

cuchoi commented Jun 8, 2019

Sure! I wanted to confirm that this was the right approach to solving this issue before writing the tests (sorry, this is my first PR to pandas!).

@pep8speaks

pep8speaks commented Jun 9, 2019

Hello @cuchoi! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-06-09 22:09:06 UTC

@cuchoi
Author

cuchoi commented Jun 9, 2019

Added tests. I updated an existing test that would have failed if it had used different data. This would break before:

import pandas as pd
import pandas.testing as tm  # public module that provides assert_frame_equal

# Create two Series that share the name "foo", rename one of them in one
# call, and compare the two crosstabs. We expect them to be equal, but the
# comparison fails due to an issue when there are duplicate column names.
s1 = pd.Series(range(3), name="foo")
s2 = s1 + 1

expected = pd.crosstab(s1, s2.rename("bar"))
result = pd.crosstab(s1, s2)

tm.assert_frame_equal(result, expected)

So I have now added checks that raise an exception when there is:

  • A duplicated index
  • Duplicated columns
  • Name shared between the index and columns.
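
To make the three cases above concrete, here is a minimal sketch of what such a check could look like. It is not the code from this PR: `find_duplicates` and `validate_crosstab_names` are hypothetical helper names, the error messages are made up, and treating `None` as "unnamed" (and therefore never a duplicate) is an assumption.

```python
from collections import Counter


def find_duplicates(names):
    """Return the non-None names that occur more than once, in first-seen order."""
    counts = Counter(name for name in names if name is not None)
    return [name for name, count in counts.items() if count > 1]


def validate_crosstab_names(index_names, column_names):
    """Hypothetical helper: raise ValueError for the three cases listed above."""
    # Case 1: the same name is used for more than one index level.
    dup_index = find_duplicates(index_names)
    if dup_index:
        raise ValueError(f"Duplicated index names: {dup_index}")

    # Case 2: the same name is used for more than one column level.
    dup_columns = find_duplicates(column_names)
    if dup_columns:
        raise ValueError(f"Duplicated column names: {dup_columns}")

    # Case 3: a name appears on both the index side and the column side.
    shared = [
        name
        for name in index_names
        if name is not None and name in column_names
    ]
    if shared:
        raise ValueError(f"Names shared between index and columns: {shared}")


# Usage: two inputs that are both named "foo", as in the test above.
try:
    validate_crosstab_names(index_names=["foo"], column_names=["foo"])
except ValueError as err:
    print(err)
```

The names would presumably be collected from the `name` attribute of each input before any reshaping happens, so that the error points at the user's inputs rather than at an internal intermediate frame.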

@codecov

codecov bot commented Jun 9, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@9a67ff4).
The diff coverage is 0%.

@@            Coverage Diff            @@
##             master   #26717   +/-   ##
=========================================
  Coverage          ?   41.19%           
=========================================
  Files             ?      179           
  Lines             ?    50774           
  Branches          ?        0           
=========================================
  Hits              ?    20918           
  Misses            ?    29856           
  Partials          ?        0

Flag      Coverage Δ
#single   41.19% <0%> (?)

Impacted Files                 Coverage Δ
pandas/core/reshape/pivot.py   8.02% <0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9a67ff4...301eb87.

@codecov

codecov bot commented Jun 9, 2019

Codecov Report

Merging #26717 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #26717      +/-   ##
==========================================
- Coverage   91.71%    91.7%   -0.01%     
==========================================
  Files         178      178              
  Lines       50740    50747       +7     
==========================================
+ Hits        46538    46540       +2     
- Misses       4202     4207       +5

Flag        Coverage Δ
#multiple   90.3% <100%> (ø) ⬆️
#single     41.2% <0%> (-0.12%) ⬇️

Impacted Files                 Coverage Δ
pandas/core/reshape/pivot.py   96.6% <100%> (+0.07%) ⬆️
pandas/io/gbq.py               78.94% <0%> (-10.53%) ⬇️
pandas/core/frame.py           96.88% <0%> (-0.12%) ⬇️
pandas/util/testing.py         90.84% <0%> (-0.11%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0f3e8e8...b6d1e1e.

@jreback
Contributor

jreback commented Jul 11, 2019

Can you merge master and see if you can get this passing?

@jbrockmendel
Member

@cuchoi please rebase

@WillAyd
Member

WillAyd commented Aug 28, 2019

Looks like this has gone stale, but ping if you'd like to pick it back up.

@WillAyd WillAyd closed this Aug 28, 2019
@cuchoi
Author

cuchoi commented Sep 16, 2019

Hi! I rebased. Is it not showing up because the issue is closed? Can you reopen it?

@TomAugspurger
Contributor

Yes. GitHub struggles with reopening PRs to branches that have been closed. You're probably best off opening a new PR if you're wanting to work on this again!

@cuchoi
Author

cuchoi commented Sep 17, 2019

Thanks! New PR here: #28474

Labels
Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Development

Successfully merging this pull request may close these issues.

Crosstab Not Working with Duplicate Column Labels
6 participants