drop_duplicates() is dropping more than just duplicates in 0.17.0 #11512

After upgrading from 0.16.2 to 0.17.0, I got a nasty surprise when dropping duplicates: `DataFrame.drop_duplicates()` no longer behaves the way it did in the previous version. I have a DataFrame `df` with a `test_id` column, and I run:

```
test_ids = df['test_id'].unique()
print('N test ids: {}'.format(test_ids.shape))
print('N tests: {}'.format(df[['test_id', <some other columns>]].drop_duplicates().shape))
```

In 0.17.0 the output is:

```
N test ids: (341334,)
N tests: (237426, 10)
```

When I run the same code in 0.16.2, the output is:

```
N test ids: (341334,)
N tests: (341334, 10)
```

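Since `test_id` is one of the selected columns, every distinct `test_id` value has to appear in at least one distinct row. The result of `drop_duplicates()` should therefore never have fewer rows than the number of unique entries in that single column, yet in 0.17.0 it has 237426 rows against 341334 unique ids.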
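
For anyone who wants to check their own data, here is a minimal sketch of that invariant on a toy frame. The data and the column names `name` and `score` are made up for illustration (they are not my real columns), and the set-based count is only an independent cross-check I am assuming is fair when the columns contain no NaN values.

```python
import pandas as pd

# Hypothetical toy data; only the `test_id` column name comes from the report.
df = pd.DataFrame({
    "test_id": [1, 1, 2, 3, 3, 3],
    "name": ["a", "a", "b", "c", "c", "d"],
    "score": [0.5, 0.5, 0.7, 0.9, 0.9, 0.9],
})
cols = ["test_id", "name", "score"]

# Number of unique values in the single column.
n_unique_ids = df["test_id"].nunique()

# Number of distinct rows according to drop_duplicates().
n_distinct_rows = len(df[cols].drop_duplicates())

# Independent cross-check in plain Python: count distinct row tuples.
# (Assumes no NaN values, since NaN != NaN would inflate the set.)
n_distinct_rows_check = len(set(df[cols].itertuples(index=False)))

print("N test ids: {}".format(n_unique_ids))
print("N tests: {}".format(n_distinct_rows))

# A subset that includes test_id can never have fewer distinct rows
# than there are distinct test_id values.
assert n_distinct_rows >= n_unique_ids
assert n_distinct_rows == n_distinct_rows_check
```

If either assertion fails on real data, that points at `drop_duplicates()` collapsing rows that are not actually identical.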

