DataFrame bug when setting column after filtering #2703


Closed
pikeas opened this issue Jan 16, 2013 · 5 comments

@pikeas
pikeas commented Jan 16, 2013

In [615]: df = pd.DataFrame([3,4,5],columns=['a'])
In [616]: df['b'] = 'nope'
In [617]: df[df['a'] == 3]['b'] = 'found'
In [618]: df
Out[618]:
   a     b
0  3  nope
1  4  nope
2  5  nope

# Expected
In [618]: df
Out[618]:
   a      b
0  3  found
1  4   nope
2  5   nope

I believe the intermediate filter step creates a new DataFrame, but as far as I know, the outer step (where the value is set) should modify in-place.

Pandas 0.10.

@wesm
Member

wesm commented Jan 16, 2013

You should do df.ix[df['a'] == 3, 'b'] = 'found'. What you have modifies a copy, not the original data structure.
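
For readers on current pandas: `.ix` was deprecated in 0.20 and removed in 1.0, and `.loc` accepts the same boolean-mask-plus-column-label form. A minimal sketch of the suggested fix under that assumption (the `result` variable is only there for illustration):

```python
import pandas as pd

df = pd.DataFrame([3, 4, 5], columns=["a"])
df["b"] = "nope"

# A single indexing call takes both the row mask and the column label,
# so the assignment writes into df itself, not into a temporary copy.
df.loc[df["a"] == 3, "b"] = "found"

result = df["b"].tolist()
print(df)
```

The point is that rows and column are selected in one `.loc[...]` call instead of two chained `[...]` calls.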

@wesm wesm closed this as completed Jan 16, 2013
@changhiskhan
Contributor

Since df[df['a'] == 3] creates a new DataFrame, doing df[df['a'] == 3]['b'] = 'found' shouldn't modify df in place, right?
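
The two steps hidden in the chained expression can be written out separately to make this visible. This is a sketch only; the names `subset`, `original_b`, and `subset_b` are introduced here for illustration, and the warning filter is there because some pandas versions warn on this kind of assignment:

```python
import warnings

import pandas as pd

df = pd.DataFrame([3, 4, 5], columns=["a"])
df["b"] = "nope"

# Step 1: filtering with a boolean mask returns a new DataFrame.
subset = df[df["a"] == 3]

# Step 2: the assignment targets that new object, not df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # keep the output quiet across versions
    subset["b"] = "found"

original_b = df["b"].tolist()   # values still held by df
subset_b = subset["b"].tolist() # values held by the filtered result
print(original_b, subset_b)
```

In the one-line form the filtered object has no name, so the modified result is discarded as soon as the statement finishes.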

@pikeas
Author

pikeas commented Jan 16, 2013

Chang - df[<stuff>]['b'] looks like it should modify in place, regardless of what is passed as stuff. I've been working with Pandas for a month and a half and still got bitten by this. Possibly worth a mention in the docs?

@changhiskhan
Contributor

@pikeas that's a good suggestion. We should add that to the list of "gotchas".

@pikeas
Author

pikeas commented Jan 17, 2013

@wesm Your suggestion of using df.ix[<filter>, <col>] = will not work if the filter matches multiple items and the assignment is a Series.

E.g.:

df = pd.DataFrame([3,4,5,3,6], columns=['a'])
df['b'] = 'nope'
df2 = df.copy()
# Works
df.ix[df['a'] == 3, 'b'] = ['f1', 'f2']
# Broken
df2.ix[df2['a'] == 3, 'b'] = pd.Series(['f1', 'f2'])

Out[927]:
   a     b
0  3    f1
1  4  nope
2  5  nope
3  3   NaN
4  6  nope
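
The NaN in row 3 looks consistent with index alignment: pd.Series(['f1', 'f2']) carries the labels 0 and 1, while the rows being assigned have the labels 0 and 3, so label 3 finds no matching value. Assuming that is the cause, two possible workarounds are sketched below using `.loc`, the current replacement for `.ix`; the names `mask` and `df3` are introduced here for illustration:

```python
import pandas as pd

df2 = pd.DataFrame([3, 4, 5, 3, 6], columns=["a"])
df2["b"] = "nope"
df3 = df2.copy()

mask = df2["a"] == 3

# Option 1: give the Series the labels of the rows being assigned,
# so alignment pairs 'f1' with row 0 and 'f2' with row 3.
df2.loc[mask, "b"] = pd.Series(["f1", "f2"], index=df2.index[mask])

# Option 2: strip the index, so the values are assigned by position,
# the same way the plain list in the working example is.
df3.loc[mask, "b"] = pd.Series(["f1", "f2"]).to_numpy()

print(df2)
print(df3)
```

Option 1 keeps the label-based behaviour explicit; Option 2 is shorter but relies on the values already being in row order.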
