-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
Closed
Closed
Copy link
Labels
Milestone
Description
When concatting two dataframes where there are a) there are duplicate columns in one of the dataframes, and b) there are non-overlapping column names in both, then you get a IndexError:
In [9]: df1 = pd.DataFrame(np.random.randn(3,3), columns=['A', 'A', 'B1'])
...: df2 = pd.DataFrame(np.random.randn(3,3), columns=['A', 'A', 'B2'])
In [10]: pd.concat([df1, df2])
Traceback (most recent call last):
File "<ipython-input-10-f61a1ab4009e>", line 1, in <module>
pd.concat([df1, df2])
...
File "c:\users\vdbosscj\scipy\pandas-joris\pandas\core\index.py", line 765, in take
taken = self.view(np.ndarray).take(indexer)
IndexError: index 3 is out of bounds for axis 0 with size 3
I don't know if it should work (although I suppose it should, as with only the duplicate columns it does work), but at least the error message is not really helpfull.