
Conversation

behzadnouri
Contributor

closes #8850

on master:

>>> from pandas import DataFrame, MultiIndex
>>> cols = MultiIndex.from_tuples([('1st', 'a'), ('2nd', 'b'), ('3rd', 'c')])
>>> df = DataFrame([[1.0, 2, 3], [4.0, 5, 6]], columns=cols)
>>> df['2nd'] = df['2nd'] * 2.0  # type change in block manager
/usr/lib/python3.4/site-packages/numpy/lib/function_base.py:3612: FutureWarning: in the future negative indices will not be ignored by `numpy.delete`.
  "`numpy.delete`.", FutureWarning)
>>> df.values
...
  File "/usr/lib/python3.4/site-packages/pandas-0.15.1_72_gf504885-py3.4-linux-x86_64.egg/pandas/core/internals.py", line 2392, in _verify_integrity
    tot_items))
AssertionError: Number of manager items must equal union of block items
# manager items: 3, # tot_items: 4
>>> df.blocks
...
  File "/usr/lib/python3.4/site-packages/pandas-0.15.1_72_gf504885-py3.4-linux-x86_64.egg/pandas/core/internals.py", line 2392, in _verify_integrity
    tot_items))
AssertionError: Number of manager items must equal union of block items
# manager items: 3, # tot_items: 4

._data is also broken:

>>> df._data
BlockManager
Items:       
1st  a
2nd  b
3rd  c
Axis 1: Int64Index([0, 1], dtype='int64')
FloatBlock: slice(0, 1, 1), 1 x 2, dtype: float64
IntBlock: slice(1, 3, 1), 2 x 2, dtype: int64
FloatBlock: slice(1, 2, 1), 1 x 2, dtype: float64

The integer block is bigger than it should be and overlaps with one of the float blocks.
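To make the overlap concrete, here is a minimal sketch that counts how many column positions the three block slices claim. The slices are copied from the repr above; the checking logic and the names `block_slices`, `n_items`, and `tot_items` are illustrative assumptions, not the actual `_verify_integrity` code.

```python
from collections import Counter

# Slices copied from the BlockManager repr above.
# The dict keys are labels chosen for this sketch only.
block_slices = {
    "FloatBlock (first)": slice(0, 1, 1),
    "IntBlock": slice(1, 3, 1),
    "FloatBlock (second)": slice(1, 2, 1),
}
n_items = 3  # number of manager items (columns) in the example frame

# Count how many blocks claim each column position.
claims = Counter()
for blk_slice in block_slices.values():
    claims.update(range(*blk_slice.indices(n_items)))

tot_items = sum(claims.values())
overlapping = sorted(pos for pos, count in claims.items() if count > 1)

print(tot_items)    # total positions claimed across all blocks
print(overlapping)  # positions claimed by more than one block
```

Under these assumptions the blocks claim 4 positions for 3 items, which matches the assertion message, and position 1 is the one claimed twice.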

@jreback
Contributor

jreback commented Nov 19, 2014

cc @immerrr

@jreback jreback added the Bug and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Nov 19, 2014
@jreback jreback added this to the 0.15.2 milestone Nov 19, 2014
@immerrr
Contributor

immerrr commented Nov 20, 2014

Yeah, I guess I didn't expect loc to be a slice in BlockManager.set.

Such corruption would also be a lot easier to prevent if blkloc invalidation (self._blklocs[blk.mgr_locs.indexer] = -1) used np.iinfo(self._blklocs.dtype).min instead of -1, which is a valid indexer most of the time.
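A rough sketch of why the sentinel choice matters, using plain NumPy arrays rather than pandas internals. The arrays `blklocs` and `values` are hypothetical stand-ins invented for this example. The integer dtype minimum comes from `np.iinfo(dtype).min`.

```python
import numpy as np

# Hypothetical stand-ins, not the real BlockManager attributes.
blklocs = np.array([0, 1, 2, 0], dtype=np.int64)
values = np.array([10.0, 20.0, 30.0])

# Invalidating with -1: a stale entry still indexes successfully,
# because -1 means "last element" in NumPy indexing.
stale_minus_one = blklocs.copy()
stale_minus_one[1] = -1
silently_wrong = values[stale_minus_one]  # no error is raised

# Invalidating with the dtype minimum: a stale entry is far out of
# bounds, so any later use of it fails immediately.
sentinel = np.iinfo(blklocs.dtype).min
stale_sentinel = blklocs.copy()
stale_sentinel[1] = sentinel
try:
    values[stale_sentinel]
    raised = False
except IndexError:
    raised = True

print(silently_wrong, raised)
```

With -1 the stale location quietly resolves to the last element, whereas the dtype minimum turns the same mistake into an IndexError at the point of use.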

jreback added a commit that referenced this pull request Nov 20, 2014
BUG: type change breaks BlockManager integrity
@jreback jreback merged commit 5470f5c into pandas-dev:master Nov 20, 2014
@jreback
Contributor

jreback commented Nov 20, 2014

@behzadnouri thanks

@immerrr if you would like to post an issue about this invalidation, go ahead

@behzadnouri behzadnouri deleted the blk-mgr branch November 21, 2014 01:00
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Development

Successfully merging this pull request may close these issues.

BUG: Accessing DataFrame multi-index column *seems* to modify its content
3 participants