BUG: clear cache on DataFrame._is_homogeneous_dtype #34937

Merged: 1 commit, Jun 24, 2020
3 changes: 2 additions & 1 deletion pandas/core/frame.py
@@ -611,7 +611,8 @@ def _is_homogeneous_type(self) -> bool:
         if self._mgr.any_extension_types:
             return len({block.dtype for block in self._mgr.blocks}) == 1
         else:
-            return not self._mgr.is_mixed_type
+            # Note: consolidates inplace
@jreback (Contributor), Jun 23, 2020:

I would rather just push this entire thing to the manager. The frame/series shouldn't really be in charge of this, as it's accessing internal APIs (or at least eventually they will be).

Reply (Member, Author):

Semi-agree: we shouldn't mix and match. But _item_cache pretty much has to be on DataFrame, since that is where the Series objects are created.
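
To illustrate the point in this thread, here is a minimal sketch of why the cache has to be cleared at the frame level. It is a toy model, not the pandas implementation: ToyManager, ToyFrame, set_column, and the dict-based block layout are all hypothetical names invented for this example. The assumption is that consolidation replaces the underlying storage, so any column object handed out before consolidation no longer matches what the frame holds.

```python
class ToyManager:
    """Hypothetical stand-in for a block manager; NOT the pandas class."""

    def __init__(self):
        # Each block is {"dtype": str, "columns": {name: list_of_values}}.
        self.blocks = []

    def add_column(self, name, values, dtype):
        # A newly assigned column starts out in its own block.
        self.blocks.append({"dtype": dtype, "columns": {name: list(values)}})

    def consolidate_inplace(self):
        # Merge blocks that share a dtype, copying values into new storage.
        merged = {}
        for block in self.blocks:
            target = merged.setdefault(
                block["dtype"], {"dtype": block["dtype"], "columns": {}}
            )
            for name, values in block["columns"].items():
                target["columns"][name] = list(values)
        self.blocks = list(merged.values())

    def get(self, name):
        for block in self.blocks:
            if name in block["columns"]:
                return block["columns"][name]
        raise KeyError(name)


class ToyFrame:
    """Hypothetical frame that owns the item cache (assumption for this sketch)."""

    def __init__(self):
        self._mgr = ToyManager()
        self._item_cache = {}

    def set_column(self, name, values, dtype):
        self._item_cache.pop(name, None)
        self._mgr.add_column(name, values, dtype)

    def __getitem__(self, name):
        # Column objects are created and cached here, on the frame.
        if name not in self._item_cache:
            self._item_cache[name] = self._mgr.get(name)
        return self._item_cache[name]

    def _consolidate_inplace(self):
        blocks_before = len(self._mgr.blocks)
        self._mgr.consolidate_inplace()
        # Only the frame knows about the cache, so only it can invalidate it.
        if len(self._mgr.blocks) != blocks_before:
            self._item_cache.clear()

    @property
    def _is_mixed_type(self):
        # Going through the frame-level helper keeps cache and blocks in sync.
        self._consolidate_inplace()
        return len(self._mgr.blocks) > 1

    @property
    def _is_homogeneous_type(self):
        return not self._is_mixed_type


df = ToyFrame()
df.set_column("A", [1, 2, 3], "int64")
df.set_column("B", [1, 2, 3], "int64")
a = df["B"]                      # caches the lookup
homogeneous = df._is_homogeneous_type  # consolidates and clears the cache
stale = df["B"] is a             # False: a fresh object is returned
```

In this sketch, calling the manager's consolidation directly from the property would leave `a` in the cache even though the storage behind it has been replaced, which is the kind of mismatch the change guards against.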

+            return not self._is_mixed_type

     @property
     def _can_fast_transpose(self) -> bool:
12 changes: 12 additions & 0 deletions pandas/tests/frame/test_dtypes.py
@@ -233,6 +233,18 @@ def test_constructor_list_str_na(self, string_dtype):
     def test_is_homogeneous_type(self, data, expected):
         assert data._is_homogeneous_type is expected

+    def test_is_homogeneous_type_clears_cache(self):
+        ser = pd.Series([1, 2, 3])
+        df = ser.to_frame("A")
+        df["B"] = ser
+
+        assert len(df._mgr.blocks) == 2
+
+        a = df["B"]  # caches lookup
+        df._is_homogeneous_type  # _should_ clear cache
+        assert len(df._mgr.blocks) == 1
+        assert df["B"] is not a
+
     def test_asarray_homogenous(self):
         df = pd.DataFrame({"A": pd.Categorical([1, 2]), "B": pd.Categorical([1, 2])})
         result = np.asarray(df)