Skip to content

Conversation

jbrockmendel
Copy link
Member

Largely split off from #52419

@jbrockmendel jbrockmendel requested a review from rhshadrach as a code owner June 26, 2023 22:25
@mroeschke mroeschke added Refactor Internal refactoring of code Internals Related to non-user accessible pandas implementation labels Jun 27, 2023
@mroeschke mroeschke added this to the 2.1 milestone Jun 27, 2023
@mroeschke mroeschke merged commit 5788685 into pandas-dev:main Jun 27, 2023
@mroeschke
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the ref-from_mgr-2 branch June 27, 2023 17:23
if self._constructor is Series:
# we are pandas.Series (or a subclass that doesn't override _constructor)
return self._from_mgr(mgr, axes=axes)
ser = self._from_mgr(mgr, axes=axes)
Copy link
Member

@jorisvandenbossche jorisvandenbossche Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel this changes the logic from what we discussed in #52132, and I am afraid this isn't exactly equivalent for subclasses implementing _constructor.

(before, we passed the Manager object to _constructor, while with this PR we pass a Series object)

Now, what was the rationale for changing this?
The PR title says "simplify _from_mgr constructors", but don't see how this is a simplification here? (it actually got longer, and added _name handling)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC i went to use the new constructor_from_mgr in more places and found that broke some tests, and the edits here were the best option I found to keep things working (in particular pinning ser._name)

I also very much like that in this version subclasses don't have to worry about handling Manager objects, just Series/DataFrame objects, which they should already be able to handle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why or when the pinning of ser._name is needed? (which case is it solving?)

In geopandas, it is a feature that we can do something different for blockmanager vs generic array-like (we use it to decide whether we want to try to infer objects being passed or not). It also adds some overhead to go through the constructor again with a dataframe/series.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why or when the pinning of ser._name is needed? (which case is it solving?)

My recollection is that without that, when I changed the usages to use constructor_from_mgr, a handful of tests broke with AttributeError bc _name was not defined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants