BUG: Styler.apply consistently manages Series return objects aligning labels. #42014

Merged

Conversation

@attack68 attack68 commented Jun 15, 2021

This issue is due to the asymmetric application of DataFrame.apply (#42005).
The bug is that Styler.apply handles Series return objects inconsistently: their index labels are respected for axis=0 but overwritten for axis=1.

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=["X", "Y"], columns=["X", "Y"])
s = df.style

def series_ret(s):
    return pd.Series(["color:red;", "color:blue;"], index=["Y", "X"])  # note X-Y order

| Command | Current Behaviour | New Behaviour |
| --- | --- | --- |
| `s.apply(series_ret, axis=0)` | | |
| `s.apply(series_ret, axis=1)` | | |
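
As a rough illustration of the difference between aligning and overwriting labels, here is a minimal sketch using plain pandas only. It does not reproduce Styler internals; `target_labels`, `aligned` and `overwritten` are names invented for this example.

```python
import pandas as pd

# Labels of the row or column being styled (illustrative).
target_labels = ["X", "Y"]

# What the user function returns: the same labels in a different order.
returned = pd.Series(["color:red;", "color:blue;"], index=["Y", "X"])

# Aligning: each style follows its own label.
aligned = returned.reindex(target_labels)

# Overwriting: labels are replaced positionally, so styles land on the wrong cells.
overwritten = returned.copy()
overwritten.index = target_labels

print(aligned.to_dict())      # {'X': 'color:blue;', 'Y': 'color:red;'}
print(overwritten.to_dict())  # {'X': 'color:red;', 'Y': 'color:blue;'}
```

The overwritten result is the kind of mismatch described above for axis=1, where styles end up attached to the wrong labels.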

In addition, Styler.apply now accepts return objects that are not the same size as the input, provided they contain only relevant index labels: a Series for axis=0 or axis=1, or a DataFrame for axis=None. This allows for more flexible user-defined functions. It may also improve performance, since there is no need to loop over labels that no longer exist.
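
For the axis=None case, a smaller DataFrame can be expanded to the full shape by label. The following is only a sketch with plain pandas, not the code in this PR; `frame_ret`, `partial` and `full` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=["X", "Y"], columns=["X", "Y"])

# Hypothetical user function for axis=None: styles only row "Y", column "X".
def frame_ret(data):
    return pd.DataFrame([["color:red;"]], index=["Y"], columns=["X"])

partial = frame_ret(df)

# Expand to the full shape by label; cells without a style become NaN.
full = partial.reindex(index=df.index, columns=df.columns)

print(full.loc["Y", "X"])          # color:red;
print(int(full.isna().sum().sum()))  # 3
```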

def series_ret(s):
    return pd.Series(["color:red;"], index=["Y"])  # note only returning 1 element

def array_ret(s):
    return ["color:red;"]

def ser_no_idx_ret(s):
    return pd.Series(["color:red;", "color:blue;"]) # note returning a Series of right shape with no index

| Command | Current Behaviour | New Behaviour |
| --- | --- | --- |
| `s.apply(series_ret, axis=0)` | ValueError: Function returned the wrong shape. | |
| `s.apply(series_ret, axis=1)` | ValueError: Length mismatch: Expected axis has 1 elements, new values have 2 elements | |
| `s.apply(array_ret, axis=0)` | ValueError: Function returned the wrong shape. | ValueError: Function created invalid index labels. Usually, this is the result of the function returning a Series which contains invalid labels, or returning an incorrectly shaped, list-like object which cannot be mapped to labels, possibly due to applying the function along the wrong axis. |
| `s.apply(array_ret, axis=1)` | ValueError: Length mismatch: Expected axis has 1 elements, new values have 2 elements | ValueError: Function created invalid index labels. Usually, this is the result of the function returning a Series which contains invalid labels, or returning an incorrectly shaped, list-like object which cannot be mapped to labels, possibly due to applying the function along the wrong axis. |
| `s.apply(ser_no_idx_ret, axis=0)` | KeyError: 0 | ValueError: Function created invalid index labels. Usually, this is the result of the function returning a Series which contains invalid labels, or returning an incorrectly shaped, list-like object which cannot be mapped to labels, possibly due to applying the function along the wrong axis. |
| `s.apply(ser_no_idx_ret, axis=1)` | | ValueError: Function created invalid index labels. Usually, this is the result of the function returning a Series which contains invalid labels, or returning an incorrectly shaped, list-like object which cannot be mapped to labels, possibly due to applying the function along the wrong axis. |

Note: all errors do report the shapes; I just didn't include them in the table for brevity.
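
The accept-or-raise behaviour in the table can be approximated with a label check followed by a reindex. This is a sketch under my own assumptions, not the implementation in this PR; `check_labels` is a hypothetical helper and its error message is shortened.

```python
import pandas as pd

def check_labels(returned: pd.Series, target: pd.Index) -> pd.Series:
    """Hypothetical helper: accept a Series whose labels all exist in target."""
    if not returned.index.isin(target).all():
        raise ValueError("Function created invalid index labels.")
    # Labels missing from the returned Series become NaN (no style).
    return returned.reindex(target)

target = pd.Index(["X", "Y"])

# A shorter Series with a valid label is accepted; "X" is left unstyled.
ok = check_labels(pd.Series(["color:red;"], index=["Y"]), target)

# A Series without an explicit index gets labels 0 and 1, which are not in target.
try:
    check_labels(pd.Series(["color:red;", "color:blue;"]), target)
    raised = False
except ValueError:
    raised = True

print(ok["Y"], pd.isna(ok["X"]), raised)  # color:red; True True
```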

Edge Cases

The edge case is a DataFrame with a default RangeIndex (or one whose labels coincide with default ones, e.g. 0 or 1): returning an array of the wrong size may still work, since DataFrame.apply assigns the returned values to a RangeIndex.

To be consistent with the above, this should raise, but it is impossible to hook into the return value of the user-defined function and determine whether it is a Series (where non-matching shapes are OK) or a list-like (where matching shapes should be enforced).

df = pd.DataFrame([[1,2],[3,4]])
s = df.style

| Command | Current Behaviour | New Behaviour |
| --- | --- | --- |
| `s.apply(array_ret, axis=0)` | ValueError: Function returned the wrong shape. | |
| `s.apply(array_ret, axis=1)` | ValueError: Length mismatch: Expected axis has 1 elements, new values have 2 elements | |
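
To see why a label-only check cannot catch this case, consider the sketch below. It assumes the returned list has been given default integer labels before validation, which mirrors the description above but is not taken from the PR's code; `labels_valid` is a hypothetical helper.

```python
import pandas as pd

def labels_valid(returned, target) -> bool:
    """Hypothetical check: are all labels of the returned object present in target?"""
    # Wrapping a list in a Series assigns default integer labels 0, 1, ...
    return bool(pd.Series(returned).index.isin(target).all())

short_list = ["color:red;"]  # what array_ret returns: one element for two cells

# With string labels, the default label 0 is not found, so the check fails.
print(labels_valid(short_list, pd.Index(["X", "Y"])))  # False

# With a default RangeIndex, label 0 exists, so the wrong-sized list passes.
print(labels_valid(short_list, pd.RangeIndex(2)))      # True
```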

ASV

Benchmarks unchanged.

@attack68 attack68 marked this pull request as ready for review June 15, 2021 11:29
@attack68 attack68 changed the title from "[WIP] BUG: Styler.apply consistently manages Series return objects aligning labels." to "BUG: Styler.apply consistently manages Series return objects aligning labels." Jun 15, 2021
@attack68 attack68 added the Bug, Styler (conditional formatting using DataFrame.style), and Apply (Apply, Aggregate, Transform, Map) labels Jun 29, 2021
@github-actions

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jul 30, 2021
@attack68 attack68 removed the Stale label Jul 30, 2021

@jreback jreback left a comment

lgtm small comments


.. versionchanged:: 1.3.0

.. versionchanged:: 1.4.0

can you update the examples to show a Series return matching labels/index (e.g. not the same length)

@attack68 (author) replied:

added

@jreback jreback added this to the 1.4 milestone Aug 12, 2021
…_apply_funcs_index_columns

# Conflicts:
#	doc/source/whatsnew/v1.4.0.rst
@jreback jreback merged commit 2617bfc into pandas-dev:master Aug 19, 2021

jreback commented Aug 19, 2021

thanks @attack68

@attack68 attack68 deleted the styler_consistent_apply_funcs_index_columns branch August 20, 2021 05:31
feefladder pushed a commit to feefladder/pandas that referenced this pull request Sep 7, 2021
Labels
Apply (Apply, Aggregate, Transform, Map), Bug, Styler (conditional formatting using DataFrame.style)
Development

Successfully merging this pull request may close these issues.

Row based style.apply doesn't modify background colors
2 participants