REF: deduplicate group min/max #40584

mzeitlin11 · 2021-03-23T14:48:05Z

ASVs look unaffected:

       before           after         ratio
     [05a0e98a]       [16d06e22]
     <master>         <ref/deduplicate_grp_min_max>
         33.0±3ms         33.3±1ms     1.01  gil.ParallelGroupbyMethods.time_loop(2, 'max')
       32.2±0.4ms         34.5±3ms     1.07  gil.ParallelGroupbyMethods.time_loop(2, 'min')
         64.1±1ms       65.6±0.5ms     1.02  gil.ParallelGroupbyMethods.time_loop(4, 'max')
         65.3±3ms         65.5±1ms     1.00  gil.ParallelGroupbyMethods.time_loop(4, 'min')
         20.2±2ms       20.2±0.8ms     1.00  gil.ParallelGroupbyMethods.time_parallel(2, 'max')
         21.3±1ms       20.1±0.8ms     0.95  gil.ParallelGroupbyMethods.time_parallel(2, 'min')
       25.1±0.7ms       25.3±0.5ms     1.01  gil.ParallelGroupbyMethods.time_parallel(4, 'max')
       25.0±0.4ms         26.5±2ms     1.06  gil.ParallelGroupbyMethods.time_parallel(4, 'min')
         125±50ms          104±1ms    ~0.83  groupby.GroupByCythonAgg.time_frame_agg('float64', 'max')
          105±2ms          107±2ms     1.02  groupby.GroupByCythonAgg.time_frame_agg('float64', 'min')
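
For anyone reading the table: the ratio column is after/before, so values near 1.00 mean no change. A rough sketch that recomputes it from the rounded numbers shown above (asv works from unrounded timings, so the last digit can differ). The `factor=1.1` threshold is an assumption based on what I believe is the asv default, and `ROWS`, `recomputed_ratio` and `outside_factor` are names made up for this illustration:

```python
# (benchmark, before_ms, after_ms, reported_ratio), copied from the table above.
ROWS = [
    ("time_loop(2, 'max')", 33.0, 33.3, 1.01),
    ("time_loop(2, 'min')", 32.2, 34.5, 1.07),
    ("time_loop(4, 'max')", 64.1, 65.6, 1.02),
    ("time_loop(4, 'min')", 65.3, 65.5, 1.00),
    ("time_parallel(2, 'max')", 20.2, 20.2, 1.00),
    ("time_parallel(2, 'min')", 21.3, 20.1, 0.95),
    ("time_parallel(4, 'max')", 25.1, 25.3, 1.01),
    ("time_parallel(4, 'min')", 25.0, 26.5, 1.06),
    ("time_frame_agg('float64', 'max')", 125.0, 104.0, 0.83),
    ("time_frame_agg('float64', 'min')", 105.0, 107.0, 1.02),
]


def recomputed_ratio(before_ms, after_ms):
    """Ratio as shown in the last numeric column: after divided by before."""
    return after_ms / before_ms


def outside_factor(rows, factor=1.1):
    """Benchmarks whose ratio falls outside [1/factor, factor].

    factor=1.1 is an assumed threshold, not something stated in this thread.
    """
    return [
        name
        for name, before_ms, after_ms, _ in rows
        if not (1 / factor <= recomputed_ratio(before_ms, after_ms) <= factor)
    ]


if __name__ == "__main__":
    for name, before_ms, after_ms, reported in ROWS:
        print(f"{name:35s} {recomputed_ratio(before_ms, after_ms):.2f} (reported {reported:.2f})")
    print("outside factor:", outside_factor(ROWS))
```

Only the float64 'max' row falls outside that band, and its baseline is the noisy 125±50ms measurement, which fits the ~ marker on that row (as far as I understand asv's output, ~ means the change is not statistically significant).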

mzeitlin11 · 2021-03-23T14:50:53Z

pandas/_libs/groupby.pyx

+                            group_min_or_max[lab, j] = val
+                    else:
+                        if val < group_min_or_max[lab, j]:
+                            group_min_or_max[lab, j] = val


Wanted to be able to do something like if PyObject_RichCompareBool(val, ..., cmp_method): ... to simplify the conditional logic, but couldn't find a way to do that in Cython without the GIL.

Is there a perf impact here? It's checking compute_max at each step in a nested loop.

I didn't see any perf impact based on ASVs above. For a loop invariant like compute_max, I'd guess this check is optimized away by the compiler. (It could also just be negligible compared to the cost of the indexing operations.)

Makes sense. Worth it to de-duplicate a bunch of code.
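
To make the pattern discussed in this thread concrete, here is a minimal pure-Python sketch, not the actual Cython in pandas/_libs/groupby.pyx. It assumes 1-D float input, NaN as the missing-value marker, and negative labels meaning "no group"; all function names and signatures are simplified for illustration. The first helper branches on compute_max inside the loop, as in the diff above. The second picks the comparison once before the loop, which is roughly what the PyObject_RichCompareBool idea was after and what a compiler could do by hoisting the invariant check.

```python
import math
import operator


def group_min_max(values, labels, ngroups, compute_max):
    """Per-group min or max; the flag is checked on every iteration."""
    out = [math.nan] * ngroups
    for val, lab in zip(values, labels):
        # Skip rows without a group (negative label) and missing values.
        if lab < 0 or math.isnan(val):
            continue
        current = out[lab]
        if math.isnan(current):
            # First valid observation for this group.
            out[lab] = val
        elif compute_max:
            if val > current:
                out[lab] = val
        else:
            if val < current:
                out[lab] = val
    return out


def group_min_max_hoisted(values, labels, ngroups, compute_max):
    """Same result, with the comparison chosen once outside the loop."""
    better = operator.gt if compute_max else operator.lt
    out = [math.nan] * ngroups
    for val, lab in zip(values, labels):
        if lab < 0 or math.isnan(val):
            continue
        current = out[lab]
        if math.isnan(current) or better(val, current):
            out[lab] = val
    return out


def group_max(values, labels, ngroups):
    """Thin wrapper, so callers keep a dedicated max entry point."""
    return group_min_max(values, labels, ngroups, compute_max=True)


def group_min(values, labels, ngroups):
    """Thin wrapper, so callers keep a dedicated min entry point."""
    return group_min_max(values, labels, ngroups, compute_max=False)


if __name__ == "__main__":
    vals = [3.0, 1.0, math.nan, 5.0, 2.0, 4.0]
    labs = [0, 0, 1, 1, -1, 1]
    print(group_max(vals, labs, 2), group_min(vals, labs, 2))
```

The wrappers keep the public entry points separate while the loop body exists only once, which is the de-duplication this PR is after.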

jreback

lgtm. @jbrockmendel if any comments

jbrockmendel · 2021-03-23T20:16:19Z

LGTM

jreback · 2021-03-23T20:29:32Z

thanks @mzeitlin11

mzeitlin11 added 2 commits March 23, 2021 00:54

REF: deduplicate group min/max

16d06e2

Clean up comments

441e8e3

mzeitlin11 added the labels Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Groupby, and Refactor (Internal refactoring of code) Mar 23, 2021

jreback added this to the 1.3 milestone Mar 23, 2021

jreback approved these changes Mar 23, 2021

jreback merged commit 6ed2fe8 into pandas-dev:master Mar 23, 2021

mzeitlin11 deleted the ref/deduplicate_grp_min_max branch March 23, 2021 20:31

mzeitlin11 mentioned this pull request Mar 23, 2021

REF: deduplicate group cummin/max #40599

Merged

vladu pushed a commit to vladu/pandas that referenced this pull request Apr 5, 2021

REF: deduplicate group min/max (pandas-dev#40584)

5dc783d

JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021

REF: deduplicate group min/max (pandas-dev#40584)

eda43ec
