PERF: get_block_type #52109

lukemanley · 2023-03-21T22:13:26Z

closes PERF: get_block_type heavy use could benefit performance improvements #48212
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

cc @jbrockmendel - this may partly close #48212; however, I suspect the OP was referring to non-EAs, given the old version of pandas.

The performance improvement is mostly for EAs (extension array dtypes), where the .kind call can be a bottleneck.

import pyarrow as pa
import pandas as pd
from pandas.core.internals.blocks import get_block_type

%timeit get_block_type(pd.ArrowDtype(pa.float64()))
# 3.51 µs ± 440 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)    <- main
# 740 ns ± 5.19 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)  <- PR

%timeit get_block_type(pd.Float64Dtype())
# 1.3 µs ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)  <- main
# 289 ns ± 2.3 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)   <- PR
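
A minimal sketch of why an expensive .kind matters, assuming the saving comes from not evaluating .kind on code paths that return before using it. CountingDtype, classify_eager and classify_lazy are hypothetical stand-ins for illustration, not pandas classes or the actual get_block_type code:

```python
class CountingDtype:
    """Hypothetical dtype stand-in whose .kind property counts its evaluations."""

    def __init__(self, kind: str, is_extension: bool) -> None:
        self._kind = kind
        self.is_extension = is_extension
        self.kind_calls = 0

    @property
    def kind(self) -> str:
        # Stand-in for a property that is costly to compute on some dtypes.
        self.kind_calls += 1
        return self._kind


def classify_eager(dtype: CountingDtype) -> str:
    # .kind is evaluated up front, even if the early return below fires.
    kind = dtype.kind
    if dtype.is_extension:
        return "extension"
    if kind in "Mm":
        return "datetimelike"
    if kind in "fciub":
        return "numeric"
    return "object"


def classify_lazy(dtype: CountingDtype) -> str:
    # The extension check runs first, so .kind is never touched on that path.
    if dtype.is_extension:
        return "extension"
    kind = dtype.kind
    if kind in "Mm":
        return "datetimelike"
    if kind in "fciub":
        return "numeric"
    return "object"
```

Both functions return the same label for every input; they differ only in whether .kind is evaluated for dtypes that take the early return.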

jbrockmendel

LGTM

jbrockmendel · 2023-03-21T22:26:20Z

pandas/core/internals/blocks.py

+    kind = dtype.kind
+    if kind in ["M", "m"]:
+        return DatetimeLikeBlock
    elif kind in ["f", "c", "i", "u", "b"]:


we can improve a little bit here by checking kind in "fciub" instead of the list
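
A small sketch of that suggestion, assuming kind is always a single character (as it is for NumPy dtype kinds). The helper names are hypothetical. Note that `in` on a string is a substring test, so the two forms agree only for one-character inputs:

```python
import timeit

NUMERIC_KINDS_LIST = ["f", "c", "i", "u", "b"]
NUMERIC_KINDS_STR = "fciub"


def in_list(kind: str) -> bool:
    # Element-by-element comparison against the list entries.
    return kind in NUMERIC_KINDS_LIST


def in_str(kind: str) -> bool:
    # Substring search; equivalent to in_list only when len(kind) == 1.
    return kind in NUMERIC_KINDS_STR


def compare_timings(number: int = 200_000) -> tuple[float, float]:
    # Returns (list_seconds, str_seconds) for a kind near the end of the set.
    t_list = timeit.timeit(lambda: in_list("b"), number=number)
    t_str = timeit.timeit(lambda: in_str("b"), number=number)
    return t_list, t_str


if __name__ == "__main__":
    t_list, t_str = compare_timings()
    print(f"list: {t_list:.4f}s  str: {t_str:.4f}s")
```

The caveat is the substring semantics: "fc" and the empty string are both "in" the string form but not in the list form, so the string check is only a safe replacement when the input is known to be exactly one character.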

jbrockmendel · 2023-03-21T22:26:33Z

pandas/core/internals/blocks.py

-
-    cls: type[Block]
-
    if isinstance(dtype, SparseDtype):


i think the SparseDtype check may no longer be needed

your suggested updates give a bit of an improvement for non-EAs as well:

import numpy as np
from pandas.core.internals.blocks import get_block_type

%timeit get_block_type(np.dtype('float64'))
# 724 ns ± 59.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)  -> main
# 590 ns ± 30 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)    -> PR

jbrockmendel · 2023-03-21T22:58:05Z

ping on green

lukemanley · 2023-03-22T00:44:46Z

ping on green

green - thanks

jbrockmendel · 2023-03-22T01:04:59Z

thanks @lukemanley

PERF: get_block_type (for EA's mostly)

def87db

lukemanley added the Performance (Memory or execution speed performance) and Internals (Related to non-user accessible pandas implementation) labels Mar 21, 2023

updates

65132b5

jbrockmendel approved these changes Mar 21, 2023

jbrockmendel reviewed Mar 21, 2023

jbrockmendel merged commit 5c15588 into pandas-dev:main Mar 22, 2023

lukemanley added this to the 2.1 milestone Mar 22, 2023

lukemanley deleted the perf-get-block-type branch April 18, 2023 11:03
