issue#27482 - added a check for if obj is instance of type in _isna-new #27664

joy-rosie · 2019-07-30T20:25:02Z

closes pd.isnull() raises AttributeError on Pandas type objects #27482
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

joy-rosie · 2019-07-30T21:09:26Z

Can someone help me; when I run git diff upstream/master -u -- "*.py" | flake8 --diff
I get: fatal: bad revision 'upstream/master'

jbrockmendel · 2019-07-30T21:38:07Z

Needs a test, don’t worry too much about the git thing

joy-rosie · 2019-07-31T06:03:29Z

could you point me to the file where the test would go for this?

joy-rosie · 2019-07-31T08:08:45Z

I had a quick look, would it be okay to go in this function: https://github.com/pandas-dev/pandas/blob/master/pandas/tests/dtypes/test_missing.py#L81

jreback

this logic needs to be replicated in _isnew_old as well. tests are needed.

may want to move this to the last clause of the if/else. performance tests need to be run for this.

joy-rosie · 2019-07-31T19:38:21Z

this logic needs to be replicated in _isnew_old as well. tests are needed.

may want to move this to the last clause of the if/else. performance tests need to be run for this.

I added it above this elif https://github.com/joy-rosie/pandas/blob/issue-27482-isnull/pandas/core/dtypes/missing.py#L138 because that is causing the error. I will add to _isna_old and also add tests, not sure how to add perfomance tests.

joy-rosie · 2019-08-01T13:42:22Z

I have added the tests and the statement to _isna_old, please let me know how I can run the performance tests

jbrockmendel · 2019-08-01T16:25:29Z

please let me know how I can run the performance tests

https://pandas.pydata.org/pandas-docs/stable/development/contributing.html#running-the-performance-test-suite

joy-rosie · 2019-08-03T10:45:00Z

Hello, I ran the performance tests as shown in the link and this is the final output:

. before after ratio
[143bc34] [7aef72c]

 21.7±0.5ms       29.6±0.6ms     1.37  indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')

    409±5μs         498±10μs     1.22  indexing.NumericSeriesIndexing.time_ix_scalar(<class 'pandas.core.indexes.numeric.Int64Index'>, 'nonunique_monotonic_inc')

12.4±0.09ms       15.0±0.1ms     1.21  index_object.SetOperations.time_operation('int', 'symmetric_difference')

 6.12±0.2ms       6.99±0.3ms     1.14  stat_ops.SeriesMultiIndexOps.time_op(0, 'median')

1.67±0.06μs       1.87±0.1μs     1.12  index_cached_properties.IndexCache.time_values('PeriodIndex')

  332±0.6ns         369±10ns     1.11  timestamp.TimestampOps.time_tz_localize('US/Eastern')

1.01±0.02μs      1.12±0.02μs     1.11  index_cached_properties.IndexCache.time_inferred_type('RangeIndex')

2.14±0.05ms      2.37±0.04ms     1.11  reindex.LevelAlign.time_align_level

    329±2ns          362±9ns     1.10  timestamp.TimestampOps.time_tz_localize(tzutc())

  331±0.5ns          365±8ns     1.10  timestamp.TimestampOps.time_tz_localize(<UTC>)

 4.64±0.2μs      4.21±0.05μs     0.91  timedelta.TimedeltaConstructor.time_from_missing

1.71±0.02μs      1.55±0.05μs     0.91  period.PeriodProperties.time_property('M', 'year')

1.42±0.08ms      1.29±0.01ms     0.90  ctors.SeriesConstructors.time_series_constructor(<function list_of_lists>, False, 'int')

   567±50μs        511±0.9μs     0.90  groupby.GroupByMethods.time_dtype_as_field('float', 'head', 'transformation')

 1.04±0.4ms          936±3μs     0.90  groupby.GroupByMethods.time_dtype_as_group('int', 'cumsum', 'direct')

1.68±0.01μs       1.51±0.1μs     0.90  period.PeriodProperties.time_property('min', 'second')

1.48±0.08ms      1.34±0.02ms     0.90  ctors.SeriesConstructors.time_series_constructor(<function list_of_lists>, True, 'int')

2.41±0.05ms      2.16±0.09ms     0.90  reindex.LevelAlign.time_reindex_level

1.69±0.04μs      1.52±0.06μs     0.90  period.PeriodProperties.time_property('M', 'month')

1.69±0.02μs      1.51±0.06μs     0.89  period.PeriodProperties.time_property('M', 'day')

   586±60μs        523±0.7μs     0.89  groupby.GroupByMethods.time_dtype_as_field('datetime', 'tail', 'transformation')

1.69±0.01μs      1.49±0.06μs     0.88  period.PeriodProperties.time_property('min', 'day')

 1.43±0.2ms         1.24±0ms     0.87  groupby.GroupByMethods.time_dtype_as_group('int', 'rank', 'transformation')

   609±90μs        528±0.9μs     0.87  groupby.GroupByMethods.time_dtype_as_group('int', 'quantile', 'transformation')

 1.46±0.2ms         1.24±0ms     0.85  groupby.GroupByMethods.time_dtype_as_group('int', 'rank', 'direct')

 7.26±0.1ms       6.01±0.3ms     0.83  algorithms.Duplicated.time_duplicated('first', 'int')

   348±80μs          287±1μs     0.83  groupby.GroupByMethods.time_dtype_as_field('float', 'last', 'transformation')

 1.69±0.3ms      1.36±0.01ms     0.81  groupby.GroupByMethods.time_dtype_as_field('int', 'value_counts', 'direct')

  90.4±20μs       65.4±0.1μs     0.72  algorithms.SortIntegerArray.time_argsort(1000)

  422±200μs        289±0.9μs     0.68  groupby.GroupByMethods.time_dtype_as_field('float', 'last', 'direct')

  887±300μs          553±1μs     0.62  groupby.GroupByMethods.time_dtype_as_field('object', 'head', 'transformation')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

Machine information:
{
"arch": "x86_64",
"cpu": "Intel(R) Core(TM) i5-3230M CPU @ 2.60GHz",
"machine": "josie-lenovo-ideapad",
"num_cpu": "4",
"os": "Linux 4.18.0-25-generic",
"ram": "3897484",
"version": 1
}

joy-rosie · 2019-08-03T10:50:45Z

I feel like it is a bit random where the performance has gone down and where its gone up

jbrockmendel · 2019-08-03T16:39:23Z

I feel like it is a bit random where the performance has gone down and where its gone up

Yah, this is a common issue with asv (especially in cases where there are no non-noise changes). The usual solution is to re-run it a couple of times and pick out the changes that are consistent across runs.

jreback · 2019-08-04T22:15:11Z

this also needs a whatsnew note, put in 1.0 in Missing bug fix section

joy-rosie · 2019-08-07T20:12:12Z

sorry to be a pain but where do i add a whatsnew note?

I am also finding it difficult to get any reliable results for the performance tests, I have ran it a few times and there are different results each time

Is there anything i can do there?

jbrockmendel · 2019-08-07T20:17:21Z

sorry to be a pain but where do i add a whatsnew note?

Don't worry about it, we were all new here at one point. whatsnew note goes in doc/source/whatsnew/0.25.1.rst

I am also finding it difficult to get any reliable results for the performance tests, I have ran it a few times and there are different results each time

That usually indicates that there isn't any real impact. The non-asv alternative is to use Ipython %timeit directly on the affected function(s)

joy-rosie · 2019-08-13T06:03:48Z

That usually indicates that there isn't any real impact. The non-asv alternative is to use Ipython %timeit directly on the affected function(s)
for benchmarking I am running the test function:

import pandas as pd
from pandas.util import testing as tm
isna_f = pd.isna
pd.__version__

%%timeit -n 10000

assert not isna_f(1.0)
assert isna_f(None)
assert isna_f(pd.np.nan)
assert not isna_f(pd.np.inf)
assert not isna_f(-pd.np.inf)

# # type
# assert not isna_f(type(pd.Series()))
# assert not isna_f(type(pd.DataFrame()))

# series
for s in [
    tm.makeFloatSeries(),
    tm.makeStringSeries(),
    tm.makeObjectSeries(),
    tm.makeTimeSeries(),
    tm.makePeriodSeries(),
]:
    assert isinstance(isna_f(s), pd.Series)

# frame
for df in [
    tm.makeTimeDataFrame(),
    tm.makePeriodFrame(),
    tm.makeMixedDataFrame(),
]:
    result = isna_f(df)
    expected = df.apply(isna_f)
    tm.assert_frame_equal(result, expected)

Outputs:

old:

'0.25.0'

33 ms ± 1.74 ms per loop (mean ± std. dev. of 7 runs, 10000 loops each)

new:

'0.25.0+71.g7aef72ced'

33 ms ± 1.8 ms per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Please let me know if you think I should run the benchmark on anything else

Don't worry about it, we were all new here at one point. whatsnew note goes in doc/source/whatsnew/0.25.1.rst

Will this go into bugfixes or missing?

jbrockmendel · 2019-08-13T14:54:09Z

Will this go into bugfixes or missing?

Yes. Probably in the "Missing" section

joy-rosie · 2019-08-14T20:56:54Z

I added into whatsnew and it seems like there are some things failing but I am not sure why a change in whatsnew could cause these particular failures?

TomAugspurger · 2019-08-14T21:25:31Z

Restarted the build. That test is randomly failing and we haven't diagnosed the cause yet.

joy-rosie · 2019-08-16T18:35:49Z

Is this ready to go now? Let me know if there are any remaining bits.

jbrockmendel · 2019-08-16T23:39:07Z

LGTM

…s-dev#27664) * added a check for if obj is instance of type in _isna-new

…e of type in _isna-new (pandas-dev#27664) * added a check for if obj is instance of type in _isna-new

… in _isna-new (#27664) (#28041) * added a check for if obj is instance of type in _isna-new

…s-dev#27664) * added a check for if obj is instance of type in _isna-new

added a check for if obj is instance of type in _isna-new

f1cbf4f

jreback requested changes Jul 31, 2019

View reviewed changes

jreback added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Jul 31, 2019

add tests and add to _isna_old

7aef72c

joy-rosie added 3 commits August 14, 2019 20:17

add in whatsnew

8ad5c12

fix typo in whatsnew

9785004

fix typo in whatsnew

84170c0

TomAugspurger approved these changes Aug 20, 2019

View reviewed changes

TomAugspurger merged commit e118b1d into pandas-dev:master Aug 20, 2019

TomAugspurger added this to the 0.25.1 milestone Aug 20, 2019

TomAugspurger mentioned this pull request Aug 20, 2019

Issue - 27482 : fixed isnull attribute for type objects #27980

Closed

2 tasks

TomAugspurger mentioned this pull request Aug 20, 2019

BUG: Exception when calling pandas.isnull() with a pandas type object #27989

Closed

5 tasks

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Aug 20, 2019

BUG: added a check for if obj is instance of type in _isna-new (panda…

6599498

…s-dev#27664) * added a check for if obj is instance of type in _isna-new

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Aug 20, 2019

Bckport PR pandas-dev#27664: BUG: added a check for if obj is instanc…

8b23b2c

…e of type in _isna-new (pandas-dev#27664) * added a check for if obj is instance of type in _isna-new

TomAugspurger mentioned this pull request Aug 20, 2019

Backport PR #27664: BUG: added a check for if obj is instance of type … #28041

Merged

TomAugspurger added a commit that referenced this pull request Aug 20, 2019

Backport PR #27664: BUG: added a check for if obj is instance of type…

af049e3

… in _isna-new (#27664) (#28041) * added a check for if obj is instance of type in _isna-new

galuhsahid pushed a commit to galuhsahid/pandas that referenced this pull request Aug 25, 2019

BUG: added a check for if obj is instance of type in _isna-new (panda…

a178ce5

…s-dev#27664) * added a check for if obj is instance of type in _isna-new

Uh oh!

issue#27482 - added a check for if obj is instance of type in _isna-new #27664

issue#27482 - added a check for if obj is instance of type in _isna-new #27664

Uh oh!

Conversation

joy-rosie commented Jul 30, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joy-rosie commented Jul 30, 2019

Uh oh!

jbrockmendel commented Jul 30, 2019

Uh oh!

joy-rosie commented Jul 31, 2019

Uh oh!

joy-rosie commented Jul 31, 2019

Uh oh!

jreback left a comment

Choose a reason for hiding this comment

Uh oh!

joy-rosie commented Jul 31, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joy-rosie commented Aug 1, 2019

Uh oh!

jbrockmendel commented Aug 1, 2019

Uh oh!

joy-rosie commented Aug 3, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joy-rosie commented Aug 3, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbrockmendel commented Aug 3, 2019

Uh oh!

jreback commented Aug 4, 2019

Uh oh!

joy-rosie commented Aug 7, 2019

Uh oh!

jbrockmendel commented Aug 7, 2019

Uh oh!

joy-rosie commented Aug 13, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jbrockmendel commented Aug 13, 2019

Uh oh!

joy-rosie commented Aug 14, 2019

Uh oh!

TomAugspurger commented Aug 14, 2019

Uh oh!

joy-rosie commented Aug 16, 2019

Uh oh!

jbrockmendel commented Aug 16, 2019

Uh oh!

Uh oh!

joy-rosie commented Jul 30, 2019 •

edited

Loading

joy-rosie commented Jul 31, 2019 •

edited

Loading

joy-rosie commented Aug 3, 2019 •

edited

Loading

joy-rosie commented Aug 3, 2019 •

edited

Loading

joy-rosie commented Aug 13, 2019 •

edited

Loading