BUG: Bug in inference in a MultiIndex with datetime.date inputs (GH7888) #8264

Merged 1 commit on Sep 14, 2014
6 changes: 3 additions & 3 deletions doc/source/v0.15.0.txt
@@ -676,8 +676,8 @@ Enhancements



-- ``tz_localize`` now accepts the ``ambiguous`` keyword which allows for passing an array of bools
-  indicating whether the date belongs in DST or not, 'NaT' for setting transition times to NaT,
+- ``tz_localize`` now accepts the ``ambiguous`` keyword which allows for passing an array of bools
+  indicating whether the date belongs in DST or not, 'NaT' for setting transition times to NaT,
   'infer' for inferring DST/non-DST, and 'raise' (default) for an AmbiguousTimeError to be raised (:issue:`7943`).
   See :ref:`the docs<timeseries.timezone_ambiguous>` for more details.

@@ -756,7 +756,7 @@ Bug Fixes
 - Bug in HDFStore iteration when passing a where (:issue:`8014`)
 - Bug in DataFrameGroupby.transform when transforming with a passed non-sorted key (:issue:`8046`)
 - Bug in repeated timeseries line and area plot may result in ``ValueError`` or incorrect kind (:issue:`7733`)
-
+- Bug in inference in a MultiIndex with ``datetime.date`` inputs (:issue:`7888`)

 - Bug in ``offsets.apply``, ``rollforward`` and ``rollback`` may reset nanosecond (:issue:`7697`)
 - Bug in ``offsets.apply``, ``rollforward`` and ``rollback`` may raise ``AttributeError`` if ``Timestamp`` has ``dateutil`` tzinfo (:issue:`7697`)
2 changes: 1 addition & 1 deletion pandas/core/categorical.py
@@ -232,7 +232,7 @@ def __init__(self, values, levels=None, ordered=None, name=None, fastpath=False,
             # which is fine, but since factorize does this correctly no need here
             # this is an issue because _sanitize_array also coerces np.nan to a string
             # under certain versions of numpy as well
-            values = com._possibly_infer_to_datetimelike(values)
+            values = com._possibly_infer_to_datetimelike(values, convert_dates=True)
             if not isinstance(values, np.ndarray):
                 values = _convert_to_list_like(values)
                 from pandas.core.series import _sanitize_array
25 changes: 17 additions & 8 deletions pandas/core/common.py
@@ -1961,15 +1961,24 @@ def _possibly_cast_to_datetime(value, dtype, coerce=False):
     return value


-def _possibly_infer_to_datetimelike(value):
-    # we might have a array (or single object) that is datetime like,
-    # and no dtype is passed don't change the value unless we find a
-    # datetime/timedelta set
+def _possibly_infer_to_datetimelike(value, convert_dates=False):
+    """
+    we might have an array (or single object) that is datetime like,
+    and no dtype is passed don't change the value unless we find a
+    datetime/timedelta set
+
+    this is pretty strict in that a datetime/timedelta is REQUIRED
+    in addition to possible nulls/string likes
+
+    ONLY strings are NOT datetimelike

-    # this is pretty strict in that a datetime/timedelta is REQUIRED
-    # in addition to possible nulls/string likes
+    Parameters
+    ----------
+    convert_dates : boolean, default False
+        if True try really hard to convert dates (such as datetime.date), otherwise
+        leave inferred dtype 'date' alone

-    # ONLY strings are NOT datetimelike
+    """

     v = value
     if not is_list_like(v):
@@ -2011,7 +2020,7 @@ def _try_timedelta(v):
     sample = v[:min(3,len(v))]
     inferred_type = lib.infer_dtype(sample)

-    if inferred_type in ['datetime', 'datetime64']:
+    if inferred_type in ['datetime', 'datetime64'] or (convert_dates and inferred_type in ['date']):
         value = _try_datetime(v)
     elif inferred_type in ['timedelta', 'timedelta64']:
         value = _try_timedelta(v)
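
The changed condition amounts to a small gating rule: plain `datetime.date` values only take the datetime-conversion path when the caller opts in with `convert_dates=True`. The sketch below mimics that rule using the standard library alone. `infer_kind` and `should_try_datetime` are hypothetical names introduced for illustration, and `infer_kind` is a heavily simplified stand-in for `lib.infer_dtype`, not pandas' actual implementation.

```python
import datetime


def infer_kind(sample):
    """Hypothetical, simplified stand-in for lib.infer_dtype.

    Only distinguishes 'empty', 'datetime', 'date' and 'mixed'.
    """
    if not sample:
        return 'empty'
    # datetime.datetime is a subclass of datetime.date, so test it first.
    if all(isinstance(x, datetime.datetime) for x in sample):
        return 'datetime'
    if all(isinstance(x, datetime.date)
           and not isinstance(x, datetime.datetime) for x in sample):
        return 'date'
    return 'mixed'


def should_try_datetime(values, convert_dates=False):
    """Mirror the condition changed in this PR (illustrative names only)."""
    sample = values[:min(3, len(values))]  # quick inference on a small sample
    inferred_type = infer_kind(sample)
    return (inferred_type in ['datetime', 'datetime64']
            or (convert_dates and inferred_type in ['date']))


dates = [datetime.date(2014, 9, 14)]
stamps = [datetime.datetime(2014, 9, 14, 12, 0)]

default_dates = should_try_datetime(dates)  # dates are left alone by default
opted_in_dates = should_try_datetime(dates, convert_dates=True)
default_stamps = should_try_datetime(stamps)  # datetimes always qualify
```

Defaulting `convert_dates` to `False` keeps existing callers unchanged. In this diff only the `Categorical` constructor passes `convert_dates=True`.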
14 changes: 13 additions & 1 deletion pandas/tests/test_multilevel.py
@@ -1,12 +1,13 @@
# pylint: disable-msg=W0612,E1101,W0141
import datetime
import itertools
import nose

from numpy.random import randn
import numpy as np

from pandas.core.index import Index, MultiIndex
-from pandas import Panel, DataFrame, Series, notnull, isnull
+from pandas import Panel, DataFrame, Series, notnull, isnull, Timestamp

from pandas.util.testing import (assert_almost_equal,
assert_series_equal,
@@ -2066,6 +2067,17 @@ def test_datetimeindex(self):
         self.assertTrue(idx.levels[0].equals(expected1))
         self.assertTrue(idx.levels[1].equals(idx2))

+        # from datetime combos
+        # GH 7888
+        date1 = datetime.date.today()
+        date2 = datetime.datetime.today()
+        date3 = Timestamp.today()
+
+        for d1, d2 in itertools.product([date1,date2,date3],[date1,date2,date3]):
+            index = pd.MultiIndex.from_product([[d1],[d2]])
+            self.assertIsInstance(index.levels[0],pd.DatetimeIndex)
+            self.assertIsInstance(index.levels[1],pd.DatetimeIndex)
+
     def test_set_index_datetime(self):
         # GH 3950
         df = pd.DataFrame({'label':['a', 'a', 'a', 'b', 'b', 'b'],
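
The added test loops over every ordered pair drawn from a `datetime.date`, a `datetime.datetime` and a `Timestamp`, so each index level sees each input type. A stdlib-only sketch of that test matrix follows. `FakeTimestamp` and `is_plain_date` are hypothetical helpers, and the fixed dates are assumptions that replace the `today()` calls so the result is deterministic. The sketch does not exercise pandas itself.

```python
import datetime
import itertools


class FakeTimestamp(datetime.datetime):
    """Hypothetical stand-in for pandas.Timestamp (also a datetime subclass)."""


# Fixed values replace the test's today() calls (assumption, for determinism).
date1 = datetime.date(2014, 9, 14)
date2 = datetime.datetime(2014, 9, 14, 12, 0)
date3 = FakeTimestamp(2014, 9, 14, 12, 0)

inputs = [date1, date2, date3]

# Same pairing as the test: every ordered (level 0, level 1) combination.
pairs = list(itertools.product(inputs, inputs))


def is_plain_date(x):
    # True only for datetime.date itself, not for its datetime subclasses.
    return type(x) is datetime.date


# Pairs in which at least one level is built from a bare datetime.date.
pairs_with_plain_date = [p for p in pairs
                         if any(is_plain_date(x) for x in p)]

print(len(pairs), len(pairs_with_plain_date))
```

Of the nine pairs, five contain a bare `datetime.date`. These are presumably the combinations the fix targets, since the other two input types are already `datetime` instances.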