Skip to content

Replaced PANDAS_DATETIMEUNIT with NPY_DATETIMEUNIT #21863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 12, 2018

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Jul 11, 2018

progress towards #21852

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Working on my C-fu as well. There may be a better way to do this (ex: even adding NPY_DATETIMEUNIT to Cython/includes) but figured I'd offer this up for review

@jbrockmendel

@@ -33,3 +32,21 @@ cdef maybe_datetimelike_to_i8(object val)
cdef int64_t tz_convert_utc_to_tzlocal(int64_t utc_val, tzinfo tz)

cpdef datetime localize_pydatetime(datetime dt, object tz)

cdef extern from "numpy/ndarraytypes.h":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there is a better way to do this declaration. Right now I copy/pasted in the three files where it is need but is there a shared place to put that for shared access across the modules?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is in np_datetime.pxd, it can be cimported from there.

@@ -33,3 +32,21 @@ cdef maybe_datetimelike_to_i8(object val)
cdef int64_t tz_convert_utc_to_tzlocal(int64_t utc_val, tzinfo tz)

cpdef datetime localize_pydatetime(datetime dt, object tz)

cdef extern from "numpy/ndarraytypes.h":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is done in np_datetime.pxd, I'd just cimport from there rather than copying this.

@jbrockmendel
Copy link
Member

Aside from that note on repeated imports, this looks like a step in the right direction. The other super-easy change-over is pandas_datetimestruct.

@@ -7,7 +7,6 @@ from numpy cimport int64_t, int32_t

from np_datetime cimport pandas_datetimestruct


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file isn't linted, but if it were, I think it would want two lines before the cdef class

@codecov
Copy link

codecov bot commented Jul 11, 2018

Codecov Report

Merging #21863 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #21863   +/-   ##
=======================================
  Coverage   91.92%   91.92%           
=======================================
  Files         160      160           
  Lines       49906    49906           
=======================================
  Hits        45875    45875           
  Misses       4031     4031
Flag Coverage Δ
#multiple 90.3% <ø> (ø) ⬆️
#single 42.11% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d0daa0...8aa70bb. Read the comment docs.

@jbrockmendel
Copy link
Member

I still can’t find a way to access functions in numpy’s _datetime.h. Not even sure what is possible in C-land. @chris-b1 are you the right person to ask about this?

@WillAyd
Copy link
Member Author

WillAyd commented Jul 11, 2018

@jbrockmendel not sure what you are looking for out of that but it doesn't appear that header is included in numpy's core/include directory along with the rest of the headers. You might have to mess around with setup.py if you are trying to include header files outside of that location:

pandas/setup.py

Line 128 in 5d0daa0

numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')

Though that most likely violates a lot of best practices

@WillAyd
Copy link
Member Author

WillAyd commented Jul 11, 2018

Though looking at it I don't believe src/multiarray/_datetime.h is included as part of your numpy installation in site-packages so may not be possible with how the code is currently structured (could be wrong)

@jbrockmendel
Copy link
Member

not sure what you are looking for out of that but it doesn't appear that header is included in numpy's core/include directory along with the rest of the headers

The hope was that it is somehow passed through via an #include in one of the files that is part of core/include. After all, numpy itself somehow manages to call these functions

@chris-b1
Copy link
Contributor

Yeah, even if it were possible I don't think we should be depending on private API of numpy - presumably why some bits (e.g. datetime string parsing) were vendored in the first place.

@jreback jreback added Datetime Datetime data dtype Clean labels Jul 12, 2018
@jreback jreback added this to the 0.24.0 milestone Jul 12, 2018
@jreback
Copy link
Contributor

jreback commented Jul 12, 2018

cc @charris

do you recall if these existed in numpy < 1.6 ?

@WillAyd
Copy link
Member Author

WillAyd commented Jul 12, 2018

AFAICT goes at least back to 1.4

@jbrockmendel
Copy link
Member

AFAICT the code was vendored b/c at the time pandas supported older numpy versions that did not yet have datetime64. Also because pandas handles timezones differently in np_datetime_strings. It'd still be nice to not re-implement the large portion that is identical.

It would also be super-convenient to be able to use the numpy versions in src/ujson/python/obj2JSON.c, as that would let us de-couple the remaing src/datetime code.

@jreback jreback merged commit 32bf504 into pandas-dev:master Jul 12, 2018
@jreback
Copy link
Contributor

jreback commented Jul 12, 2018

thanks @WillAyd and @jbrockmendel for the comments. Agree that the vendored code, just wanted the rationale documented here :>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants