Closed
Labels
Performance: Memory or execution speed performance
Description
Code Sample, a copy-pastable example if possible
# The existing implementation is:
def maybe_box_datetimelike(value):
    # turn a datetime like into a Timestamp/timedelta as needed
    if isinstance(value, (np.datetime64, datetime)):
        value = tslibs.Timestamp(value)
    elif isinstance(value, (np.timedelta64, timedelta)):
        value = tslibs.Timedelta(value)
    return value


# Proposed improvement:
def maybe_box_datetimelike(value):
    # turn a datetime like into a Timestamp/timedelta as needed
    if isinstance(value, (np.datetime64, datetime)) and not isinstance(value, tslibs.Timestamp):
        value = tslibs.Timestamp(value)
    elif isinstance(value, (np.timedelta64, timedelta)):
        value = tslibs.Timedelta(value)
    return value
Problem description
This function checks whether value is an instance of (np.datetime64, datetime) and, if so, converts it to tslibs.Timestamp. However, tslibs.Timestamp is itself a subclass of datetime, so a value that is already a tslibs.Timestamp still passes the check and is needlessly converted again. This has a large performance impact when working with large DataFrames that contain datetime objects. It could be fixed by changing the condition
if isinstance(value, (np.datetime64, datetime)):
to:
if isinstance(value, (np.datetime64, datetime)) and not isinstance(value, tslibs.Timestamp):
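To make the subclass relationship concrete, here is a minimal standalone sketch of the guarded check. It assumes the public pd.Timestamp and pd.Timedelta stand in for the internal tslibs.Timestamp and tslibs.Timedelta, and the name maybe_box_datetimelike_guarded is hypothetical, chosen only for this illustration.

```python
from datetime import datetime, timedelta

import numpy as np
import pandas as pd


def maybe_box_datetimelike_guarded(value):
    """Box datetime-likes, skipping values that are already a Timestamp.

    Hypothetical standalone variant of the proposed change; it uses the
    public pd.Timestamp / pd.Timedelta instead of the internal tslibs names.
    """
    if isinstance(value, (np.datetime64, datetime)) and not isinstance(value, pd.Timestamp):
        value = pd.Timestamp(value)
    elif isinstance(value, (np.timedelta64, timedelta)):
        value = pd.Timedelta(value)
    return value


ts = pd.Timestamp("2019-01-01")

# Timestamp subclasses datetime, so an unguarded isinstance check matches it.
print(issubclass(pd.Timestamp, datetime))

# With the guard, an existing Timestamp is returned as the same object.
print(maybe_box_datetimelike_guarded(ts) is ts)

# Plain datetime and np.datetime64 values are still boxed.
print(type(maybe_box_datetimelike_guarded(datetime(2019, 1, 1))).__name__)
print(type(maybe_box_datetimelike_guarded(np.datetime64("2019-01-01"))).__name__)
```

The extra isinstance call is cheap compared with constructing a new Timestamp, which is why the guard should pay off when most values are already boxed.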