-
Notifications
You must be signed in to change notification settings - Fork 1.1k
more asv tests for solar position, fix fuentes asv bug #1059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Working now. I just needed to add numba to the FYI the I think this should be about ready to merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, a few notes below/. Your comment about polluting a benchmark with pandas index slicing got me thinking about doing anything besides calling the function of interest. I think all of these cases are OK, but something to keep in mind.
equation_of_time = solarposition.equation_of_time_spencer71(dayofyear) | ||
solarposition.sun_rise_set_transit_geometric( | ||
self.times_daily, self.lat, self.lon, declination, | ||
equation_of_time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what to think about running more than one function in a single benchmark. It would be a little awkward/repetitive to run benchmark each step of a sequence individually, but it seems like the rule of thumb "only test one thing at a time" for unit tests ought to apply to benchmarks as well.
That said, any benchmark is better than no benchmark!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also consider calculating dayofyear
in the setup function because it's external
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I agree. In this case I wanted a more direct comparison to the other sun_rise_set_transit
functions, so I put all that in the test. The dayofyear
calculation is part of a fair comparison. But I also see the argument to test each of them individually and add them up. For now, I renamed the existing function to make it clear it's for comparison.
benchmarks/asv.conf.json
Outdated
@@ -121,6 +121,7 @@ | |||
// Note: these don't have a minimum in setup.py | |||
"pytables": "3.6.1", | |||
"ephem": "3.7.6.0", | |||
"numba": "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep all the packages in the "minimum" list pinned to something, even if we don't actually have a minimum? That way any changes to numba that affect our performance will show up in one env but not the other.
Side note: I see that spa.py mentions a minimum numba version that isn't reflected in setup.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That minimum version is still accurate, as far as I know, but I'd be surprised if it's compatible with the rest of our minimum requirements.
conda search -c main numba | grep py36
shows that numba 0.36.1 is the oldest package compatible with python 3.6 and numpy 1.12. The numba git history says 0.36.1 was released on Dec 7, 2017. (0.17.0 was released Feb 3, 2015.) I'll add that to the asv conf.
I'd be fine with also adding a minimum numba requirement to setup.py, but I think that should be done in combination with changes to the import logic in spa.py. In particular, I don't like the numpy fallback - that led to hard to track down errors when developing this. It would also be worth reviewing modern numba best practices. If we're adding a minimum numba requirement to setup.py then we should probably be testing against it too. More than I want to tackle in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Tucson sunrise at 6:08 AM MST, 13:08 UTC according to google. | ||
solarposition.calc_time( | ||
datetime.datetime(2020, 9, 14, 12), | ||
datetime.datetime(2020, 9, 14, 15), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another case of calling external functions. %timeit datetime.datetime(2020, 1, 1, 1)
gives 280ns so it's not a huge deal in this case, more the principle of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored this test into a new class. I dropped the second calc_time
call now that I'm confident that asv can measure the single call accurately enough.
# datetime.datetime(2020, 9, 14, 13, 24, 13, 861913, tzinfo=<UTC>) | ||
|
||
# Tucson sunset at 6:30 PM MST, 01:30 UTC according to google | ||
pvlib.solarposition.calc_time( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't matter, but this line is pvlib.solarposition.calc_time
and the previous is just solarposition.calc_time
import pandas as pd | ||
|
||
import os | ||
os.environ['PVLIB_USE_NUMBA'] = '1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the environment var might not be needed now that you added numba to the env specs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With setting the environment variable I see this:
(pvlib38ci) 8:22:16 [email protected] benchmarks solposasv ? asv run --bench SolarPosition --show-stderr
· Creating environments
· Discovering benchmarks
· Running 22 total benchmarks (1 commits * 2 environments * 11 benchmarks)
[ 0.00%] · For pvlib-python commit 30d784e7 <master>:
[ 0.00%] ·· Building for conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[ 0.00%] ·· Benchmarking conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[ 2.27%] ··· Running (solarposition.SolarPosition.time_calc_time--)...........
[ 27.27%] ··· solarposition.SolarPosition.time_calc_time 363±3μs
[ 29.55%] ··· solarposition.SolarPosition.time_ephemeris 22.6±0.5ms
[ 31.82%] ··· solarposition.SolarPosition.time_ephemeris_localized 22.4±0.2ms
[ 34.09%] ··· solarposition.SolarPosition.time_nrel_earthsun_distance 26.2±1ms
[ 36.36%] ··· solarposition.SolarPosition.time_pyephem 416±2ms
[ 38.64%] ··· solarposition.SolarPosition.time_spa_python 134±4ms
[ 40.91%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_ephem 53.5±0.6ms
[ 43.18%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_geometric 4.38±0.03ms
[ 45.45%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_spa 22.1±0.3ms
[ 47.73%] ··· solarposition_numba.SolarPositionNumba.time_spa_python 28.1±1ms
[ 50.00%] ··· solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa 6.90±0.1ms
Without setting the environment variable (comment out the line), I see this:
(pvlib38ci) 8:20:05 [email protected] benchmarks solposasv ? asv run --bench SolarPosition --show-stderr
· Creating environments
· Discovering benchmarks
· Running 22 total benchmarks (1 commits * 2 environments * 11 benchmarks)
[ 0.00%] · For pvlib-python commit 30d784e7 <master>:
[ 0.00%] ·· Building for conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[ 0.00%] ·· Benchmarking conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[ 2.27%] ··· Running (solarposition.SolarPosition.time_calc_time--)...........
[ 27.27%] ··· solarposition.SolarPosition.time_calc_time 378±20μs
[ 29.55%] ··· solarposition.SolarPosition.time_ephemeris 23.8±1ms
[ 31.82%] ··· solarposition.SolarPosition.time_ephemeris_localized 22.7±0.5ms
[ 34.09%] ··· solarposition.SolarPosition.time_nrel_earthsun_distance 38.2±7ms
[ 36.36%] ··· solarposition.SolarPosition.time_pyephem 447±10ms
[ 38.64%] ··· solarposition.SolarPosition.time_spa_python 138±5ms
[ 40.91%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_ephem 53.1±0.5ms
[ 43.18%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_geometric 4.56±0.2ms
[ 45.45%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_spa 22.3±0.7ms
[ 47.73%] ··· solarposition_numba.SolarPositionNumba.time_spa_python 29.3±2ms
[ 47.73%] ···· /Users/holmgren/git_repos/pvlib-python/benchmarks/env/4b3141485b42c1a3d28c30cd810d4559/lib/python3.6/site-packages/pvlib/solarposition.py:266: UserWarning: Reloading spa to use numba
warnings.warn('Reloading spa to use numba')
[ 50.00%] ··· solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa 6.81±0.2ms
[ 50.00%] ···· /Users/holmgren/git_repos/pvlib-python/benchmarks/env/4b3141485b42c1a3d28c30cd810d4559/lib/python3.6/site-packages/pvlib/solarposition.py:266: UserWarning: Reloading spa to use numba
warnings.warn('Reloading spa to use numba')
@kanderso-nrel thanks for the review. Thanks also for getting this set up and providing good instructions for us to follow! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! May want to add the PR number to the existing whatsnew entry.
I saw some odd timings when changing the number of days, so I thought it might be valuable to parameterize the length of time. In particular, @kanderso-nrel what do you think of the expanded tests? Useful? Too much?
|
OK with me. It's neat that in the HTML report you can switch the x-axis to be the parameter value instead of time. I think it will be useful. I suggest adding
|
docs/sphinx/source/api.rst
for API changes.docs/sphinx/source/whatsnew
for all changes. Includes link to the GitHub Issue with:issue:`num`
or this Pull Request with:pull:`num`
. Includes contributor name and/or GitHub username (link with:ghuser:`user`
).Here are the results on my machine:
I don't think numba is actually being used but I don't understand why not.
solarposition.ipynb
has some timing functions at the bottom. I reran those with the same environment and saw a 5x improvement with numba. The results are generally consistent with asv if not using numba: https://nbviewer.jupyter.org/github/wholmgren/pvlib-python/blob/solposasv/docs/tutorials/solarposition.ipynb#Speed-tests