Skip to content

more asv tests for solar position, fix fuentes asv bug #1059

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Sep 16, 2020

Conversation

wholmgren
Copy link
Member

@wholmgren wholmgren commented Sep 14, 2020

  • Closes #xxxx
  • I am familiar with the contributing guidelines
  • Tests added
  • Updates entries to docs/sphinx/source/api.rst for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

Here are the results on my machine:

Screen Shot 2020-09-14 at 3 24 40 PM

I don't think numba is actually being used but I don't understand why not.

solarposition.ipynb has some timing functions at the bottom. I reran those with the same environment and saw a 5x improvement with numba. The results are generally consistent with asv if not using numba: https://nbviewer.jupyter.org/github/wholmgren/pvlib-python/blob/solposasv/docs/tutorials/solarposition.ipynb#Speed-tests

@wholmgren wholmgren added performance solarfx2 DOE SETO Solar Forecasting 2 / Solar Forecast Arbiter labels Sep 14, 2020
@wholmgren wholmgren added this to the 0.8.1 milestone Sep 14, 2020
@wholmgren
Copy link
Member Author

Working now. I just needed to add numba to the asv.conf.json file.

Screen Shot 2020-09-14 at 8 09 25 PM

FYI the calc_time test takes half as long (within measurement error) if you remove the second call in that method. So asv seems to be able to find good numbers for tests that take just hundreds of microseconds.

I think this should be about ready to merge.

Copy link
Member

@kandersolar kandersolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, a few notes below/. Your comment about polluting a benchmark with pandas index slicing got me thinking about doing anything besides calling the function of interest. I think all of these cases are OK, but something to keep in mind.

equation_of_time = solarposition.equation_of_time_spencer71(dayofyear)
solarposition.sun_rise_set_transit_geometric(
self.times_daily, self.lat, self.lon, declination,
equation_of_time)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what to think about running more than one function in a single benchmark. It would be a little awkward/repetitive to run benchmark each step of a sequence individually, but it seems like the rule of thumb "only test one thing at a time" for unit tests ought to apply to benchmarks as well.

That said, any benchmark is better than no benchmark!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also consider calculating dayofyear in the setup function because it's external

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I agree. In this case I wanted a more direct comparison to the other sun_rise_set_transit functions, so I put all that in the test. The dayofyear calculation is part of a fair comparison. But I also see the argument to test each of them individually and add them up. For now, I renamed the existing function to make it clear it's for comparison.

@@ -121,6 +121,7 @@
// Note: these don't have a minimum in setup.py
"pytables": "3.6.1",
"ephem": "3.7.6.0",
"numba": "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep all the packages in the "minimum" list pinned to something, even if we don't actually have a minimum? That way any changes to numba that affect our performance will show up in one env but not the other.

Side note: I see that spa.py mentions a minimum numba version that isn't reflected in setup.py.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That minimum version is still accurate, as far as I know, but I'd be surprised if it's compatible with the rest of our minimum requirements.

conda search -c main numba | grep py36 shows that numba 0.36.1 is the oldest package compatible with python 3.6 and numpy 1.12. The numba git history says 0.36.1 was released on Dec 7, 2017. (0.17.0 was released Feb 3, 2015.) I'll add that to the asv conf.

I'd be fine with also adding a minimum numba requirement to setup.py, but I think that should be done in combination with changes to the import logic in spa.py. In particular, I don't like the numpy fallback - that led to hard to track down errors when developing this. It would also be worth reviewing modern numba best practices. If we're adding a minimum numba requirement to setup.py then we should probably be testing against it too. More than I want to tackle in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Tucson sunrise at 6:08 AM MST, 13:08 UTC according to google.
solarposition.calc_time(
datetime.datetime(2020, 9, 14, 12),
datetime.datetime(2020, 9, 14, 15),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another case of calling external functions. %timeit datetime.datetime(2020, 1, 1, 1) gives 280ns so it's not a huge deal in this case, more the principle of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored this test into a new class. I dropped the second calc_time call now that I'm confident that asv can measure the single call accurately enough.

# datetime.datetime(2020, 9, 14, 13, 24, 13, 861913, tzinfo=<UTC>)

# Tucson sunset at 6:30 PM MST, 01:30 UTC according to google
pvlib.solarposition.calc_time(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't matter, but this line is pvlib.solarposition.calc_time and the previous is just solarposition.calc_time

import pandas as pd

import os
os.environ['PVLIB_USE_NUMBA'] = '1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the environment var might not be needed now that you added numba to the env specs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With setting the environment variable I see this:

(pvlib38ci) 8:22:16 [email protected] benchmarks solposasv ? asv run --bench SolarPosition --show-stderr
· Creating environments
· Discovering benchmarks
· Running 22 total benchmarks (1 commits * 2 environments * 11 benchmarks)
[  0.00%] · For pvlib-python commit 30d784e7 <master>:
[  0.00%] ·· Building for conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[  0.00%] ·· Benchmarking conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[  2.27%] ··· Running (solarposition.SolarPosition.time_calc_time--)...........
[ 27.27%] ··· solarposition.SolarPosition.time_calc_time                                                       363±3μs
[ 29.55%] ··· solarposition.SolarPosition.time_ephemeris                                                    22.6±0.5ms
[ 31.82%] ··· solarposition.SolarPosition.time_ephemeris_localized                                          22.4±0.2ms
[ 34.09%] ··· solarposition.SolarPosition.time_nrel_earthsun_distance                                         26.2±1ms
[ 36.36%] ··· solarposition.SolarPosition.time_pyephem                                                         416±2ms
[ 38.64%] ··· solarposition.SolarPosition.time_spa_python                                                      134±4ms
[ 40.91%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_ephem                                   53.5±0.6ms
[ 43.18%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_geometric                              4.38±0.03ms
[ 45.45%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_spa                                     22.1±0.3ms
[ 47.73%] ··· solarposition_numba.SolarPositionNumba.time_spa_python                                          28.1±1ms
[ 50.00%] ··· solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa                          6.90±0.1ms

Without setting the environment variable (comment out the line), I see this:

(pvlib38ci) 8:20:05 [email protected] benchmarks solposasv ? asv run --bench SolarPosition --show-stderr
· Creating environments
· Discovering benchmarks
· Running 22 total benchmarks (1 commits * 2 environments * 11 benchmarks)
[  0.00%] · For pvlib-python commit 30d784e7 <master>:
[  0.00%] ·· Building for conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[  0.00%] ·· Benchmarking conda-py3.6-ephem3.7.6.0-numba-numpy1.12.0-pandas0.22.0-pytables3.6.1-scipy1.2.0
[  2.27%] ··· Running (solarposition.SolarPosition.time_calc_time--)...........
[ 27.27%] ··· solarposition.SolarPosition.time_calc_time                                                      378±20μs
[ 29.55%] ··· solarposition.SolarPosition.time_ephemeris                                                      23.8±1ms
[ 31.82%] ··· solarposition.SolarPosition.time_ephemeris_localized                                          22.7±0.5ms
[ 34.09%] ··· solarposition.SolarPosition.time_nrel_earthsun_distance                                         38.2±7ms
[ 36.36%] ··· solarposition.SolarPosition.time_pyephem                                                        447±10ms
[ 38.64%] ··· solarposition.SolarPosition.time_spa_python                                                      138±5ms
[ 40.91%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_ephem                                   53.1±0.5ms
[ 43.18%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_geometric                               4.56±0.2ms
[ 45.45%] ··· solarposition.SolarPosition.time_sun_rise_set_transit_spa                                     22.3±0.7ms
[ 47.73%] ··· solarposition_numba.SolarPositionNumba.time_spa_python                                          29.3±2ms
[ 47.73%] ···· /Users/holmgren/git_repos/pvlib-python/benchmarks/env/4b3141485b42c1a3d28c30cd810d4559/lib/python3.6/site-packages/pvlib/solarposition.py:266: UserWarning: Reloading spa to use numba
               warnings.warn('Reloading spa to use numba')

[ 50.00%] ··· solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa                          6.81±0.2ms
[ 50.00%] ···· /Users/holmgren/git_repos/pvlib-python/benchmarks/env/4b3141485b42c1a3d28c30cd810d4559/lib/python3.6/site-packages/pvlib/solarposition.py:266: UserWarning: Reloading spa to use numba
               warnings.warn('Reloading spa to use numba')

@wholmgren
Copy link
Member Author

@kanderso-nrel thanks for the review. Thanks also for getting this set up and providing good instructions for us to follow!

Copy link
Member

@kandersolar kandersolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! May want to add the PR number to the existing whatsnew entry.

@wholmgren
Copy link
Member Author

I made a typo when adding a comment about the number of days in self.times and carried that into a mistake in the benchmark for the sun_rise_set_transit functions. Fixed the typo and modified the number of days in the sun_rise_set_transit functions for consistency.

Screen Shot 2020-09-15 at 11 05 57 AM

@wholmgren
Copy link
Member Author

I saw some odd timings when changing the number of days, so I thought it might be valuable to parameterize the length of time. In particular, time_sun_rise_set_transit_geometric_full_comparison and time_sun_rise_set_transit_spa don't scale like I thought they would. The former might be slowed by the dayofyear accessor. Not sure about the latter.

@kanderso-nrel what do you think of the expanded tests? Useful? Too much?

Benchmark Value Recent change Changed at
solarposition_numba.SolarPositionNumba.time_spa_python (1) 4.022ms    
solarposition_numba.SolarPositionNumba.time_spa_python (10) 29.969ms    
solarposition_numba.SolarPositionNumba.time_spa_python (100) 294.349ms    
solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa (1) 2.953ms    
solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa (10) 5.113ms    
solarposition_numba.SolarPositionNumba.time_sun_rise_set_transit_spa (100) 8.041ms    
solarposition.SolarPosition.time_ephemeris (1) 9.348ms    
solarposition.SolarPosition.time_ephemeris (10) 23.551ms    
solarposition.SolarPosition.time_ephemeris (100) 130.476ms    
solarposition.SolarPosition.time_ephemeris_localized (1) 9.344ms    
solarposition.SolarPosition.time_ephemeris_localized (10) 23.666ms    
solarposition.SolarPosition.time_ephemeris_localized (100) 129.466ms    
solarposition.SolarPosition.time_nrel_earthsun_distance (1) 4.092ms    
solarposition.SolarPosition.time_nrel_earthsun_distance (10) 25.885ms    
solarposition.SolarPosition.time_nrel_earthsun_distance (100) 229.711ms    
solarposition.SolarPosition.time_pyephem (1) 44.637ms    
solarposition.SolarPosition.time_pyephem (10) 417.869ms    
solarposition.SolarPosition.time_pyephem (100) 4.162s    
solarposition.SolarPosition.time_spa_python (1) 20.751ms    
solarposition.SolarPosition.time_spa_python (10) 137.507ms    
solarposition.SolarPosition.time_spa_python (100) 1.273s    
solarposition.SolarPosition.time_sun_rise_set_transit_ephem (1) 1.286ms    
solarposition.SolarPosition.time_sun_rise_set_transit_ephem (10) 6.078ms    
solarposition.SolarPosition.time_sun_rise_set_transit_ephem (100) 53.456ms    
solarposition.SolarPosition.time_sun_rise_set_transit_geometric_full_comparison (1) 15.915ms    
solarposition.SolarPosition.time_sun_rise_set_transit_geometric_full_comparison (10) 17.692ms    
solarposition.SolarPosition.time_sun_rise_set_transit_geometric_full_comparison (100) 17.531ms    
solarposition.SolarPosition.time_sun_rise_set_transit_spa (1) 28.536ms    
solarposition.SolarPosition.time_sun_rise_set_transit_spa (10) 29.111ms    
solarposition.SolarPosition.time_sun_rise_set_transit_spa (100) 34.511ms    
solarposition.SolarPositionCalcTime.time_calc_time 189.697μs  

@kandersolar
Copy link
Member

OK with me. It's neat that in the HTML report you can switch the x-axis to be the parameter value instead of time. I think it will be useful. I suggest adding param_names = ['ndays'] or something to the class definitions so that the logging and HTML report show that instead of param1. Only negative I see is that it makes the logging from asv run ... a little cluttered, but it's not the end of the world:

[ 36.21%] ··· location.Location_0_6_1.time_location_get_sun_rise_set_transit_pyephem                                                                                                                     326±0ms
[ 37.07%] ··· location.Location_0_6_1.time_location_get_sun_rise_set_transit_spa                                                                                                                         100±0ms
[ 37.93%] ··· solarposition.SolarPosition.time_ephemeris                                                                                                                                                      ok
[ 37.93%] ··· ======== ==========
               ndays            
              -------- ----------
                 1      35.9±0ms 
                 10     85.9±0ms 
                100     326±0ms  
              ======== ==========
[ 38.79%] ··· solarposition.SolarPosition.time_ephemeris_localized                                                                                                                                            ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance solarfx2 DOE SETO Solar Forecasting 2 / Solar Forecast Arbiter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants