Skip to content

replace limits set with numpy/pandas slicing with np.maximum and np.minimum #201

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
12 of 15 tasks
wholmgren opened this issue Jun 24, 2016 · 5 comments
Closed
12 of 15 tasks

Comments

@wholmgren
Copy link
Member

wholmgren commented Jun 24, 2016

A number of functions use numpy/pandas slicing or clip to set their limits. This leads to unnecessary requirements on the input data, unnecessary type promotion of the output data, and/or unnecessary code in the functions. Many of the limits can instead be set with np.maximum and np.minimum.

The easy functions to fix include:

  • haurwitz
  • beam_component
  • haydavies
  • reindl
  • king

Other functions that rely on bins or masks need array-like inputs or coercion and require a little more effort. These include:

  • perez
  • disc
  • dirint
  • erbs
  • singleaxis
  • relativeairmass
  • ashraeiam
  • physicaliam
  • snlinverter
  • pyephem_earthsun_distance
@wholmgren wholmgren added this to the 0.4.0 milestone Jun 24, 2016
@adriesse
Copy link
Member

I'd like to understand this. Could you give one or two problematic
examples?

On 2016-06-24 21:22, Will Holmgren wrote:

A number of functions use numpy/pandas slicing or clip to set their
limits. This leads to unnecessary requirements on the input data,
unnecessary type promotion of the output data, and/or unnecessary code
in the functions. Many of the limits can instead be set with
np.maximum
http://docs.scipy.org/doc/numpy/reference/generated/numpy.maximum.html
and np.minimum
http://docs.scipy.org/doc/numpy/reference/generated/numpy.minimum.html.

The easy functions to fix include:

relativeairmass
ashraeiam
physicaliam
snlinverter
haurwitz
beam_component
haydavies
reindl
king

Other functions that rely on bins or masks need array-like inputs and
require a little more effort. These include:

perez
disc
dirint
erbs
singleaxis


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#201, or mute the thread
https://github.com/notifications/unsubscribe/AIlYQ1nxcHZn0kjoykd4Gjlwg9sjLut_ks5qPC6IgaJpZM4I-DWp.

Photovoltaic Performance Labs Germany
Emmy-Noether-Str. 2
79110 Freiburg
Germany

+49-761-8973-5603 (Europe)
+49-174-532-7677 (Mobile)
+1-613-817-1652 (North America)

www.pvperformancelabs.com

@wholmgren
Copy link
Member Author

Sure. Consider haydavies:

# a bunch of stuff that can handle any input type
# but this line throws a TypeError for scalar input
sky_diffuse[sky_diffuse < 0] = 0

# instead, use
sky_diffuse = np.maximum(sky_diffuse, 0)

Similar for reindl, king, and beam_component.

haurwitz has a couple of problems:

# works with any input
clearsky_GHI = 1098.0 * cos_zenith * np.exp(-0.059/cos_zenith)

# throws a TypeError for scalar input
clearsky_GHI[clearsky_GHI < 0] = 0

# output is promoted to DataFrame whether you like it or not
# throws a ValueError for array input with 2 or more dimensions (including DataFrame input)
df_out = pd.DataFrame({'ghi':clearsky_GHI})
return df_out

I suggest the following makes more sense for the low level functions:

# works with any input
clearsky_GHI = 1098.0 * cos_zenith * np.exp(-0.059/cos_zenith)

# works with any input
clearsky_GHI = np.maximum(clearsky_GHI, 0)

# output type = input type
return clearsky_GHI

The DataFrame wrapper could still be applied in the Location.get_clearsky method that enforces pandas timeseries. I only added the DataFrame wrapper to haurwitz so that it and the ineichen function would have a consistent return type.

I suspect that the vast majority of pvlib analyses use 1-D array-like inputs, but it can be very convenient for functions to accept scalar inputs when you want to better understand what they do to your data.

Finding an example made me realize that I was very sloppy in my lists. relativeairmass, physicaliam, and ashraeiam set the out of bounds data to nan, so they belong on the slightly harder list. snlinverter has one line that is easy to fix, but a second line that is not.

Here's part of snlinverter with more comments:

    # promote scalars to a pandas series so that we can use indexing
    # Exception if p_dc has more than 1 dimension
    p_dc = pd.Series(p_dc)

    ac_power = (Paco/(A-B) - C*(A-B)) * (p_dc-B) + C*((p_dc-B)**2)
    # easily replaced with np.minimum
    ac_power[ac_power > Paco] = Paco
    # but this is more complicated
    ac_power[p_dc < Pso] = - 1.0 * abs(Pnt)

    # enables scalar in, scalar out
    # but disables one element array/series in/out!
    if len(ac_power) == 1:
        ac_power = ac_power.ix[0]

    return ac_power

I think this would be better:

    ac_power = (Paco/(A-B) - C*(A-B)) * (p_dc-B) + C*((p_dc-B)**2)
    ac_power = np.minimum(ac_power, Paco)

    if np.isscalar(p_dc):
        ac_power = -1.0*abs(Pnt) if p_dc < Pso else ac_power
    else:
        ac_power[p_dc < Pso] = - 1.0 * abs(Pnt)

    return ac_power

I'm not 100% sure about the best way to handle these cases. np.isscalar is probably preferable to try/except because exception handling is somewhere between hard and impossible in numba and cython (last I looked, at least).

@wholmgren
Copy link
Member Author

I should also note that although np.where works with scalar or n-dimensional array inputs, it will not preserve Pandas objects.

In [30]: df = pd.DataFrame(np.array([[1,2],[3,4]]))

In [31]: np.where(df > 3, np.nan, df)
Out[31]:
array([[  1.,   2.],
       [  3.,  nan]])

@adriesse
Copy link
Member

adriesse commented Jun 25, 2016

Thanks!  Now I see what you're getting at.  Seems like a good direction to me for the lower level functions.

@wholmgren
Copy link
Member Author

Closing this old issue. Making the remaining functions work with scalar input is not trivial. We can treat them in their own issues if there is demand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants