replace limits set with numpy/pandas slicing with np.maximum and np.minimum #201
I'd like to understand this. Could you give one or two problematic examples?
Sure. Consider haydavies:

```python
# a bunch of stuff that can handle any input type
# but this line throws a TypeError for scalar input
sky_diffuse[sky_diffuse < 0] = 0
# instead, use
sky_diffuse = np.maximum(sky_diffuse, 0)
```

Similar for reindl, king, and beam_component.

haurwitz has a couple of problems:

```python
# works with any input
clearsky_GHI = 1098.0 * cos_zenith * np.exp(-0.059/cos_zenith)
# throws a TypeError for scalar input
clearsky_GHI[clearsky_GHI < 0] = 0
# output is promoted to DataFrame whether you like it or not
# throws a ValueError for array input with 2 or more dimensions (including DataFrame input)
df_out = pd.DataFrame({'ghi': clearsky_GHI})
return df_out
```

I suggest the following makes more sense for the low-level functions:

```python
# works with any input
clearsky_GHI = 1098.0 * cos_zenith * np.exp(-0.059/cos_zenith)
# works with any input
clearsky_GHI = np.maximum(clearsky_GHI, 0)
# output type = input type
return clearsky_GHI
```

The DataFrame wrapper could still be applied in the Location.get_clearsky method, which enforces pandas time series. I only added the DataFrame wrapper to haurwitz so that it and the ineichen function would have a consistent return type. I suspect that the vast majority of pvlib analyses use 1-D array-like inputs, but it can be very convenient for functions to accept scalar inputs when you want to better understand what they do to your data.

Finding an example made me realize that I was very sloppy in my lists. relativeairmass, physicaliam, and ashraeiam set the out-of-bounds data to NaN, so they belong on the slightly harder list. snlinverter has one line that is easy to fix, but a second line that is not. Here's part of snlinverter with more comments:

```python
# promote scalars to a pandas Series so that we can use indexing
# Exception if p_dc has more than 1 dimension
p_dc = pd.Series(p_dc)
ac_power = (Paco/(A-B) - C*(A-B)) * (p_dc-B) + C*((p_dc-B)**2)
# easily replaced with np.minimum
ac_power[ac_power > Paco] = Paco
# but this is more complicated
ac_power[p_dc < Pso] = -1.0 * abs(Pnt)
# enables scalar in, scalar out
# but disables one-element array/Series in/out!
if len(ac_power) == 1:
    ac_power = ac_power.ix[0]
return ac_power
```

I think this would be better:

```python
ac_power = (Paco/(A-B) - C*(A-B)) * (p_dc-B) + C*((p_dc-B)**2)
ac_power = np.minimum(ac_power, Paco)
if np.isscalar(p_dc):
    ac_power = -1.0 * abs(Pnt) if p_dc < Pso else ac_power
else:
    ac_power[p_dc < Pso] = -1.0 * abs(Pnt)
return ac_power
```

I'm not 100% sure about the best way to handle these cases. np.isscalar is probably preferable to try/except because exception handling is somewhere between hard and impossible in numba and cython (last I looked, at least).
I should also note that although np.where works with scalar or n-dimensional array inputs, it will not preserve pandas objects.

```python
In [30]: df = pd.DataFrame(np.array([[1, 2], [3, 4]]))

In [31]: np.where(df > 3, np.nan, df)
Out[31]:
array([[  1.,   2.],
       [  3.,  nan]])
```
Thanks! Now I see what you're getting at. Seems like a good direction to me for the lower-level functions.
Closing this old issue. Making the remaining functions work with scalar input is not trivial. We can treat them in their own issues if there is demand.
A number of functions use numpy/pandas slicing or clip to set their limits. This leads to unnecessary requirements on the input data, unnecessary type promotion of the output data, and/or unnecessary code in the functions. Many of the limits can instead be set with np.maximum and np.minimum.
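As a minimal illustration of the difference (the variable name here is just for demonstration): boolean-mask assignment requires an indexable input and so raises a TypeError for scalars, while np.maximum accepts scalars, arrays, and pandas objects, and preserves the input's container type.

```python
import numpy as np
import pandas as pd

x_scalar = -5.0
x_array = np.array([-5.0, 2.0, 7.0])
x_series = pd.Series([-5.0, 2.0, 7.0])

# mask assignment fails for scalar input
try:
    x_scalar[x_scalar < 0] = 0
except TypeError as exc:
    print("scalar slicing fails:", exc)

# np.maximum handles all three, with output type matching input type
print(np.maximum(x_scalar, 0))        # 0.0
print(np.maximum(x_array, 0))         # array([0., 2., 7.])
print(type(np.maximum(x_series, 0)))  # pandas Series
```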
The easy functions to fix include:
Other functions that rely on bins or masks need array-like inputs or coercion and require a little more effort. These include:
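For those mask-based functions, one possible coercion pattern (a sketch, not actual pvlib code; the function name and the doubling calculation are placeholders) is to promote the input to a 1-D array up front and convert back to a scalar on return:

```python
import numpy as np

def masked_model(x):
    """Coerce input so mask-based limits work, then match output to input."""
    scalar_input = np.isscalar(x)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = x * 2.0          # stand-in for the real calculation
    out[x < 0] = np.nan    # mask-based limit requires an array
    if scalar_input:
        return out.item()  # scalar in, scalar out
    return out
```

This keeps the mask assignment valid for every input while still giving scalar-in, scalar-out behavior.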