-
Notifications
You must be signed in to change notification settings - Fork 53
Adds MLD analysis and reduces code in SST analysis #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
archive_dir_ocn = /global/cscratch1/sd/golaz/ACME_simulations/20160805.A_WCYCL1850.ne30_oEC.edison.alpha7_00/run/ | ||
scratch_dir = /global/project/projectdirs/acme/lvroekel/20160805v0atm.A_WCYCL1850_v0atm.ne30_oEC.edison.alpha7_00.test.pp | ||
plots_dir = /global/project/projectdirs/acme/lvroekel/coupled_diagnostics_20160805v0atm.A_WCYCL1850_v0atm.ne30_oEC.edison.alpha7_00-20160520.A_WCYCL1850.ne30_oEC.edison.alpha6_01 | ||
log_dir = /global/project/projectdirs/acme/lvroekel/coupled_diagnostics_20160805v0atm.A_WCYCL1850_v0atm.ne30_oEC.edison.alpha7_00-20160520.A_WCYCL1850.ne30_oEC.edison.alpha6_01.logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to change the defaults here, although it probably doesn't fully matter as long as particular entries aren't removed. However, it would be easy in the future to accidentally remove entires. @xylar should comment on this too.
@@ -42,10 +43,10 @@ climo_yr2 = 10 | |||
yr_offset = 1849 | |||
|
|||
[ohc_timeseries] | |||
generate = 1 | |||
generate = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same idea for changing logical flags-- I don't think these should be edited unless absolutely necessary.
from ..shared.interpolation.interpolate import interp_fields, init_tree | ||
|
||
def mld_modelvsobs(config): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing doc string describing function and its purpose.
infiles = "".join([indir,"/am.mpas-o.timeSeriesStats.????-*.nc"]) | ||
|
||
monthdictionary={'Jan':1,'Feb':2,'Mar':3,'Apr':4,'May':5,'Jun':6,'Jul':7,'Aug':8,'Sep':9,'Oct':10, | ||
'Nov':11,'Dec':12,'JFM':np.array([1,2,3]),'AMJ':np.array([4,5,6]),'JAS':np.array([7,8,9]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First line non-white space characters should be aliged with {
above.
|
||
# Load in model data | ||
ds = xr.open_mfdataset(infiles, preprocess=lambda x: preprocess_mpas(x, yearoffset=yr_offset, | ||
timeSeriesStats=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue with indentation as above (timeSeriesStats
should be left justified with inflies
lonCell = f.variables["lonCell"][:] | ||
latCell = f.variables["latCell"][:] | ||
|
||
# Load observation MLD data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment indentation off from code indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other places like this in this document too that have this issue.
|
||
dinmonth = np.array([31,28,31,30,31,30,31,31,30,31,30,31]) | ||
|
||
daysarray = np.ones((12,dsData.mld_dt_mean.values.shape[1],dsData.mld_dt_mean.values.shape[0])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spaces are needed after commas. This is an issue that is throughout PR.
|
||
else: | ||
modeldata = np.sum(dinmonth[monthsvalue-1]*monthly_clim.sel(month=monthsvalue).time_avg_dThreshMLD.values.T,axis=1) / np.sum(dinmonth[monthsvalue-1]) | ||
obsdata = np.nansum(daysarray[monthsvalue-1,:,:]*dsData.sel(iMONTH=monthsvalue).mld_dt_mean.values.T,axis=0) / np.nansum(daysarray[monthsvalue-1,:,:],axis=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need a line break on these really long lines
diffTitle = "Model-Observations", | ||
cbarlabel = r"$^o$C") | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please provide a main type function to help modularize this and allow it to be run in script mode with command line arguments? This helps generalize the code and increase its utility, e.g., something like
if __name__ == "__main__":
# process command line arguments and feed to main below
main()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pwolfram could you comment on a use case that wouldn't be covered by disabling other diagnostics except these in the run_analysis? I could do this, but I can't see a pressing use.
@@ -0,0 +1,47 @@ | |||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see comments about docstring above. It would also be good to include some testing functions in this module too so that we can verify it is working properly. This can be as simple as functions with name test_*
that are called in a main file when this is run as a script. Assert statements can be used to ensure desired behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a docstring on this module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably also need tests although we can wait on that if necessary for now.
@@ -86,8 +86,14 @@ def analysis(config): #{{{ | |||
if config.getboolean('sst_modelvsobs','generate'): | |||
print "" | |||
print "Plotting 2-d maps of SST climatologies..." | |||
from mpas_analysis.ocean.sst_modelvsobs import sst_modelvsobs | |||
from mpas_analysis.ocean.ocean_modelvsobs import sst_modelvsobs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see now-- you are generalizing the above function. Very good.
@xylar, my understanding is that new analysis should have registered additions to config.analysis, right? However, I'm not sure we want to have changes to default values unless absolutely necessary. What we do need are addition of new variables, right? I'm guessing a good workflow here would be to include changes to config.analysis that are required for testing in a gist. But I'm not 100% sure if that is the best way to proceed. |
@vanroekel, I've added some comments to the code. Feel free to not take the style comments too seriously for now (e.g., white space issues, etc.). I think we do want more doc strings to document the code than currently exist, however. I think we will want to squash some commits prior to merging this too. |
|
||
infiles = "".join([indir,"/am.mpas-o.timeSeriesStats.????-*.nc"]) | ||
|
||
monthdictionary={'Jan':1,'Feb':2,'Mar':3,'Apr':4,'May':5,'Jun':6,'Jul':7,'Aug':8,'Sep':9,'Oct':10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things like this should probably be in a shared constants location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, shared/constants.py
or some such would work
latDATA = latDATA.flatten() | ||
lonDATA = lonDATA.flatten() | ||
|
||
dinmonth = np.array([31,28,31,30,31,30,31,31,30,31,30,31]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should be in constants file
Okay @pwolfram, I've updated everything except the main function issue. I may have missed a few other minor formatting things perhaps. |
@vanroekel, can you please add some tests to |
@pwolfram the config.analysis script I used on edison is at I will try add tests to interpolate.py soon. I still need to read through your example and the xarray example. |
@vanroekel, I think that you just need functions of the form |
FYI, I will postpone testing for now until we more or less have the code we want to commit to ensure that the full testing is as close to the final version as possible. |
@vanroekel, one question / idea- would it be possible to further generalize the |
@pwolfram and @milenaveneziani, looking at the sea-ice code, I think it certainly may be possible to generalize all of these into one. It fits with how I was eventually envisioning the SST and MLD comparisons as a plot2d_fields_comparison, where no data field is assumed or if it is model to obs. I think the simplification would take some thought as there seem to be enough differences in the sea-ice and ocean plotting scripts. |
@vanroekel, @pwolfram: first of all, thank you for working on this. It's great to have a general routine to do global 2d climatological maps. There are a few reasons why I think we are not ready to generalize the sea-ice script as well:
|
@milenaveneziani these are some of the exact pieces I was thinking about. The SST/MLD data is not preprocessed. For merging with sea-ice this would have to change. Not a hard change though. I would tend to agree that merging sea-ice and ocean model vs obs is a longer term goal, but should be a goal. |
@vanroekel and @milenaveneziani, should we just merge the existing code once we are all happy with this PR? I'm ok waiting to do a more generalized aggregation of the scripts and at this point agree that it is wise to wait for the generalization, which we can keep track of in #31. |
@pwolfram I would be in favor of merging this PR prior to generalization. I'm working on the unit testing now. So I'd like to wait for these additions. |
@pwolfram: I would also merge this prior to generalization. |
@vanroekel, thanks for moving the testing file. pytest checks out now. |
Also, this points to the need for CI regression testing as we have discussed earlier now that we are getting some regression tests. |
@pwolfram I gave you the wrong path in the config.analysis file, see the new version in same location. The path to files should be ''' |
@vanroekel: are you asking to move the MLD observation data to edison? Sorry if I haven't done this already, but you only have the data on IC, right? I think I also have it at OLCF, so I can look into that. Having said that, as I mentioned to Luke already, we should not test any of this within the ACME repo just yet (especially now that we are dealing with a troublesome merge to master). I think the workflow in the future should be similar to that of MPAS and ACME main codes: we make a bunch of changes in MPAS-Analysis, test them thoroughly, update the ACME analysis repo locally and do more testing, then merge to the ACME analysis master. Or at least this seems like the safer way to me. |
@milenaveneziani, the data is on edison at /global/homes/l/lvroekel/MLD @pwolfram, there is an issue in the code right now in testing on edison. I'm debugging. |
f9533d9
to
8259e55
Compare
@pwolfram, confirmed the code just pushed works on edison. |
@milenaveneziani wrote:
I agree that this approach is needed. One way to handle this without disrupting our workflow would be to introduce a |
I went ahead and created a develop branch. I figured it can't do any harm. I'm going to change this PR to merge into develop for now, Let me know if that worries any of you. |
Just realized the above line by @xylar was the buy-in step. However, perhaps we should formalize these things as github issues for discussion in the future to make things a little more transparent. Note, for the record, @xylar's idea of develop is a really good idea because it provides us an additional safety net that we can use to resolve serious issues if something goes wrong prior to merging to master. |
I'll be merging to develop for this PR following testing. @vanroekel, permissions are broken again for the config file at |
@pwolfram sorry for the permissions issue. Try now. |
@vanroekel, does this error make sense to you? ┌─[pwolfram][edison03][~/test_MA_vanroekel][21:46][±][develop ✓]
└─▪ python run_analysis.py config.analysis.test
Plotting 2-d maps of SST climatologies...
Traceback (most recent call last):
File "run_analysis.py", line 133, in <module>
analysis(config)
File "run_analysis.py", line 90, in analysis
ocn_modelvsobs(config, 'sst')
File "/global/u2/p/pwolfram/test_MA_vanroekel/mpas_analysis/ocean/ocean_modelvsobs.py", line 153, in ocn_modelvsobs
if np.issubdtype(monthsvalue, int):
File "/global/homes/p/pwolfram/miniconda2/lib/python2.7/site-packages/numpy/core/numerictypes.py", line 761, in issubdtype
return issubclass(dtype(arg1).type, val)
TypeError: data type not understood |
@pwolfram, yes, I had this and thought I fixed it. This line (153)
should be
I had fixed this and thought I pushed changes, but obviously not. I can clean this up tomorrow morning |
@vanroekel, thanks- sounds good. |
@pwolfram wrote:
Yes, I think you are right. I felt a little bit conflicted about unilaterally adding a develop branch. At the same time, I felt like there was some urgency since a solution was needed to prevent conflicts with ACME and to allow testing of a number of changes before pulling them into ACME. I was afraid the damage would be done before the discussion could take place. I'll create an issue page now. |
This is the first attempt at unit testing in the interpolate.py module. It tests all functionality within that bit of code using pytest Adds an arrayAssertEqual, greaterThan, and lessThan Fixes trailing white space and int to floats Justify issue fixed Updates testing Fixes a bug Small fix
8259e55
to
1325ca9
Compare
@pwolfram the small fix described above has been pushed. Should be ready for you to test once cscratch on edison is back. |
CSCRATCH will be down for a couple of days. Do we have any alternative for testing in the meantime? |
The files I used for testing are also on hpss at nersc. I will try download them to /scratch2/scratchdirs/lvroekel for @pwolfram to finish testing this PR |
@vanroekel, that's great. Can you give me permission to see your scratch folder? That way I can use the files to test #38 as well (once I rebase onto this branch). |
@xylar , I am still downloading data, as soon as it is all there (Should be done shortly), I'll alter permissions. Do you need more than the sea ice and ocean timeSeriesStats files? |
@xylar and @milenaveneziani I downloaded timeseriesstats ocean and sea-ice files to scratch2 for testing. Permissions should be okay for all to see, if not let me know. If you need additional files, please let me know as well. |
┌─[pwolfram][edison08][~/test_MA_vanroekel][19:58][±][ ✓]
└─▪ python run_analysis.py config.analysis.test
Plotting 2-d maps of SST climatologies...
Plotting 2-d maps of MLD climatologies... and output is visually indistinguishable from posted photos above in PR description. Thanks @vanroekel, @xylar, and @milenaveneziani for sticking together so well through these HPC challenges. Great teamwork everyone! |
This adds a MLD model vs observations plotting script (against the holte and talley data set). The previous sst_modelvsobs.py has been subsumed into ocean_modelvsobs.py. I also moved the shared interpolation code to its own module in mpas-analysis/shared
I've tried to reduce the code and computations as well. Previously, BIAS comparisons were calculated at every month and seasonally even if only a few were selected. Now only those times selected are computed.