Adds MLD analysis and reduces code in SST analysis #29

vanroekel · 2016-10-21T14:15:11Z

This adds a MLD model vs observations plotting script (against the holte and talley data set). The previous sst_modelvsobs.py has been subsumed into ocean_modelvsobs.py. I also moved the shared interpolation code to its own module in mpas-analysis/shared

I've tried to reduce the code and computations as well. Previously, BIAS comparisons were calculated at every month and seasonally even if only a few were selected. Now only those times selected are computed.

vanroekel · 2016-10-21T14:16:53Z

Testing:

I ran this on edison and my local laptop. A few images are plotted below. The directories are in the config.analysis within this PR.

I was not sure if I should commit changes to the config.analysis. Since I added options, I thought it was important.

pwolfram · 2016-10-21T14:29:00Z

config.analysis

+archive_dir_ocn = /global/cscratch1/sd/golaz/ACME_simulations/20160805.A_WCYCL1850.ne30_oEC.edison.alpha7_00/run/ 
+scratch_dir = /global/project/projectdirs/acme/lvroekel/20160805v0atm.A_WCYCL1850_v0atm.ne30_oEC.edison.alpha7_00.test.pp
+plots_dir = /global/project/projectdirs/acme/lvroekel/coupled_diagnostics_20160805v0atm.A_WCYCL1850_v0atm.ne30_oEC.edison.alpha7_00-20160520.A_WCYCL1850.ne30_oEC.edison.alpha6_01
+log_dir = /global/project/projectdirs/acme/lvroekel/coupled_diagnostics_20160805v0atm.A_WCYCL1850_v0atm.ne30_oEC.edison.alpha7_00-20160520.A_WCYCL1850.ne30_oEC.edison.alpha6_01.logs


I don't think we want to change the defaults here, although it probably doesn't fully matter as long as particular entries aren't removed. However, it would be easy in the future to accidentally remove entires. @xylar should comment on this too.

pwolfram · 2016-10-21T14:29:27Z

config.analysis

@@ -42,10 +43,10 @@ climo_yr2 = 10
 yr_offset = 1849

 [ohc_timeseries]
-generate = 1
+generate = 0


Same idea for changing logical flags-- I don't think these should be edited unless absolutely necessary.

pwolfram · 2016-10-21T14:30:50Z

mpas_analysis/ocean/ocean_modelvsobs.py

+from ..shared.interpolation.interpolate import interp_fields, init_tree
+
+def mld_modelvsobs(config):
+


Missing doc string describing function and its purpose.

pwolfram · 2016-10-21T14:31:23Z

mpas_analysis/ocean/ocean_modelvsobs.py

+    infiles = "".join([indir,"/am.mpas-o.timeSeriesStats.????-*.nc"])
+
+    monthdictionary={'Jan':1,'Feb':2,'Mar':3,'Apr':4,'May':5,'Jun':6,'Jul':7,'Aug':8,'Sep':9,'Oct':10,
+                         'Nov':11,'Dec':12,'JFM':np.array([1,2,3]),'AMJ':np.array([4,5,6]),'JAS':np.array([7,8,9]),


First line non-white space characters should be aliged with { above.

pwolfram · 2016-10-21T14:31:57Z

mpas_analysis/ocean/ocean_modelvsobs.py

+
+      # Load in model data
+    ds = xr.open_mfdataset(infiles, preprocess=lambda x: preprocess_mpas(x, yearoffset=yr_offset,
+                                timeSeriesStats=True,


Same issue with indentation as above (timeSeriesStats should be left justified with inflies

pwolfram · 2016-10-21T14:32:19Z

mpas_analysis/ocean/ocean_modelvsobs.py

+    lonCell = f.variables["lonCell"][:]
+    latCell = f.variables["latCell"][:]
+
+# Load observation MLD data


Comment indentation off from code indentation

There are other places like this in this document too that have this issue.

pwolfram · 2016-10-21T14:32:50Z

mpas_analysis/ocean/ocean_modelvsobs.py

+
+    dinmonth = np.array([31,28,31,30,31,30,31,31,30,31,30,31])
+
+    daysarray = np.ones((12,dsData.mld_dt_mean.values.shape[1],dsData.mld_dt_mean.values.shape[0]))


Spaces are needed after commas. This is an issue that is throughout PR.

pwolfram · 2016-10-21T14:33:31Z

mpas_analysis/ocean/ocean_modelvsobs.py

+
+        else:
+            modeldata = np.sum(dinmonth[monthsvalue-1]*monthly_clim.sel(month=monthsvalue).time_avg_dThreshMLD.values.T,axis=1) / np.sum(dinmonth[monthsvalue-1])
+            obsdata = np.nansum(daysarray[monthsvalue-1,:,:]*dsData.sel(iMONTH=monthsvalue).mld_dt_mean.values.T,axis=0) / np.nansum(daysarray[monthsvalue-1,:,:],axis=0)


Probably need a line break on these really long lines

pwolfram · 2016-10-21T14:35:18Z

mpas_analysis/ocean/ocean_modelvsobs.py

+            diffTitle = "Model-Observations",
+            cbarlabel = r"$^o$C")    
+
+


Can you please provide a main type function to help modularize this and allow it to be run in script mode with command line arguments? This helps generalize the code and increase its utility, e.g., something like

if __name__ == "__main__": # process command line arguments and feed to main below main()

@pwolfram could you comment on a use case that wouldn't be covered by disabling other diagnostics except these in the run_analysis? I could do this, but I can't see a pressing use.

pwolfram · 2016-10-21T14:36:54Z

mpas_analysis/shared/interpolation/interpolate.py

@@ -0,0 +1,47 @@
+import numpy as np


Please see comments about docstring above. It would also be good to include some testing functions in this module too so that we can verify it is working properly. This can be as simple as functions with name test_* that are called in a main file when this is run as a script. Assert statements can be used to ensure desired behavior.

We need a docstring on this module.

We probably also need tests although we can wait on that if necessary for now.

pwolfram · 2016-10-21T14:37:43Z

run_analysis.py

@@ -86,8 +86,14 @@ def analysis(config): #{{{
    if config.getboolean('sst_modelvsobs','generate'):
        print ""
        print "Plotting 2-d maps of SST climatologies..."
-        from mpas_analysis.ocean.sst_modelvsobs import sst_modelvsobs
+        from mpas_analysis.ocean.ocean_modelvsobs import sst_modelvsobs


I see now-- you are generalizing the above function. Very good.

pwolfram · 2016-10-21T14:38:03Z

@xylar, my understanding is that new analysis should have registered additions to config.analysis, right? However, I'm not sure we want to have changes to default values unless absolutely necessary. What we do need are addition of new variables, right?

I'm guessing a good workflow here would be to include changes to config.analysis that are required for testing in a gist. But I'm not 100% sure if that is the best way to proceed.

pwolfram · 2016-10-21T14:39:29Z

@vanroekel, I've added some comments to the code. Feel free to not take the style comments too seriously for now (e.g., white space issues, etc.). I think we do want more doc strings to document the code than currently exist, however. I think we will want to squash some commits prior to merging this too.

pwolfram · 2016-10-21T14:45:19Z

mpas_analysis/ocean/ocean_modelvsobs.py

+
+    infiles = "".join([indir,"/am.mpas-o.timeSeriesStats.????-*.nc"])
+
+    monthdictionary={'Jan':1,'Feb':2,'Mar':3,'Apr':4,'May':5,'Jun':6,'Jul':7,'Aug':8,'Sep':9,'Oct':10,


Things like this should probably be in a shared constants location.

Maybe, shared/constants.py or some such would work

pwolfram · 2016-10-21T14:48:49Z

mpas_analysis/ocean/ocean_modelvsobs.py

+    latDATA = latDATA.flatten()
+    lonDATA = lonDATA.flatten()
+
+    dinmonth = np.array([31,28,31,30,31,30,31,31,30,31,30,31])


Probably should be in constants file

vanroekel · 2016-10-21T17:45:30Z

Okay @pwolfram, I've updated everything except the main function issue. I may have missed a few other minor formatting things perhaps.

pwolfram · 2016-10-21T21:38:49Z

@vanroekel, can you please add some tests to mpas_analysis/shared/interpolation/interpolate.py if possible? Can you also please let me know where you are testing on edison and I'll double check everything is working?

vanroekel · 2016-10-21T21:46:40Z

@pwolfram the config.analysis script I used on edison is at
'''
/global/homes/l/lvroekel/MPAS-Analysis/config.analysis
'''

I will try add tests to interpolate.py soon. I still need to read through your example and the xarray example.

pwolfram · 2016-10-21T22:54:21Z

@vanroekel, I think that you just need functions of the form test_* in the interpolate.py file. Pytest should then run this function and make sure it exits cleanly, e.g., you need assert statements in the function. If you write the tests (even in pseudo-code) I'll help you get them edited so that pytest can use them.

pwolfram · 2016-10-21T22:55:51Z

FYI, I will postpone testing for now until we more or less have the code we want to commit to ensure that the full testing is as close to the final version as possible.

pwolfram · 2016-10-23T03:13:45Z

@vanroekel, one question / idea- would it be possible to further generalize the *modelvobs* functions with respect to sea ice too? I took a quick look at this today and it looks like there may be even more opportunities for simplification to reduce the total amount of code.

cc @milenaveneziani

vanroekel · 2016-10-23T03:51:00Z

@pwolfram and @milenaveneziani, looking at the sea-ice code, I think it certainly may be possible to generalize all of these into one. It fits with how I was eventually envisioning the SST and MLD comparisons as a plot2d_fields_comparison, where no data field is assumed or if it is model to obs. I think the simplification would take some thought as there seem to be enough differences in the sea-ice and ocean plotting scripts.

milenaveneziani · 2016-10-24T07:30:25Z

@vanroekel, @pwolfram: first of all, thank you for working on this. It's great to have a general routine to do global 2d climatological maps.

There are a few reasons why I think we are not ready to generalize the sea-ice script as well:

We need a shared script that does climatologies. It should be flexible enough that it does various seasons (for sea-ice, for example, we need JFM, but also DJF and FM).
The interpolation routine should allow for a choice of interpolation techniques. The sea-ice script for example uses ncremap, and the SST script uses the python interpolation (not sure if this changed in the mean time). We should allow for both methods (and possibly more options), and then we should stick with one for all of our diagnostics.
We need to decide what to do in terms of observational data processing. When we talked about this, I think we decided that the data should be pre-processed (pre-compute climatologies, and re-grid). I can't remember if this is an issue in this case: @vanroekel, is the SST/MLD data set pre-processed here?
With sea-ice, we are plotting 2 polar hemispheres on a steoreographic projection. With the global modelvsobs we are plotting on a global projection and need less input variables.
So, it seems to me that eventually we may be able to unify the ocean and sea-ice scripts, but perhaps not in the immediate future (in the sense that we need to put out other fires first).
Do you agree?

vanroekel · 2016-10-24T14:54:11Z

@milenaveneziani these are some of the exact pieces I was thinking about. The SST/MLD data is not preprocessed. For merging with sea-ice this would have to change. Not a hard change though.

I would tend to agree that merging sea-ice and ocean model vs obs is a longer term goal, but should be a goal.

pwolfram · 2016-10-24T20:28:19Z

@vanroekel and @milenaveneziani, should we just merge the existing code once we are all happy with this PR? I'm ok waiting to do a more generalized aggregation of the scripts and at this point agree that it is wise to wait for the generalization, which we can keep track of in #31.

vanroekel · 2016-10-24T20:38:35Z

@pwolfram I would be in favor of merging this PR prior to generalization. I'm working on the unit testing now. So I'd like to wait for these additions.

milenaveneziani · 2016-10-24T20:46:24Z

@pwolfram: I would also merge this prior to generalization.
@vanroekel: I think you mentioned that you also tested this within the ACME analysis framework, right?

pwolfram · 2016-10-26T04:46:22Z

@vanroekel, thanks for moving the testing file. pytest checks out now.

pwolfram · 2016-10-26T04:46:55Z

Also, this points to the need for CI regression testing as we have discussed earlier now that we are getting some regression tests.

vanroekel · 2016-10-26T04:56:47Z

@pwolfram I gave you the wrong path in the config.analysis file, see the new version in same location. The path to files should be

'''
/global/cscratch1/sd/golaz/ACME_simulations/20160805.A_WCYCL1850.ne30_oEC.edison.alpha7_00/run/
'''

milenaveneziani · 2016-10-26T04:57:59Z

@vanroekel: are you asking to move the MLD observation data to edison? Sorry if I haven't done this already, but you only have the data on IC, right? I think I also have it at OLCF, so I can look into that.
@pwolfram: that directory of ACME model output files no longer exists at NERSC. We have used a different filepath recently (in run_EDISON_mv or also in Phillip's version, I believe).

Having said that, as I mentioned to Luke already, we should not test any of this within the ACME repo just yet (especially now that we are dealing with a troublesome merge to master). I think the workflow in the future should be similar to that of MPAS and ACME main codes: we make a bunch of changes in MPAS-Analysis, test them thoroughly, update the ACME analysis repo locally and do more testing, then merge to the ACME analysis master. Or at least this seems like the safer way to me.

vanroekel · 2016-10-26T05:03:23Z

@milenaveneziani, the data is on edison at /global/homes/l/lvroekel/MLD

@pwolfram, there is an issue in the code right now in testing on edison. I'm debugging.

vanroekel · 2016-10-26T05:12:44Z

@pwolfram, confirmed the code just pushed works on edison.

xylar · 2016-10-26T06:48:03Z

@milenaveneziani wrote:

I think the workflow in the future should be similar to that of MPAS and ACME main codes: we make a bunch of changes in MPAS-Analysis, test them thoroughly, update the ACME analysis repo locally and do more testing, then merge to the ACME analysis master. Or at least this seems like the safer way to me.

I agree that this approach is needed. One way to handle this without disrupting our workflow would be to introduce a develop branch, and to treat master as a release branch (or rename it). Matt Hoffman might have suggestions for how to make this happen (but he's not on this repo yet).

xylar · 2016-10-26T08:24:47Z

I went ahead and created a develop branch. I figured it can't do any harm. I'm going to change this PR to merge into develop for now, Let me know if that worries any of you.

pwolfram · 2016-10-27T01:42:43Z

@xylar, in the future we all should socialize major changes to the repository with the group, e.g., the introduction of the develop branch. This is our best practices and although this isn't a big issue here there could be good reasons why we would not want to make unilateral changes like this without getting further buy-in from the whole group. It is possible this already happened and there was consensus at a meeting I missed this week because of traveling. If so, my apologies about needlessly raising this issue. Either way, this is not a big deal, but I think we will get a better product if we do our best to socialize larger changes. Please let me know if I'm not following this practice too.

Just realized the above line by @xylar was the buy-in step. However, perhaps we should formalize these things as github issues for discussion in the future to make things a little more transparent.

Note, for the record, @xylar's idea of develop is a really good idea because it provides us an additional safety net that we can use to resolve serious issues if something goes wrong prior to merging to master.

pwolfram · 2016-10-27T01:46:43Z

I'll be merging to develop for this PR following testing. @vanroekel, permissions are broken again for the config file at/global/homes/l/lvroekel/config.analysis and I'm guessing your cached data has this same issue and needs reset permissions. Essentially I just want to make sure I'm using the right configuration file on edison to test this prior to this merge.

vanroekel · 2016-10-27T03:17:34Z

@pwolfram sorry for the permissions issue. Try now.

pwolfram · 2016-10-27T05:00:40Z

@vanroekel, does this error make sense to you?

┌─[pwolfram][edison03][~/test_MA_vanroekel][21:46][±][develop ✓]
└─▪ python run_analysis.py config.analysis.test 

Plotting 2-d maps of SST climatologies...
Traceback (most recent call last):
  File "run_analysis.py", line 133, in <module>
    analysis(config)
  File "run_analysis.py", line 90, in analysis
    ocn_modelvsobs(config, 'sst')
  File "/global/u2/p/pwolfram/test_MA_vanroekel/mpas_analysis/ocean/ocean_modelvsobs.py", line 153, in ocn_modelvsobs
    if np.issubdtype(monthsvalue, int):
  File "/global/homes/p/pwolfram/miniconda2/lib/python2.7/site-packages/numpy/core/numerictypes.py", line 761, in issubdtype
    return issubclass(dtype(arg1).type, val)
TypeError: data type not understood

vanroekel · 2016-10-27T05:03:37Z

@pwolfram, yes, I had this and thought I fixed it. This line (153)

np.issubdtype(monthsvalue, int):

should be

isinstance(monthsvalue, (int,long))

I had fixed this and thought I pushed changes, but obviously not. I can clean this up tomorrow morning

pwolfram · 2016-10-27T05:04:20Z

@vanroekel, thanks- sounds good.

xylar · 2016-10-27T07:18:27Z

@pwolfram wrote:

However, perhaps we should formalize these things as github issues for discussion in the future to make things a little more transparent.

Yes, I think you are right. I felt a little bit conflicted about unilaterally adding a develop branch. At the same time, I felt like there was some urgency since a solution was needed to prevent conflicts with ACME and to allow testing of a number of changes before pulling them into ACME. I was afraid the damage would be done before the discussion could take place. I'll create an issue page now.

This is the first attempt at unit testing in the interpolate.py module. It tests all functionality within that bit of code using pytest Adds an arrayAssertEqual, greaterThan, and lessThan Fixes trailing white space and int to floats Justify issue fixed Updates testing Fixes a bug Small fix

vanroekel · 2016-10-27T16:52:08Z

@pwolfram the small fix described above has been pushed. Should be ready for you to test once cscratch on edison is back.

xylar · 2016-10-27T17:31:24Z

CSCRATCH will be down for a couple of days. Do we have any alternative for testing in the meantime?

vanroekel · 2016-10-27T17:38:20Z

The files I used for testing are also on hpss at nersc. I will try download them to /scratch2/scratchdirs/lvroekel for @pwolfram to finish testing this PR

xylar · 2016-10-27T17:50:15Z

@vanroekel, that's great. Can you give me permission to see your scratch folder? That way I can use the files to test #38 as well (once I rebase onto this branch).

vanroekel · 2016-10-27T19:19:04Z

@xylar , I am still downloading data, as soon as it is all there (Should be done shortly), I'll alter permissions. Do you need more than the sea ice and ocean timeSeriesStats files?

vanroekel · 2016-10-27T19:25:23Z

@xylar and @milenaveneziani I downloaded timeseriesstats ocean and sea-ice files to scratch2 for testing. Permissions should be okay for all to see, if not let me know. If you need additional files, please let me know as well.

pwolfram · 2016-10-28T03:09:56Z

┌─[pwolfram][edison08][~/test_MA_vanroekel][19:58][±][ ✓]
└─▪ python run_analysis.py config.analysis.test 

Plotting 2-d maps of SST climatologies...

Plotting 2-d maps of MLD climatologies...

and output is visually indistinguishable from posted photos above in PR description.

Thanks @vanroekel, @xylar, and @milenaveneziani for sticking together so well through these HPC challenges. Great teamwork everyone!

pwolfram reviewed Oct 21, 2016

View reviewed changes

pwolfram self-assigned this Oct 21, 2016

pwolfram reviewed Oct 21, 2016

View reviewed changes

pwolfram mentioned this pull request Oct 24, 2016

Unification of ocean / seaice scripts #31

Closed

vanroekel force-pushed the mpasmodelvsobs branch from f9533d9 to 8259e55 Compare October 26, 2016 05:11

xylar changed the base branch from master to develop October 26, 2016 08:25

milenaveneziani mentioned this pull request Oct 26, 2016

Read paths from streams and config from namelist #38

Merged

xylar mentioned this pull request Oct 27, 2016

develop branch #39

Closed

vanroekel force-pushed the mpasmodelvsobs branch from 8259e55 to 1325ca9 Compare October 27, 2016 16:51

pwolfram merged commit 1325ca9 into MPAS-Dev:develop Oct 28, 2016

pwolfram mentioned this pull request Dec 13, 2016

Modular Design #8

Closed

		from ..shared.interpolation.interpolate import interp_fields, init_tree

		def mld_modelvsobs(config):


		dinmonth = np.array([31,28,31,30,31,30,31,31,30,31,30,31])

		daysarray = np.ones((12,dsData.mld_dt_mean.values.shape[1],dsData.mld_dt_mean.values.shape[0]))


		infiles = "".join([indir,"/am.mpas-o.timeSeriesStats.????-*.nc"])

		monthdictionary={'Jan':1,'Feb':2,'Mar':3,'Apr':4,'May':5,'Jun':6,'Jul':7,'Aug':8,'Sep':9,'Oct':10,

Adds MLD analysis and reduces code in SST analysis #29

Adds MLD analysis and reduces code in SST analysis #29

Uh oh!

Conversation

vanroekel commented Oct 21, 2016

Uh oh!

vanroekel commented Oct 21, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pwolfram commented Oct 21, 2016

Uh oh!

pwolfram commented Oct 21, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vanroekel commented Oct 21, 2016

Uh oh!

pwolfram commented Oct 21, 2016

Uh oh!

vanroekel commented Oct 21, 2016

Uh oh!

pwolfram commented Oct 21, 2016

Uh oh!

pwolfram commented Oct 21, 2016

Uh oh!

pwolfram commented Oct 23, 2016

Uh oh!

vanroekel commented Oct 23, 2016

Uh oh!

milenaveneziani commented Oct 24, 2016

Uh oh!

vanroekel commented Oct 24, 2016

Uh oh!

pwolfram commented Oct 24, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vanroekel commented Oct 24, 2016

Uh oh!

milenaveneziani commented Oct 24, 2016

Uh oh!

pwolfram commented Oct 26, 2016

Uh oh!

pwolfram commented Oct 26, 2016

Uh oh!

vanroekel commented Oct 26, 2016

Uh oh!

milenaveneziani commented Oct 26, 2016

pwolfram commented Oct 24, 2016 •

edited

Loading

pwolfram commented Oct 27, 2016 •

edited

Loading

vanroekel commented Oct 27, 2016 •

edited

Loading

pwolfram commented Oct 28, 2016 •

edited

Loading