Design Document: Config File Reorganization #91


Merged
merged 7 commits into MPAS-Dev:develop on Feb 10, 2017

Conversation

xylar
Collaborator

@xylar xylar commented Jan 30, 2017

A design document describing reorganization of the config file for increased clarity and to make it easier to control which analyses are run.

I decided not to go too far with changes to ACME-specific options (casename and ref_casename_v0) in this reorganization.

I decided to leave `yr_offset` alone for this PR, since I think any changes there are going to need their own separate discussion and PR(s).
@xylar
Collaborator Author

xylar commented Jan 30, 2017

Nice! @pwolfram, I think this process will work well for design docs. If you view the changes in my second commit as a "rich diff", you can see the edits like you would with Track Changes in Word. Also, it's easy to edit the markdown directly on GitHub once you've created the branch.

@pwolfram
Contributor

@xylar, the rich diff view is awesome and is probably what we want anyway!
(screenshot: rich diff view, 2017-01-31)

Contributors: Xylar Asay-Davis

There should be a simple, intuitive method for turning on and off individual analysis modules (e.g. `ocean/ohc_timeseries`). This should replace the current approach of having a boolean `generate` flag for each analysis module in a separate config section. Preferably, there should be an equivalent method for turning on and off analysis modules from the command line that overrides that in the config file.
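As a sketch of what this might look like (the section and option names here are hypothetical, not the final design):

```
# current approach: one boolean flag per analysis module, each in its own section
[ohc_timeseries]
generate = True

[sst_modelvsobs]
generate = False

# proposed approach: a single list that controls all modules at once
[output]
generate = ['all', 'no_ocean', 'all_timeseries']
```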
Contributor

> there should be an equivalent method for turning on and off analysis modules from the command line that overrides that in the config file.

Do we really need / want this functionality? I can see where it would be useful for testing but I don't think it is essential for 95% of our use cases. I'd skip it unless it is quick / easy to add.

Collaborator

@milenaveneziani milenaveneziani Jan 31, 2017

@pwolfram: I think this was one of the ideas @xylar had to fully resolve #14. I think it would be nice to have it.
btw, the rich view thing is very nice, but then these review comments disappear.

Contributor

@milenaveneziani, thanks for bringing #14 back to my attention. It would be good to address it in this design if possible.

Collaborator Author

@pwolfram and @milenaveneziani, hopefully you both want this functionality because I already implemented it in #86.


The current example config file is specific to a run on Edison and should be made into a general template. Simplifications should be made to the template so that it can more easily and intuitively be modified for several analyses. Example config files should also be added for analyzing several existing runs on several different machines.
Contributor

I think the current config is hacked together from multiple sources and am not 100% sure it is strictly Edison. Maybe we want to reword:

> The current example config file is specific to a run on Edison and should be made into a general template.

Collaborator Author

Okay, I think this is pretty nit picky but I'll change it.


To the extent possible, ACME-specific config options such as `casename` and `ref_casename_v0` should be generalized in a way that is also appropriate for MPAS-only runs.
Contributor

If done right, MPAS-Analysis could be even more general than MPAS. Basically, we are just pointing the analysis to paths and then plotting variables based on some assumption of grid variables existing. This could potentially be generalized away from MPAS, similar to our renaming of namelist and stream variables, if possible. I'd make the following change

> To the extent possible, ACME-specific config options such as casename and ref_casename_v0 should be generalized in a way that is also appropriate for MPAS-only runs.

Collaborator Author

Oh, I have no interest in going down that path. This is MPAS analysis and I have no interest in supporting anything more general. This is how you get code that is so generalized it's completely unreadable and useful to no one. We can discuss this philosophy tomorrow but I'm definitely opposed to trying to generalize beyond MPAS and ACME runs.

Contributor

I think the phrase "in a way that is also appropriate for MPAS-only runs." is somewhat ambiguous because there could be different uses for different cores and this assumes some type of particular codified standard, which may or may not exist. I'm not suggesting that we fully generalize this in a way that isn't helpful, just that we avoid unnecessarily hard coding things that don't absolutely have to be hard coded.

Collaborator Author

Okay, fair enough regarding generalization.

I will reword "in a way that is also appropriate for MPAS-only runs" rather than getting rid of it. To me it clarifies what the scope of MPAS-Analysis needs to be: not just ACME runs but also any other runs involving the MPAS components we support.


Currently, there is an option `yr_offset` that is intended to apply to all dates and observations. This option should be removed and possibly replaced by an input and output reference date for each model run and observational data set. Discussion is needed on whether these reference dates are actually necessary once we find a more generalized calendar (e.g. `netcdftime`).
Contributor

Great!

Collaborator Author

Hmm, I'm going to take this out of this design doc. I think getting rid of yr_offset is too big a task for this project. I already removed that change from the template in #86.

Contributor

We need to figure out what to do about yr_offset and related things like that because this appears to be a potential problem that may be causing unnecessary roadblocks.

Collaborator Author

Yes, absolutely! But not here.


The following comment describes the planned implementation in the config file.
```
# a list of which analyses to generate. Valid names are:
```

```
./run_analysis.py config.analysis --generate all,no_ocean,all_timeseries
```
(I am open to other syntax.) If the `--generate` flag is used on the command line, it will replace the `generate` option in the config file.
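As a sketch of how such a `generate` list might be evaluated (the entry prefixes and the processed-in-order semantics are my assumptions here, not necessarily the actual implementation in #86):

```python
# Hypothetical sketch of evaluating a generate list. Entries are
# scanned in order, so later entries override earlier ones; the
# 'all_'/'no_' prefixes and tag matching are assumptions.

def check_generate(generate_list, analysis_name, tags):
    """Return True if the named analysis should run."""
    generate = False
    for entry in generate_list:
        if entry == 'all':
            generate = True
        elif entry.startswith('all_'):
            # e.g. 'all_timeseries' enables every analysis tagged 'timeseries'
            if entry[4:] in tags:
                generate = True
        elif entry == 'no_' + analysis_name:
            generate = False
        elif entry.startswith('no_'):
            # e.g. 'no_ocean' disables every analysis tagged 'ocean'
            if entry[3:] in tags:
                generate = False
        elif entry == analysis_name:
            generate = True
    return generate

# 'all,no_ocean,all_timeseries': turn everything on, turn ocean
# analyses off, then turn ocean time series back on
flags = ['all', 'no_ocean', 'all_timeseries']
print(check_generate(flags, 'ohc_timeseries', {'ocean', 'timeseries'}))  # True
print(check_generate(flags, 'sst_modelvsobs', {'ocean', 'modelvsobs'}))  # False
```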
Contributor

Do we want the command line options to fully overwrite the config options? If so, should an effective config file be generated for documentation purposes?

Collaborator Author

To the first question, yes, absolutely. How else would the outcome be clear? If you just concatenated the command line options on the end of the list of options in the config file, it would make things much more complex rather than simpler.

Yes to the second one. We should do this anyway because there are already a few config options that are generated in the code if they aren't present in the config file. But this is not an issue for this PR to address in my opinion.
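For what it's worth, dumping an "effective" config after overrides are applied could be done with Python's stock `configparser` (a sketch; the section and option names are placeholders):

```python
import configparser
import io

# Build the effective configuration: start from the config file's
# options, then apply any command-line overrides on top.
config = configparser.ConfigParser()
config.read_dict({'output': {'generate': "['all']"}})

# Hypothetical command-line override replacing the config-file value
config.set('output', 'generate', "['all', 'no_ocean', 'all_timeseries']")

# Write the effective config out for provenance
buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())
```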

Contributor

> We should do this anyway because there are already a few config options that are generated in the code if they aren't present in the config file. But this is not an issue for this PR to address in my opinion.

I think this type of thing needs an issue so that we don't forget that this is something we ultimately need to do to keep the repo clean. Are you opposed to having an issue for output of a provenance-useful config file that can be used to reproduce the analysis? We could potentially check off #23 and/or #35 this way.

Collaborator Author

Yes, I agree that such an issue should be added. Could you take care of adding it?

Contributor

See the soon-to-exist #94.


This item needs some discussion. In #86, I would like to rename `ref_casename_v0` to simply `ref_casename`. `casename` and `ref_casename_v0` are used for file names and figure titles (essentially legends). It would be useful to discuss what the relevant equivalents would be for standalone MPAS runs.
Contributor

Obviously needs discussion in our call

Collaborator Author

@milenaveneziani had a solution that seems reasonable (with a bit of adaptation). We could have `runName` for the current run, `referenceRunName` for a reference MPAS or ACME run, and `preprocessedReferenceRunName` (or something similar) to replace `ref_casename_v0` where the data doesn't come from an MPAS run and therefore has been preprocessed.

I am definitely open to alternative solutions but that one works well for me.
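For illustration, that proposal might look like this in the config file (the section name and values are placeholders, not part of the proposal itself):

```
[runs]
# name of the current run (used in file names and figure titles)
runName = myRun
# name of a reference MPAS or ACME run to compare against (or None)
referenceRunName = None
# name of a preprocessed (non-MPAS) reference run, replacing ref_casename_v0
preprocessedReferenceRunName = myOldRun
```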


This will be difficult to test with CI since we currently don't have CI that can perform tests with `run_analysis.py`.
Contributor

@xylar, is this something we need, e.g., CI testing for run_analysis.py? I think it is hard to test a full workflow with output that is a png and don't see any clear way to do this.

Collaborator Author

Right, I think that's not really the point of CI anyway. You want to break the code into units as opposed to trying to test the whole thing at once.

Still, until we have some example config files to use as "standard" test runs, our testing will remain pretty arbitrary.

Contributor

Note, we can use coveralls, landscape, etc. to point to where we need to write tests if this becomes an issue.

Collaborator Author

I'm afraid I didn't follow any of that. coveralls and landscape are libraries? Could you clarify what you have in mind?

Contributor

Coveralls and landscape are GitHub "plug-ins" that can be run automatically, like CI, to provide diagnostics about code quality. They can indicate parts of the code that may need further testing to ensure a robust and maintainable code base. At present, I'm not suggesting we use them, just that we keep them in mind if we need to identify insufficient test coverage.

Collaborator Author

Ah, good to know! Something to look into down the road, I agree.


This will be difficult to test with CI since we currently don't have CI that can perform tests with `run_analysis.py`.

Instead, I will manually test a variety of combinations of `generate` lists (both in the config file and on the command line). I will list the tests I perform in #86.
Contributor

We should be able to test the generate functionality with example config files in CI. We don't need run_analysis.py to do this check, just a config file and the parsers for the config file. This seems to be the right way to test and ensure that functionality isn't lost in the long-term.

Collaborator Author

I think you do need one of the functions out of run_analysis.py (check_generate: https://github.com/MPAS-Dev/MPAS-Analysis/pull/86/files#diff-199f356fab800f1dc95e9c7c5567146cR38) but that would be pretty easy to use in some CI testing. I'll add that to the design doc and also to the code in #86 when I can.
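A minimal CI test along these lines could parse a sample config and exercise the generate logic directly, without touching run_analysis.py (a sketch; the section name is assumed, and the parsing helper is illustrative):

```python
import ast
import configparser

def get_generate_list(config_text):
    """Parse the generate option out of config text into a Python list,
    as a CI test could do without running the full analysis."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    return ast.literal_eval(config.get('output', 'generate'))

sample = """
[output]
generate = ['all', 'no_ocean', 'all_timeseries']
"""

assert get_generate_list(sample) == ['all', 'no_ocean', 'all_timeseries']
```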


The analysis will be tested on both MPAS and ACME runs to make sure the resulting image files have useful file names and titles (legends).
Contributor

How will you assess success for this goal? Naming is a subjective process, at least for the best names. This seems like something that will require some group discussion.

Collaborator Author

Yes, of course!


A consistent convention of capitalization and underscores should be used throughout the config file.

Collaborator Author

@pwolfram and @milenaveneziani, could you please let me know if this requirement makes sense to you? If so, I would like to begin to implement it.

Collaborator

sounds good to me

```
# option:
# ./run_analysis.py config.analysis --generate \
# all,no_ocean,all_plotTimeSeries
generate = ['all']
```
Collaborator Author

This demonstrates how I'd like to rename the different analysis scripts using CamelCase. @pwolfram and @milenaveneziani, again, I want to make sure you're good with this before I dive into implementation.


Implementation of the `config.template` file can be found [here](https://github.com/xylar/MPAS-Analysis/blob/5d5f64bde6ecf1d71f375a61783ff30f1654df01/config.template).

Collaborator Author

@milenaveneziani and @pwolfram, could you take a careful look at the latest template and make sure you're okay with it?
https://github.com/xylar/MPAS-Analysis/blob/5d5f64bde6ecf1d71f375a61783ff30f1654df01/config.template
I am going to go ahead with implementation of the necessary changes in the code itself today but want to address any concerns as soon as possible before I get too far.

Contributor

Minor comments:

For https://github.com/xylar/MPAS-Analysis/blob/5d5f64bde6ecf1d71f375a61783ff30f1654df01/config.template#L157 -- we have the option to compare against a model, which is commented out. I think this is ok for now but just want to make sure we keep that option open for the future if possible.

It would be good to get to the bottom of https://github.com/xylar/MPAS-Analysis/blob/5d5f64bde6ecf1d71f375a61783ff30f1654df01/config.template#L299, but this is not essential.

I'd say go for it!

Collaborator Author

@pwolfram, I'll look into the second issue when I get to working on ocean.modelvsobs. My guess is the last contour value just isn't being used, which would make the diff plots weirdly asymmetric in color space. I'll update the template somehow if that's the case or ask @vanroekel about it.

Collaborator

The template looks good, @xylar.
About the first issue @pwolfram brings up: I think compareWithModel could safely go at some point, because we will be relying on preprocessedRunName and referenceRunName, whether they are None or not, to plot results from another model. Let's keep it there for now though.

@xylar
Collaborator Author

xylar commented Feb 2, 2017

@pwolfram and @milenaveneziani, I guess this can get merged just after #86, because then we know it's served its purpose and we're not likely to do further editing.

@pwolfram
Contributor

pwolfram commented Feb 8, 2017

@xylar and @milenaveneziani, LGTM.

@milenaveneziani milenaveneziani merged commit 18bb5d1 into MPAS-Dev:develop Feb 10, 2017
@xylar xylar deleted the design_doc_config_reorg branch February 23, 2017 10:52