-
Notifications
You must be signed in to change notification settings - Fork 53
Adds namelist and streams file interface #27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@xylar and @milenaveneziani, this is the start of a set of classes which we can used to read / write namelist and streams files. At this point it is probably "prototype" quality code. Please let me know what you think. I'm thinking we should use this as a "straw-man" to build out general capability to manipulate namelist and streams files. I'm putting this out here to stimulate discussion and as a starting point for the changes we need to generalize the code and fully expect many or all of the lines in this file to be rewritten or adapted to our needs. |
@xylar and @milenaveneziani, the thing we need to focus on here is the API for the classes that interface with the namelist and streams files, e.g., # get check if global stats is on
nl = Namelist(nlistpath)
dt = nl.read('config_AM_globalStats_enable')
# get name for mesh file
sf = XMLList(streamspath)
meshname = sf('mesh', 'filename_template') Once we have a good handle on this we should be able to write the necessary functionality that we need. |
@pwolfram: this sounds good to me. How do you suggest we should test it? With an example script, or by modifying one of the scripts that we already have to do plotting/analysis? |
@milenaveneziani, this brings up the large question of having unit tests. We could use pytest for that and essentially ensure that different parts of the code are doing precisely what they need to do to meet our requirements. There would be a new |
@milenaveneziani, I've pushed some changes that include unit tests via the pytest framework (similar to what xarray uses). Basically when you are in the folder you can type |
Note @xylar and @milenaveneziani, this is still somewhat rough but one end goal here is to get automatic testing for each PR via pytest to ensure that we don't accidental break functionality that is important as we modify the repo. We will likely want to modify the interfaces uses for the namelist and streams reader / writer. |
@@ -0,0 +1,91 @@ | |||
#!/usr/bin/python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be /usr/bin/env python
10/07/2016 | ||
""" | ||
|
||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
""" | ||
|
||
import os | ||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
self.setup_namelist(readonly=True) | ||
with self.assertRaisesRegexp(AssertionError, 'Cannot write to namelist file .* because readonly=True'): | ||
self.nl.write('config_dt', '00:00:00') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@requires_lxml
needed?
class XMLList: | ||
""" | ||
Class to read in streams configuration file, provdies | ||
read and write functionality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand why we would want write functionality as part of this repo. Can you suggest a case where writing or modifying a streams file might make sense?
#print command | ||
result = call(command,shell=True) | ||
|
||
class XMLList: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the class name is too general for the specific usage (or perhaps the description of the class is too specific)
self.write(self.fname+'.backup') | ||
|
||
def read(self, streamname, attribname): | ||
""" name is a list of name entries terminanting in some value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you fix the docstring? Makes no sense so me currently.
self.readonly = readonly | ||
self.xmlfile = etree.parse(fname) | ||
self.root = self.xmlfile.getroot() | ||
if backup: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do decide to keep write/modify functionality, I think backup=True only makes sense if readonly=False.
else: | ||
print "%s was not changed to %s because it didn't exist and we aren't setting new fields!"%(attribname, value) | ||
|
||
def write(self, fname=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems odd to me that write doesn't check if readonly==True, especially if fname=None or fname==self.fname.
|
||
def read(self, name): | ||
# shell return value | ||
return_val = check_output(['awk', '/'+name+'/{printf $3}', self.fname]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned in the PR description, it would be preferable to have pure python. I would think it would make more sense to read in the full file on init and create a dictionary from the names and values. Then, the various get functions (get, getint, getfloat, getbool) that I suggest we use instead of read would would just return the dictionary value, possibly with the appropriate casting.
return_val = return_val.strip('"').strip("'") | ||
return return_val | ||
|
||
def write(self, name, value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want namelist write functionality in this repo. @pwolfram, can you give me an example of where we might need this in this repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xylar, we may want to use this type of code to make edits for namelists / streams for automatic test cases. This was the context that this code was original written to address. I think there is an advantage to having this functionality, even if we don't plan to immediately use it because we want the O part of IO too in order to make this a general tool. For example, I can easily envision using MPAS-Analysis to setup/analyze test cases in the testing core and this would be useful in this endeavor.
@pwolfram wrote:
Can we change the API to be more like # get check if global stats is on
nl = Namelist(nlistpath)
dt = nl.getfloat('config_dt')
timeInteg = nl.get('config_time_integrator')
numHalos = nl.getint('config_num_halos')
explicitProcDecomp = nl.getbool('config_explicit_proc_decomp')
# get name for mesh file
sf = XMLList(streamspath)
meshname = sf.read('mesh', 'filename_template') |
@xylar, would we want to have a dictionary-like capability as well as the explicit calling functions? I don't see why not but it may be more elegant (and risky, however) to try to do automatic type-casting with output / access via a dictionary-like structure. |
@pwolfram, I assume you're still working on updating this PR. Let me know if you're waiting on anything from me. |
@xylar, you are correct-- I have not done anything on this since we chatted Monday. If I'm holding someone up I can increase priority on this and finish it ASAP. |
@pwolfram, @xylar: to put it into perspective, this PR, together with a future one on mpas_xarray, have higher priority with respect to anything else, because anything that went into alpha8 and that will go in alpha9 breaks the scripts. In alpha8, we have changed filenames. In alpha9, we will be changing timeSeriesStats instances, and the variable names will change as a consequence (this of course involves changes in this PR and in mpas_xarray/other python scripts). I think it would be good if we could solve these issues in the next couple of weeks, if possible. |
@xylar, I've updated the code to reflect our conversation earlier this week. Please let me know what you think. We should have read functionality for namelists and streamfiles now and have testing via pytests, which gets us one step away from CI testing for all PRs in the future. |
P.S. obviously commits need squashed but this can be done after you take a look and before the merge. cc @milenaveneziani |
'0100_00:00:00') | ||
|
||
|
||
# NOTE, MAY NEED TO SANITIZE NAMELIST AND STREAMS FILES A LITTLE BIT FOR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pwolfram, I think the example namelists and streams are okay. You can simplify them if you want. But I'd remove this comment either way.
@pwolfram, this looks good. I confirm that the tests seem to cover our bases and that they pass with the example namelist and streams file. If you would squash the commits and remove the note i mentioned above (after simplifying the streams file if you like), I will merge. I don't think there is a particular need for better type checking in this PR. It seems sufficient to me if type errors are raised when the various |
05546c5
to
d761229
Compare
d761229
to
b06dd4f
Compare
@xylar, I think this should be ready to merge following your quick double-check on the changes. |
Great, I'll take a look as soon as I can. |
Thanks @xylar! |
@pwolfram, I am going to merge this soon. In the future, could you make the description of the PR something that is appropriate as a commit message for the merge? This means it should not include references to other PRs by number and should describe what is in the PR, as opposed to what might be added to the PR. I have modified the commit message to remove/clean up these issues. |
@pwolfram, I made sure the merged branch passed the tests. The new code doesn't touch the existing analysis in any way so I didn't bother to test that the analysis itself still runs correctly. |
@pwolfram, please delete the remote branch, since I don't have permission. |
@xylar, thanks for the feedback on the PR description. I'll put checklists with an introduction of "Features of this merge include" and reference other issues in a comment outside the PR description. |
This is a prototype reader / writer for interfacing with namelist and streams and this will ultimately be needed to support generalization identified in #20.
A list of preliminary features (included or could be included):