pwolfram
Contributor

Adds an XML file to keep track of documentation metadata and a script to parse it into an RST table for use in auto-generating the observation documentation.

@@ -0,0 +1,9 @@
Observations
==========
Collaborator

Needs to be longer (same number of equals signs as characters in Observations).

reading sources... [100%] tier1a                                                
/home/xylar/code/mpas-work/analysis/adds_observational_xml/docs/index.rst:34: WARNING: toctree contains reference to nonexisting document 'mpascice'
/home/xylar/code/mpas-work/analysis/adds_observational_xml/docs/observations.rst:2: WARNING: Title underline too short.

Observations
==========
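
One way to avoid this class of warning is to generate the underline from the title rather than typing it by hand. The following is a minimal sketch, not part of the PR; `rst_heading` is a hypothetical helper name.

```python
def rst_heading(title, underline_char="="):
    """Return an RST section title whose underline matches the title length."""
    # Sphinx warns "Title underline too short" when the underline has fewer
    # characters than the title, so the length is derived from the title.
    return "{}\n{}\n".format(title, underline_char * len(title))


if __name__ == "__main__":
    print(rst_heading("Observations"), end="")
```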

Collaborator

@xylar left a comment

Great! I think this will get us started.

A few questions emerge that may or may not be worth addressing in this PR:

  • How would we specify multi-line entries if we wanted to?
  • Would it be easy to modify the parser so it would only pull out those entries with a specific value for a specific tag (e.g. component == 'ocean')? This would allow us to make separate tables from a single XML file.
  • How would we specify text plus URL for a proper hyperlink in the XML so we get the expected result in the table (i.e. the text has a link to the URL rather than the URL being after the text)?

A few thoughts that I think we should address in a separate PR:

  • I don't think division into Tier 1a, Tier 1b, Tier 2, etc. is useful outside of E3SM so I would just lump all the observations into one table per component instead.
  • I don't think we want to store the table of observations in the docs, so I would move it to mpas_analysis/obs or something similar. We want it to be included in the mpas_analysis package.
  • We probably want to break the URLs out separately from the name of the data set. Also, we probably want a link directly to the data whenever available.

Collaborator

@xylar left a comment

Please do some PEP8 clean-up on the new script.

@xylar
Collaborator

xylar commented Apr 17, 2018

> Would it be easy to modify the parser so it would only pull out those entries with a specific value for a specific tag (e.g. component == 'ocean')? This would allow us to make separate tables from a single XML file.

On second thought, it will be simple enough to have a separate XML file for each component. I don't see a downside to doing that. So no special parsing should be required.

docs/tier1a.xml Outdated
<observations
headers="shortdesc, component, obsDataSet, references"
headernames="Dataset, Component, Observational Dataset, References">
<aobs>
Collaborator

What does "aobs" mean?

Contributor Author

"A OBServation"=aobs

Collaborator

Okay, well, that doesn't really work for me, but it's something I can change in #335.

Contributor Author

What would you prefer?

Contributor Author

Went with "observation"

docs/tier1a.xml Outdated
</detaileddesc>
<obsDataSet>
Merged Hadley Center-NOAA/OI data set from Hurrell et al. 2008
(https://climatedataguide.ucar.edu/climate-data/merged-hadley-noaaoi-sea-surface-temperature-sea-ice-concentration-hurrell-et-al-2008)
Collaborator

What if we put this inside <url> and </url> tags (without the parentheses)? Then, if the parser sees a url tag, it would use the given text with the given URL as a hyperlink.

Collaborator

Here is a version of the parser that handles <url> tags:
https://github.com/xylar/MPAS-Analysis/blob/fill_in_obs_xml/docs/parse_table.py
(and has PEP8 fixes)
Here is the modified xml data including <url> tags:
https://github.com/xylar/MPAS-Analysis/blob/fill_in_obs_xml/mpas_analysis/ocean/observations.xml

If you want, these changes can wait and I'll just make them in a follow-up PR.

Contributor Author

I've done this with Markdown url syntax instead: [name of url](http://url_link) as I think this is clearer.
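
How the PR's script handles this syntax is not shown in the conversation. As a rough sketch of the idea, a Markdown-style link can be rewritten as an RST external hyperlink with a regular expression. The names `MARKDOWN_LINK` and `markdown_links_to_rst` are hypothetical, and allowing whitespace between `]` and `(` is an assumption made here so that entries wrapped across lines in the XML still match.

```python
import re

# Assumption: a link is [label](url), optionally with whitespace (including a
# line break) between the closing bracket and the opening parenthesis.
MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\s*\(([^)\s]+)\)")


def markdown_links_to_rst(text):
    """Replace each Markdown-style link in text with an RST hyperlink."""
    def _to_rst(match):
        # Collapse line breaks and indentation inside a wrapped label.
        label = " ".join(match.group(1).split())
        return "`{} <{}>`_".format(label, match.group(2))

    return MARKDOWN_LINK.sub(_to_rst, text)


if __name__ == "__main__":
    print(markdown_links_to_rst("[name of url](http://url_link)"))
```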

Collaborator

@xylar left a comment

I think we can make all the changes I requested in a follow-up PR. I'd rather get this merged as it is so I can start working on that PR.

@xylar
Collaborator

xylar commented Apr 17, 2018

@milenaveneziani, if you don't have time to review this soon, I think that's fine. Just take yourself off as a reviewer and @pwolfram can merge the branch. If you'd like to review, feel free.

@xylar
Collaborator

xylar commented Apr 17, 2018

@pwolfram, the problem with supporting Markdown urls in xml as in your latest commit is that the URL isn't easy to parse out of the XML, as we would want to do if we were automatically downloading the data set. We could parse out the URL using python string manipulation but that's a bit tricky.

Anyway, certainly no harm in supporting this.
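
For reference, the string manipulation mentioned here could look roughly like the sketch below. It is an illustration only: `LINK_PATTERN` and `extract_link_urls` are hypothetical names, the example URL is made up, and the pattern assumes URLs contain no whitespace or closing parentheses.

```python
import re

# Assumption: links are written as [label](url), possibly with whitespace
# between the bracket and the parenthesis when an XML entry is wrapped.
LINK_PATTERN = re.compile(r"\[[^\]]+\]\s*\(([^)\s]+)\)")


def extract_link_urls(text):
    """Return the URLs of all Markdown-style links in text, in order."""
    # With a single capture group, findall returns the captured URLs directly.
    return LINK_PATTERN.findall(text)


if __name__ == "__main__":
    entry = "[Example data set]\n  (http://example.org/data.html): terms of use"
    print(extract_link_urls(entry))
```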

@xylar
Collaborator

xylar commented Apr 17, 2018

Actually, I think the markdown support is great. It allows a lot of flexibility. If we specifically need to have a URL to download a dataset from, it will typically be different from the URLs we link to in the table (e.g. the exact file rather than a webpage with links to the data files).

@pwolfram
Contributor Author

I'm working on an update here; it should be done soon.

@xylar
Collaborator

xylar commented Apr 17, 2018

To give you a picture of where things might end up, my mock-up database looks something like:

<?xml version="1.0"?>
<observations
  headers="name, references"
  headernames="Observational Dataset, References">
  <observation>
    <name>
      [SOSE potential temperature and salinity]
      (http://sose.ucsd.edu/sose_stateestimation_data_05to10.html)
    </name>
    <description>
      Monthly potential temperature and salinity output from the Southern Ocean
      State Estimate (SOSE) covering years 2005-2010
    </description>
    <releasePolicy>
      [Conditions of use]
      (http://sose.ucsd.edu/sose_stateestimation_disclaimer.html): The data on
      these webpages are made freely available for scientific, bona fide,
      not-for-profit research only. If your use of the data is different (e.g.
      commercial), you must contact the data providers and receive written
      permission for your use of the data prior to any such use. The user must
      acknowledge SOSE data in all products or publications that use them, e.g.
      by including the following written note: "Computational resources for the
      SOSE were provided by NSF XSEDE resource grant OCE130007." An appropriate
      citation should also be made.
    </releasePolicy>
    <references>
      <reference>
        [M. Mazloff, P. Heimbach, and C. Wunsch, 2010: "An Eddy-Permitting
        Southern Ocean State Estimate." J. Phys. Oceanogr., 40, 880–899. doi:
        10.1175/2009JPO4236.1](http://doi.org/10.1175/2009JPO4236.1)
      </reference>
    </references>
    <dataUrls>
      <url value="http://sose.ucsd.edu/DATA/SO6_V2/THETA_mnthlyBar.0000000100.data.gz"/>
      <url value="http://sose.ucsd.edu/DATA/SO6_V2/THETA_mnthlyBar.0000000100.meta"/>
      <url value="http://sose.ucsd.edu/DATA/SO6_V2/SALT_mnthlyBar.0000000100.data.gz"/>
      <url value="http://sose.ucsd.edu/DATA/SO6_V2/SALT_mnthlyBar.0000000100.meta"/>
    </dataUrls>
    <preprocessing>
      preprocess_observations/remap_SOSE_T_S.py
    </preprocessing>
    <tasks>
      <task value="climatologyMapSoseTemperature"/>
      <task value="climatologyMapSoseSalinity"/>
    </tasks>
  </observation>

</observations>
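
A database in this shape can be read with Python's standard `xml.etree.ElementTree`. The sketch below uses an abbreviated excerpt of the mock-up and is only an illustration: `read_observations` and the dictionary layout it returns are assumptions, not the project's parser.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed excerpt of the mock-up (two of the four data URLs).
MOCKUP = """<?xml version="1.0"?>
<observations
  headers="name, references"
  headernames="Observational Dataset, References">
  <observation>
    <name>
      [SOSE potential temperature and salinity]
      (http://sose.ucsd.edu/sose_stateestimation_data_05to10.html)
    </name>
    <dataUrls>
      <url value="http://sose.ucsd.edu/DATA/SO6_V2/THETA_mnthlyBar.0000000100.data.gz"/>
      <url value="http://sose.ucsd.edu/DATA/SO6_V2/SALT_mnthlyBar.0000000100.data.gz"/>
    </dataUrls>
    <tasks>
      <task value="climatologyMapSoseTemperature"/>
      <task value="climatologyMapSoseSalinity"/>
    </tasks>
  </observation>
</observations>
"""


def read_observations(xml_text):
    """Return (headers, observations) parsed from the XML text."""
    root = ET.fromstring(xml_text)
    # The table columns are listed in the root element's "headers" attribute.
    headers = [h.strip() for h in root.get("headers").split(",")]
    observations = []
    for obs in root.findall("observation"):
        observations.append({
            # Collapse the line breaks and indentation of wrapped text.
            "name": " ".join(obs.findtext("name", default="").split()),
            # Download URLs and task names are stored in "value" attributes.
            "dataUrls": [u.get("value") for u in obs.findall("dataUrls/url")],
            "tasks": [t.get("value") for t in obs.findall("tasks/task")],
        })
    return headers, observations


if __name__ == "__main__":
    headers, observations = read_observations(MOCKUP)
    print(headers)
    print(observations[0]["dataUrls"])
```

Keeping the download URLs in `value` attributes means they can be read directly, without parsing them out of the Markdown-style link text.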

@pwolfram force-pushed the adds_observational_xml branch from 930c8bc to 8631b15 on April 17, 2018, 16:42
@pwolfram
Contributor Author

I have component-based tagging / splitting now
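
The implementation of the splitting is not shown in this conversation. One possible approach, sketched here with made-up sample entries, is to group the `<observation>` elements by the text of their `<component>` tag and build one table per group. `split_by_component`, the sample data, and the `"unknown"` fallback are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; the tag names follow the headers quoted earlier
# in the review (shortdesc, component).
SAMPLE = """<observations
  headers="shortdesc, component, obsDataSet, references"
  headernames="Dataset, Component, Observational Dataset, References">
  <observation>
    <shortdesc>Example SST data set</shortdesc>
    <component>ocean</component>
  </observation>
  <observation>
    <shortdesc>Example sea-ice concentration data set</shortdesc>
    <component>seaice</component>
  </observation>
  <observation>
    <shortdesc>Example salinity data set</shortdesc>
    <component>ocean</component>
  </observation>
</observations>
"""


def split_by_component(xml_text):
    """Group observation short descriptions by their component tag."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for obs in root.findall("observation"):
        # Entries without a component tag are collected under "unknown".
        component = (obs.findtext("component") or "unknown").strip()
        groups[component].append((obs.findtext("shortdesc") or "").strip())
    return dict(groups)


if __name__ == "__main__":
    for component, names in sorted(split_by_component(SAMPLE).items()):
        print(component, names)
```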

@pwolfram
Contributor Author

@xylar, I'm happy for this to be merged now unless you see something I missed...

@xylar
Collaborator

xylar commented Apr 17, 2018

Let me just re-test and then you can merge unless my test fails for some reason.

@xylar
Collaborator

xylar commented Apr 17, 2018

@pwolfram, would it be easy to clean up the files to get rid of these warnings?

/home/xylar/code/mpas-work/analysis/adds_observational_xml/docs/landicobservationstable.rst:8: WARNING: Definition list ends without a blank line; unexpected unindent.
/home/xylar/code/mpas-work/analysis/adds_observational_xml/docs/landicobservationstable.rst:9: WARNING: Blank line required after table.
landicobservationstable.rst:8: WARNING: Definition list ends without a blank line; unexpected unindent.
landicobservationstable.rst:9: WARNING: Blank line required after table.
oceanobservationstable.rst:9: WARNING: Blank line required after table.
seaicobservationstable.rst:8: WARNING: Blank line required after table.
/home/xylar/code/mpas-work/analysis/adds_observational_xml/docs/oceanobservationstable.rst:9: WARNING: Blank line required after table.
/home/xylar/code/mpas-work/analysis/adds_observational_xml/docs/seaicobservationstable.rst:8: WARNING: Blank line required after table.

@pwolfram mentioned this pull request Apr 17, 2018
@pwolfram
Contributor Author

@xylar, there is a false positive following a nested list using tabulate for landice. This isn't worth the trouble to fix, as it would require reparsing the output of the tabulate library under particular RST rules. It is possible this is a soft bug in tabulate, so if this is a big deal, an issue could be submitted to the library maintainers.

@pwolfram force-pushed the adds_observational_xml branch from 2ff0654 to 94440f5 on April 17, 2018, 17:33
@xylar
Collaborator

xylar commented Apr 17, 2018

@pwolfram, don't worry about the last warning. It's better to leave it. We're going to change the table significantly from what you have here as an example in any case so maybe it just won't come up again.

Go ahead and merge when you're ready.

@pwolfram force-pushed the adds_observational_xml branch from 94440f5 to 2ff0654 on April 17, 2018, 17:35
@pwolfram
Contributor Author

@xylar, it would have been better to merge this before #331 because this PR is simpler. The current order has caused a bunch of frustrating rebase issues. I would have really appreciated it if we had talked before you merged #331.

@xylar
Collaborator

xylar commented Apr 17, 2018

How so? I think they are more or less orthogonal. Also, you approved #331, which suggested to me that it was ready to merge.

@xylar
Collaborator

xylar commented Apr 17, 2018

I was able to rebase quickly with two very small conflicts to resolve.

@pwolfram force-pushed the adds_observational_xml branch from 2ff0654 to d462093 on April 17, 2018, 17:49
@pwolfram
Contributor Author

Thanks! My local clone got messed up...

@pwolfram
Contributor Author

Rebase is ready to go now, thanks for checking in so we can get this resolved.

@pwolfram merged commit d462093 into MPAS-Dev:develop Apr 17, 2018
@pwolfram deleted the adds_observational_xml branch April 17, 2018 17:53
@pwolfram
Contributor Author

Thanks!

@xylar
Collaborator

xylar commented Apr 17, 2018

@pwolfram, I don't think this got merged in the desired way. It was a fast-forward merge. I don't think it's a big enough deal to edit the history, but we want to make sure to avoid that in the future. The easiest way to do that is to use the merge button here on GitHub rather than the command-line tools. If you do use the command-line tools, make sure to always do merges with --no-ff unless you explicitly want fast-forwarding (in which case why not just do a rebase?).

@pwolfram
Contributor Author

You are correct, I messed up and appreciate you correcting me here for the future.
