Skip to content

2D bounds - simple version #370

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Nov 11, 2022
Merged

2D bounds - simple version #370

merged 15 commits into from
Nov 11, 2022

Conversation

aulemahal
Copy link
Contributor

@aulemahal aulemahal commented Oct 21, 2022

Fixes #71!

I went through xgcm's code (using 0.6.1) and realized that the proposed solution for estimating 2D bounds was indeed quite simple! And that we already had the functions to implement it.

So here I enable ds.cf.add_bounds to estimate 2D bounds by estimating the 1D bounds on the two axes successively, and then reordering the values to fit the conventions.

  • The dim argument can now be a list (of length 1 or 2)
  • There is now a possibility that we add 2D bounds and 1D bounds on the same dataset. For now, if the bounds dim already exists and has a different length than the new one, the latter is renamed bounds2, which is a bit inelegant... Any ideas?
  • I am always confused about the order to use... It doesn't really matter for most applications I guess (i.e. xESMF), but I'm not 100% sure I coded the right one.
  • I will add more tests.

I first used @zxdawn's gist (https://gist.github.com/zxdawn/49c73e4d1ef9043580644e57eaa16d83) with xgcm 0.6.1. This implementation in cf-xarray does almost the exact same thing. I then compared with an old gist of mine (https://gist.github.com/aulemahal/58c0df9757635cb9b7f862a6d2a223ab) where I generated grid bounds for a RotatedPole grid. The idea there was to generate the 1D bounds on the regular rlat/rlon grid and then use cartopy/pyproj to project them to lat/lon.

I compared this PR's implementation with the projection one and the model-provided bounds for a regional model grid at a resolution of 0.11° and I got this :
image

The maps show the distance between the estimated bounds with the reference ones. Units are lat/lon degrees, so this means an error of maximum 2% of the grid resolution for this implementation. Seems reasonable to me, what do you think?

In the parent issue, there was a detailed proposition of moving the interesting grid and metric generation form xgcm to cf-xarray. I think it is still an interesting idea, but I felt the specific issue of 2D bounds was long overdue!

CI issues seems to be unrelated to the PR, but rather to #365 and may be an update in xarray? Ok,there are indeed issues with this PR 😓 .

@aulemahal
Copy link
Contributor Author

After discussion with a member of the team running the model from which I took my rotated pole grid example, I realized that the "reference" grid corners I was using were in fact generated in almost the same way as my "projection" method. So, disregard the left graphic above.

The right graphic still gives an idea of the error (in lat lon degrees) between the projection method (which we can treat as "exact") and the interpolation method implemented in this PR. The base grid above has a resolution of around 0.11°, thus the error goes up to 2% of that resolution.

This next figure is the same kind of comparison but with a 0.44° grid:
image
Here we see the error goes up to 0.05°, which is around 10%. This is not small.

But is it good enough for cf-xarray ?

My other idea for this issue was to:

  • Implement a parsing for the grid_mapping CF data
  • Generate the bounds of the 1D coordinates (rlat and rlon for example)
  • Project the bounds to lat/lon using pyproj or something similar. That's a new optional dependency though. It does come with geopandas so it should be common in our user's envs.
    This should be way more precise. We could also have a switch between the two according to the presence of the needed dependencies and of a grid_mapping variable.

@larsbuntemeyer
Copy link
Contributor

@aulemahal as a user of cf_xarray and (CORDEX data), i would love to have it understand the grid_mapping attribute. I do the mapping quite regularly and i have also used pyproj, e.g., in this prototype. pyproj can actually understand the CF conventions regarding the grid_mapping attribute with the CRS.from_cf function. So that could probably also be useful to simply create not only bounds but also 2D coordinates in general if they are not part of the dataset. Would be nice if this could be done in cf_xarray...

@aulemahal aulemahal requested a review from dcherian November 7, 2022 18:41
@dcherian
Copy link
Contributor

dcherian commented Nov 7, 2022

Here we see the error goes up to 0.05°, which is around 10%. This is not small.

Not small but the dataset should really contain these by default :)

@larsbuntemeyer Do you ever need something like this? If so, does this approach look reasonable to you?

But is it good enough for cf-xarray ?

I really don't know. We should document this clearly though, both in the docstring and in the docs. Your example with the relative error graphic is nice. We should add that to the docs.

@aulemahal
Copy link
Contributor Author

@dcherian Just pushed some doc in the bounds page and a few more lines in the docstrings.

I also checked if it works with xESMF. There's 2 lines to delete there, but otherwise, it seems to work as expected!

@dcherian
Copy link
Contributor

LGTM. Thanks a lot @aulemahal !

cc @jbusecke

@dcherian dcherian merged commit 205e673 into main Nov 11, 2022
@dcherian dcherian deleted the bounds2d branch November 11, 2022 16:12
@jbusecke
Copy link

jbusecke commented Dec 6, 2022

Oh this is dope! If I understand correctly what is implemented here (just took a very quick look, will test next week hopefully) this might help a lot with getting rid of the dreaded autogenerate logic in xgcm and pave the way to work much easier with gridded datasets, which often only provide the cell center coordinates. In that case IMO this 'is enough' because its better than not working with the dataset!

I am still very curious about the suggestions in @larsbuntemeyer s answer (e.g. an even more precise way to 'generate curvilinear grids' but I am not sure that is within the scope of cf-xarray? Maybe it is. Happy to talk more about this in the future. Particularly I am interested if such an approach could enable us to recompute curvilinear grid metrics like cell area more accurately then doing some sort of dx*dy inferred from differences in lon/lat. Having that functionality would really supercharge the way xgcm would work with e.g. observational datasets! (cc @dcherian, @TomNicholas).

But first and foremost a huge thanks to @aulemahal for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update add_bounds to support 2D coordinate variables.
4 participants