Add support for flag variables. #252

Merged: 9 commits, Jul 29, 2021

@dcherian dcherian commented Jul 29, 2021

xref #199
Implements rich comparisons and .isin. I changed my mind and dropped the .flag.

@jbusecke I copied your code over but had to make some changes to match the conventions. Try it out and let me know how it works!

import cf_xarray
import numpy as np
import xarray as xr

da = xr.DataArray(
    [1, 2, 1, 1, 2, 2, 3, 3, 3, 3],
    dims=("time",),
    attrs={
        "flag_values": [1, 2, 3],
        "flag_meanings": "atlantic_ocean pacific_ocean indian_ocean",
        "standard_name": "region",
    },
    name="region",
)
da.cf
CF Flag variable with mapping:
	{'atlantic_ocean': 1, 'pacific_ocean': 2, 'indian_ocean': 3}

Coordinates:
- CF Axes:   X, Y, Z, T: n/a

- CF Coordinates:   longitude, latitude, vertical, time: n/a

- Cell Measures:   area, volume: n/a

- Standard Names:   n/a

- Bounds:   n/a
da.cf == "atlantic_ocean"
<xarray.DataArray 'region' (time: 10)>
array([ True, False,  True,  True, False, False, False, False, False,
       False])
Dimensions without coordinates: time
da.cf.isin(["indian_ocean", "pacific_ocean"])
<xarray.DataArray 'region' (time: 10)>
array([False,  True, False, False,  True,  True,  True,  True,  True,
        True])
Dimensions without coordinates: time
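
For intuition, the two calls above amount to looking up the integer behind each meaning and comparing the data against it. Here is a minimal NumPy sketch of that idea. It is not the actual cf_xarray implementation, and `flag_equals` and `flag_isin` are hypothetical helper names used only for illustration:

```python
import numpy as np

# Mapping copied from the repr above; in practice it would be derived
# from the flag_values / flag_meanings attributes.
mapping = {"atlantic_ocean": 1, "pacific_ocean": 2, "indian_ocean": 3}
data = np.array([1, 2, 1, 1, 2, 2, 3, 3, 3, 3])


def flag_equals(data, mapping, meaning):
    """Boolean mask of elements whose flag value corresponds to `meaning`."""
    return data == mapping[meaning]


def flag_isin(data, mapping, meanings):
    """Boolean mask of elements whose flag value matches any of `meanings`."""
    return np.isin(data, [mapping[m] for m in meanings])


is_atlantic = flag_equals(data, mapping, "atlantic_ocean")
is_indo_pacific = flag_isin(data, mapping, ["indian_ocean", "pacific_ocean"])
```

An unknown meaning raises a `KeyError` in this sketch; how the accessor reports that case is not shown here.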

Quoting the docs:

The flag_values attribute is the same type as the variable to which it is attached, and contains a list of the possible flag values. The flag_meanings attribute is a string whose value is a blank separated list of descriptive words or phrases, one for each flag value.
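
Under that definition, the mapping shown in the repr can be derived by splitting flag_meanings on whitespace and pairing each word with the flag value at the same position. A rough sketch, assuming both attributes are present and have equal length; `flag_mapping` is a hypothetical helper, not part of cf_xarray:

```python
def flag_mapping(attrs):
    """Build a {meaning: value} dict from CF flag attributes."""
    # flag_meanings is a single blank-separated string
    meanings = attrs["flag_meanings"].split()
    values = list(attrs["flag_values"])
    if len(meanings) != len(values):
        raise ValueError(
            f"got {len(meanings)} flag_meanings but {len(values)} flag_values"
        )
    return dict(zip(meanings, values))


attrs = {
    "flag_values": [1, 2, 3],
    "flag_meanings": "atlantic_ocean pacific_ocean indian_ocean",
}
mapping = flag_mapping(attrs)
print(mapping)
```

The length check is a design choice of this sketch: a mismatch between the two attributes would otherwise be silently truncated by `zip`.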

Closes xarray-contrib#199

codecov bot commented Jul 29, 2021

Codecov Report

Merging #252 (6f8a243) into main (5185d95) will increase coverage by 0.13%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #252      +/-   ##
==========================================
+ Coverage   96.38%   96.52%   +0.13%     
==========================================
  Files          16       16              
  Lines        1854     1926      +72     
==========================================
+ Hits         1787     1859      +72     
  Misses         67       67              
Flag       Coverage Δ
unittests  96.52% <100.00%> (+0.13%) ⬆️

Impacted Files                    Coverage Δ
cf_xarray/accessor.py             95.74% <100.00%> (+0.26%) ⬆️
cf_xarray/datasets.py             100.00% <100.00%> (ø)
cf_xarray/tests/test_accessor.py  99.73% <100.00%> (+<0.01%) ⬆️

@jbusecke

You are an absolute hero! Will test this right away.

@jbusecke

Works like a charm! I modified my code slightly (I had originally encoded the flags as strings, which was bad):

import numpy as np
import regionmask
from cmip6_preprocessing.regionmask import merged_mask, _default_merge_dict
from cmip6_preprocessing.utils import _maybe_make_list

def basin_mask(ds):
    # mask of the merged ocean basins for the grid of `ds`
    basins = regionmask.defined_regions.natural_earth.ocean_basins_50
    basins = merged_mask(basins, ds)

    # add cf compatible metadata
    basins.name = 'basin'
    flag_meanings = [m.replace(' ', '_').lower() for m in list(_default_merge_dict().keys())]
    flag_values = np.arange(len(flag_meanings))
    basins.attrs.update({'flag_values': flag_values, 'flag_meanings': ' '.join(flag_meanings)})
    return basins

def mask_basin(ds, basin, drop=False):
    # note: `drop` is accepted but not used yet
    basin = _maybe_make_list(basin)  # allow a single name or a list of names
    mask = basin_mask(ds)
    ds_masked = ds.where(mask.cf.isin(basin))  # this is nice and clean!
    return ds_masked

Then I can mask out a single basin:

import matplotlib.pyplot as plt

ds = data_dict['CMIP.MIROC.MIROC6.historical.Omon.gn']
ds_masked = mask_basin(ds, 'southern_ocean')
plt.figure()
ds_masked.thetao.isel(time=0, lev=0, member_id=0).plot()

image

and even with any combination of multiple basins:

ds_masked = mask_basin(ds, ['southern_ocean', 'north_atlantic_ocean', 'south_atlantic_ocean'])
plt.figure()
ds_masked.thetao.isel(time=0, lev=0, member_id=0).plot()

image

Thanks again! This is super neat! I'll integrate this with cmip6_pp once this is merged?

@dcherian dcherian merged commit 16d8c39 into xarray-contrib:main Jul 29, 2021
@dcherian dcherian deleted the flags branch July 29, 2021 16:35