total rain between onset and cessation, multiplying size of problem b… #298

remicousin · 2023-03-21T17:39:17Z

…y munber of years

enacts/onset/maproom.py

aaron-kaplan · 2023-03-27T15:39:52Z

enacts/config-zmd.yaml

+        total_mean:
+            map_max: 1100
+        total_stddev:
+            map_max: 300


It's going to be hard to remember to update all these max values in the config file when we get new data. Not sure what to do about it. One idea is to calculate them within the zarrification script and store the values in zarr attributes, but we can't necessarily guess ahead of time all the different values we'll eventually need. Does it make sense to rerun the zarrification script every time we think of a new one?

If by new data, you mean new data for Zambia, then those values won't need updates, or maybe at the end of the century when climate change will have totally change the domain of definition of those quantities. If you mean new data for another country, we will likely know right away that we have to change them because the values will likely be off for another part of the world and so the maps will look funny.

I also don't think we want those values to be coming from the data because, as you said, the possibilities are many, and we also rounded values that make the colorbar reading easy (and maybe even more specific values depending on the colorbar features). Typically, there will be values that we can figure out beforehand like these, or max_taw (even though I computed it live) that we can hard code. And there will be values that are dependent on the choices made in the Maproom page and thus must be computed in-app (like the length of search days in onset maprooms is a user choice and used to assess max of several maps).

Ultimately, all these things should be default values that have the page make sense upon landing (or new options triggered) but we should have a feature across maprooms to allow the user to change the range of their colorscale. So I would reather focus on such a feature.

enacts/onset/maproom.py

aaron-kaplan · 2023-03-27T15:46:29Z

enacts/onset/maproom.py

+                seasonal_total = precip_tile.where(
+                    (precip_tile["T"] >= onset_dates.rename({"T": "Ty"}))
+                        & (precip_tile["T"] <= cess_dates.rename({"T": "Ty"}))
+                ).sum("T")


How does this work? precip_tile has daily or dekadal resolution, while onset_dates and cess_dates are yearly, right? Along what dimension to they get aligned in order to do the comparison?

remicousin · 2023-03-27T17:55:17Z

Please hold: I clicked you as reviewer by mistake this morning. This is work in progress... sorry.

…y munber of years

remicousin · 2023-03-30T20:16:04Z

@aaron-kaplan , this is ready for review. I did both the map and the time series in the same PR because I find it informative that the dimensionality of the object against this is applied calls for different solutions.

The commits have sense:

total rain between onset... is an FYI mostly: it's the first solution I implemented that blows up the size of the objects being dealt with by creating a days by years array
More efficient... corrects the above by using grouping for the map. Since the groups must be the same through independent dimensions, the varying seasons edges are used to mask the data instead of creating groups, then the search dates for onset are used to create groups (so constantly for all X/Y)
tot time series... covers the time series case. There is no more the previous dimensional problem raised by the map, so instead I'm jumping straight into grouping using succession of onset and cessation dates as bin edges, and then picking every other group
correct paring... and bar plot for total are after rebasing with master and integrating the corrections made this week
then the last commit deals with the time series case where some onset or cessation dates are missing, then the pair must be dropped and the rest of the code can remain the same

(you might see at some point some toto= in the code, please ignore, they were not meant to be committed, they were for printing stuff and exploration).

Let me know what you think and then there might be stuff to extract for generalization.

remicousin · 2023-03-31T20:14:03Z

enacts/onset/maproom.py

+            seasonal_total = precip.groupby_bins(
+                "T",
+                seasonal_edges,
+                include_lowest=True


One thing to double check is whether the inclusion is indeed inclusive for both edges. That's what I would want the way it is currently written.

aaron-kaplan · 2023-04-26T17:34:31Z

Memory usage is excessive. I'm working on bringing it down.

remicousin · 2023-04-28T17:15:31Z

Memory usage is excessive. I'm working on bringing it down.

3 remarks:

1/ This soil moisture, calculated by water_balance, is the same always, at least in this maproom set up. So we could calculate it once and for all, store it in zarr format and read that to feed into the cessation date calculation

2/ This water_balance function was written quickly to satisfy the need to deliver this maproom. It's a special case of agronomy.soil_plant_water_balance but I don't know if soil_plant_water_balance has a more cost effective implementation.

3/ when/if the soil moisture, one day, is required to be computed live for some reason, we could probably be a bit more subtle and compute it only for a period of time allowing to do the seasonal search for the cessation date, as opposed to computing soil_moisture for all times regardless as it's done now.

aaron-kaplan · 2023-05-02T13:05:30Z

I think I see the way forward for computing functions where the value at each time step depends on the previous time steps. We should use DataArray.reduce. Its first argument should be a function func that takes two arguments: a DataArray that represents the result of the cumulative calculation for steps 0..i, and a DataArray that represents the input data for step i+1. Writing it this way will have two advantages: (1) it avoids storing intermediate results longer than necessary, which is currently causing memory problems, and (2) although it won't immediately make the calculation faster, it opens the possibility of someday rewriting func in C, which would make it faster with relatively low effort.

This entails rewriting some of the existing code from previous PRs. I think we should rewrite everything in this paradigm: water balance, onset date, cessation date, and seasonal total. Write each one as a function that takes the result of the calculation from previous steps and the rainfall for the current step, and returns
the result of the calculation for the current step. The seasonal total function will call the onset and cessation date functions, which in turn will each call the water balance function. (I guess that duplicates the water balance calculation; we can probably figure out a way to avoid that.)

aaron-kaplan · 2023-05-02T13:06:47Z

BTW precalculating water balance is also a good idea, but then the zarrification/precalculation script should use reduce.

aaron-kaplan · 2023-05-02T13:15:46Z

This water_balance function was written quickly to satisfy the need to deliver this maproom. It's a special case of agronomy.soil_plant_water_balance but I don't know if soil_plant_water_balance has a more cost effective implementation.

soil_plant_water_balance uses a function called soil_plant_water_step, which is the type of func that reduce expects. But then it stores the result of the soil moisture calculation for all times (variable sm), whereas I'm suggesting that we should store the soil moisture for time t only long enough to calculate it for t+1.

aaron-kaplan · 2023-07-11T20:23:26Z

@remicousin where do we stand on this? Is the ball in your court or mine?

remicousin · 2023-07-11T20:30:01Z

That one is on hold until we finish rewriting the water balance / cessation to save or not to save all time steps, which is in progress in that PR #351 which is on you

remicousin · 2023-07-11T20:32:19Z

But if anything I'd rather you start with PR #280 because Tufa would like to demo it in 2 weeks, whereas this updated onset or the water balance work, it's less clear when they want them finished.

remicousin · 2023-07-11T20:35:05Z

But if anything I'd rather you start with PR #280 because Tufa would like to demo it in 2 weeks, whereas this updated onset or the water balance work, it's less clear when they want them finished.

remicousin added the enacts label Mar 21, 2023

remicousin self-assigned this Mar 21, 2023

remicousin requested review from aaron-kaplan and removed request for aaron-kaplan March 27, 2023 14:29

aaron-kaplan reviewed Mar 27, 2023

View reviewed changes

remicousin added 6 commits March 30, 2023 14:36

total rain between onset and cessation, multiplying size of problem b…

e5562fc

…y munber of years

More efficient seasonal totals with masking and grouping

5ae8a45

tot time series. Need to cover case where onset or cess is NaN

3178632

correct pairing of onset and cessation everywhere

0eff3a7

bar plot for total

28c406d

dealing with missing onset/cess, correct indexing, better use of for

2e59949

remicousin force-pushed the remic/tot branch from 4cae255 to 2e59949 Compare March 30, 2023 20:07

remicousin commented Mar 31, 2023

View reviewed changes

aaron-kaplan mentioned this pull request Apr 26, 2023

taw read from file depends on X/Y instead of constant #321

Merged

total rain between onset and cessation, multiplying size of problem b… #298

Are you sure you want to change the base?

total rain between onset and cessation, multiplying size of problem b… #298

Uh oh!

Conversation

remicousin commented Mar 21, 2023

Uh oh!

Uh oh!

aaron-kaplan Mar 27, 2023

Choose a reason for hiding this comment

Uh oh!

remicousin Apr 28, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!

aaron-kaplan Mar 27, 2023

Choose a reason for hiding this comment

Uh oh!

remicousin commented Mar 27, 2023

Uh oh!

remicousin commented Mar 30, 2023

Uh oh!

remicousin Mar 31, 2023

Choose a reason for hiding this comment

Uh oh!

aaron-kaplan commented Apr 26, 2023

Uh oh!

remicousin commented Apr 28, 2023

Uh oh!

aaron-kaplan commented May 2, 2023

Uh oh!

aaron-kaplan commented May 2, 2023

Uh oh!

aaron-kaplan commented May 2, 2023

Uh oh!

aaron-kaplan commented Jul 11, 2023

Uh oh!

remicousin commented Jul 11, 2023

Uh oh!

remicousin commented Jul 11, 2023

Uh oh!

remicousin commented Jul 11, 2023

Uh oh!

Uh oh!