Support hospitalizations in data pipeline #120

nmdefries · 2021-06-09T16:01:22Z

Description

The dashboard data pipeline now fetches and filters hospitalization forecasts in addition to death and case forecasts.

Changes

Pull all daily hospitalization forecasts (aheads 1 through 28)
Generalize valid target date/forecast date filters to apply to hospitalizations
Generalize save_score_cards and evaluate_chu to apply to hospitalizations
Calculate and save hospitalization forecast scores

Implications

This exacerbates existing memory issues since we're loading more forecasts and reference data. Memory issues will be addressed separately in changes to evalcast.

kateharwood · 2021-06-09T18:28:30Z

Report/create_reports.R

                                             geo_values = state_geos,
                                             verbose = TRUE,
-                                             use_disk = TRUE)
+                                             use_disk = TRUE) %>% 
+    filter(!(incidence_period == "epiweek" & ahead > 4))


Just curious about this addition. It looks like this wasn't here before, yet we were still only saving aheads 1-4 for epiweek predictions (cases and deaths). I thought Jed made this cutoff elsewhere.

By default, get_covidhub_predictions uses ahead = 1:4 for both day and epiweek forecasts. However, daily forecasts actually go up to an ahead of 28. To get those without also getting epiweek forecasts more than 4 weeks ahead, I switched the ahead setting to 1:28 and added the filter.

We could do two separate calls to get_covidhub_predictions here, one for cases + deaths and one for hospitalizations, with different ahead settings. However, the underlying get_forecaster_predictions_alt downloads all forecast files every time it's run (are you aware of any particular reason for this?), so making two calls is a poor memory/speed tradeoff at the moment.

> are you aware of any particular reason for this?

Ah, perhaps because this was originally intended to be run in GitHub Actions, where the files wouldn't persist between sessions anyway.

I see, that makes sense. And yes, I believe that is the case re: downloading files.
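
For readers following this thread, the effect of the added filter can be sketched on a toy data frame. The column names mirror those in the diff above, but the rows are made up for illustration and this is not the pipeline's actual data; it only assumes dplyr is installed.

```r
library(dplyr)

# Toy predictions table (hypothetical rows, not real forecast data).
preds <- data.frame(
  incidence_period = c("epiweek", "epiweek", "day", "day"),
  ahead = c(4, 5, 4, 28),
  stringsAsFactors = FALSE
)

# Same condition as in the diff: drop epiweek rows with ahead > 4,
# keep every daily row regardless of ahead.
kept <- preds %>%
  filter(!(incidence_period == "epiweek" & ahead > 4))

print(kept)
```

Only the epiweek row with ahead = 5 is dropped; the daily row with ahead = 28 survives, which is the point of requesting aheads 1:28 and then filtering.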

kateharwood · 2021-06-09T18:28:34Z

Report/create_reports.R

-# Only accept forecasts made Monday or earlier
+# For epiweek predictions, only accept forecasts made Monday or earlier.
+# target_end_date is the date of the last day (Saturday) in the epiweek
+# For daily predictions, accept any forecast where the target_end_date is later


Is there a reason we aren't using the "Monday or earlier" cutoff for hospitalization data?

The hospitalization forecasts are produced for every day following the forecast date; the target is "N day ahead inc hosp". My understanding is that the "Monday or earlier" cutoff is only relevant for weekly forecasts, since we want to make sure that a forecast for a given week isn't made with partial information about that week (e.g., it's easy to predict cases for a week if you already know the values for 6 of its 7 days). Will check with Dan.

This approach matches Dan's understanding.

Got it, thanks.
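
As an illustration of the "Monday or earlier" rule discussed in this thread, here is a minimal base R sketch. The helper name is_valid_epiweek_forecast is hypothetical and not taken from the PR; it assumes, as the diff comment states, that target_end_date is the Saturday ending the target epiweek, and that the relevant Monday is the one in the first forecasted week. Daily forecasts are deliberately left out, since the daily rule is truncated in the diff above.

```r
# Hypothetical helper, not the code from this PR.
is_valid_epiweek_forecast <- function(forecast_date, target_end_date, ahead) {
  # The first forecasted epiweek ends on the Saturday
  # target_end_date - 7 * (ahead - 1); its Monday falls 5 days earlier.
  first_week_monday <- target_end_date - 7 * (ahead - 1) - 5
  forecast_date <= first_week_monday
}

target <- as.Date("2021-06-19")  # a Saturday

# Forecast made on Monday of the target week: accepted.
monday_ok <- is_valid_epiweek_forecast(as.Date("2021-06-14"), target, ahead = 1)

# Forecast made on Tuesday of the target week: rejected.
tuesday_ok <- is_valid_epiweek_forecast(as.Date("2021-06-15"), target, ahead = 1)
```

The same check applies unchanged to longer horizons: a forecast made on 2021-06-14 for the epiweek ending 2021-06-26 is a valid 2-week-ahead forecast under this assumption.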

nmdefries · 2021-06-24T14:05:29Z

After incorporating docker changes in #125, this updated pipeline runs as expected.

support hospitalizations in data pipeline

0fca7fe

fetch hosp data with all aheads

nmdefries requested a review from kateharwood June 9, 2021 16:36

kateharwood reviewed Jun 9, 2021

nmdefries added 2 commits June 14, 2021 17:42

add warnings if cases/deaths not generated; prevent hosp from failing…

62d868b

… if not

Merge branch 'dev' into support-hospitalizations

2f60e4d

nmdefries requested a review from kateharwood June 24, 2021 14:05

kateharwood approved these changes Jun 25, 2021

nmdefries merged commit ac76cc1 into dev Jun 25, 2021

nmdefries deleted the support-hospitalizations branch June 25, 2021 20:43
