Description
Currently, there are some surprises in comparisons like
- by ahead you see dev10v4s lower WIS than amoebalike
- by forecast_date it looks like amoebalike is almost always lower WIS than dev10v4s.
This is probably explained by amoebalike currently being generated for fewer, mostly lower, aheads and slightly fewer times (though somehow it has ~twice as many predictions as dev10v4s when you set x var = forecaster?)
There are a couple of approaches
- Intersecting to common prediction set for the set of forecasters selected.
- Forecaster-pool-relative WIS approaches.
Usually I think we'd favor the first unless some weird missingness patterns or high levels of missingness force us to do something like the second.
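For comparison, here is a minimal sketch of one possible reading of a "forecaster-pool-relative" score: divide each WIS by the mean WIS of whichever forecasters submitted for the same task. This definition, the toy data, and the column names `pool_mean_wis` and `relative_wis` are all assumptions for illustration, not something this issue or evalcast specifies.

```r
library(dplyr)

# Made-up scores: forecaster "b" has no prediction for ahead = 14.
scorecards <- tibble(
  forecaster = c("a", "b", "a"),
  ahead      = c(7L, 7L, 14L),
  wis        = c(1.0, 3.0, 4.0)
)

# Hypothetical pool-relative score: WIS divided by the mean WIS of the
# forecasters available for the same task (here a task is just `ahead`).
relative <- scorecards %>%
  group_by(ahead) %>%
  mutate(
    pool_mean_wis = mean(wis),
    relative_wis  = wis / pool_mean_wis
  ) %>%
  ungroup()
```

Note that a task with a single forecaster always gets a relative score of 1 under this definition, which is one reason intersecting is usually easier to interpret.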
There's a bit of old code for this; evalcast::intersect_averagers() did this one way, and some old code did it another way; this is the core:
matched_scorecards <- scorecards %>%
  filter(!is.na(ae)) %>%
  group_by(data_source, signal, geo_value, forecast_date, target_end_date, ahead, incidence_period) %>%
  filter(n() == length(unique(.[["forecaster"]]))) %>%
  ungroup()
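To illustrate why intersecting matters, here is a small self-contained sketch on made-up data (the values and the reduced set of grouping columns are assumptions for illustration). Forecaster "b" skips the harder ahead, so the unmatched averages tie, while the matched averages show "a" doing better on the shared task.

```r
library(dplyr)

# Toy scorecards: "b" has no prediction for ahead = 14.
scorecards <- tibble(
  forecaster    = c("a", "a", "b"),
  geo_value     = "ca",
  forecast_date = as.Date("2021-01-04"),
  ahead         = c(7L, 14L, 7L),
  wis           = c(1.0, 3.0, 2.0)
)

n_forecasters <- n_distinct(scorecards$forecaster)

# Keep only the prediction tasks that every forecaster covered.
matched <- scorecards %>%
  group_by(geo_value, forecast_date, ahead) %>%
  filter(n_distinct(forecaster) == n_forecasters) %>%
  ungroup()

# Mean WIS per forecaster, before and after matching.
unmatched_means <- scorecards %>%
  group_by(forecaster) %>%
  summarize(mean_wis = mean(wis))
matched_means <- matched %>%
  group_by(forecaster) %>%
  summarize(mean_wis = mean(wis))
```

Counting distinct forecasters per group, rather than comparing `n()` to the forecaster count, also guards against duplicated rows for a single forecaster passing the filter.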
There are also variations on this that tried to also simultaneously filter to forecast dates or target end dates that had evaluations for all the aheads like the below, though it's pretty confusing and there's probably a better way to write it. The idea was that for the most recent target dates, we may only have evaluations ready for the shorter aheads, and that this would suggest misleading forecasting "trends" when breaking down by target end date but not simultaneously the ahead.
  group_by(data_source, signal, geo_value, target_end_date, incidence_period) %>%
  {
    n.forecasters <- length(unique(.[["forecaster"]]))
    filter(
      .,
      n() == n.forecasters * length(matching_aheads[matching_aheads %% 7L == extract_single_unique_value(ahead %% 7L)])
    )
  } %>%
  ungroup() %>%
This code looks a little weird because forecast_dates were expected to be exactly weekly, while aheads ran from 0 or 1 to 28. So for target dates with the same weekday as a forecast date you'd want 5 or 4 predictions per forecaster, and for other target dates you'd want 4 predictions per forecaster. (For complete forecast_dates you'd want 29 or 28 per forecaster.)
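The expected counts can be checked with a few lines of base R. This is just a sketch of the arithmetic, assuming weekly forecast dates and aheads of 0 to 28 (swap in `1:28` for the other case); `weekday_offset` is a name introduced here for the number of days between the forecast weekday and the target weekday.

```r
aheads <- 0:28

# With weekly forecast dates, a target date that falls `weekday_offset` days
# after the forecast weekday can only be reached by aheads with the same
# remainder mod 7.
weekday_offset <- 0:6
expected_per_forecaster <- vapply(
  weekday_offset,
  function(k) sum(aheads %% 7L == k),
  integer(1)
)
names(expected_per_forecaster) <- weekday_offset

# Total predictions per forecaster for one complete forecast date.
per_forecast_date <- length(aheads)
```

With `aheads <- 0:28`, offset 0 gets 5 predictions (aheads 0, 7, 14, 21, 28) and every other offset gets 4; with `1:28`, all offsets get 4.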