Use evalcast caching mechanism to reduce time to pull from API #242
Looks good to me. A couple of minor comments [below this message], in case you think they warrant edits.
Note that I have not been able to test create_reports.R in its entirety.
Locally, non-docker, I get
Warning in covidHubUtils::get_model_designations(source = "zoltar") :
get_model_designations() will be deprecated soon. please use get_model_metadata() instead.
get_token(): POST: https://zoltardata.com/api-token-auth/
get_resource(): GET: https://zoltardata.com/api/projects/
Error in data.frame(id = id_column, url = url_column, owner_url = owner_url_column, :
arguments imply differing number of rows: 10, 0
and make build fails with
mkdir dist
test -f dist/score_cards_state_deaths.rds || curl -o dist/score_cards_state_deaths.rds https://forecast-eval.s3.us-east-2.amazonaws.com/score_cards_state_deaths.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 25.0M 100 25.0M 0 0 10.5M 0 0:00:02 0:00:02 --:--:-- 10.5M
test -f dist/score_cards_state_cases.rds || curl -o dist/score_cards_state_cases.rds https://forecast-eval.s3.us-east-2.amazonaws.com/score_cards_state_cases.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 14.2M 100 14.2M 0 0 8760k 0 0:00:01 0:00:01 --:--:-- 8759k
test -f dist/score_cards_nation_cases.rds || curl -o dist/score_cards_nation_cases.rds https://forecast-eval.s3.us-east-2.amazonaws.com/score_cards_nation_cases.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 364k 100 364k 0 0 30567 0 0:00:12 0:00:12 --:--:-- 93150
test -f dist/score_cards_nation_deaths.rds || curl -o dist/score_cards_nation_deaths.rds https://forecast-eval.s3.us-east-2.amazonaws.com/score_cards_nation_deaths.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 651k 100 651k 0 0 756k 0 --:--:-- --:--:-- --:--:-- 756k
test -f dist/score_cards_state_hospitalizations.rds || curl -o dist/score_cards_state_hospitalizations.rds https://forecast-eval.s3.us-east-2.amazonaws.com/score_cards_state_hospitalizations.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 94.8M 100 94.8M 0 0 14.9M 0 0:00:06 0:00:06 --:--:-- 18.1M
test -f dist/score_cards_nation_hospitalizations.rds || curl -o dist/score_cards_nation_hospitalizations.rds https://forecast-eval.s3.us-east-2.amazonaws.com/score_cards_nation_hospitalizations.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2131k 100 2131k 0 0 1998k 0 0:00:01 0:00:01 --:--:-- 1999k
test -f dist/datetime_created_utc.rds || curl -o dist/datetime_created_utc.rds https://forecast-eval.s3.us-east-2.amazonaws.com/datetime_created_utc.rds
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 167 100 167 0 0 401 0 --:--:-- --:--:-- --:--:-- 400
docker build --no-cache=true --pull -t ghcr.io/cmu-delphi/forecast-eval: -f devops/Dockerfile .
invalid argument "ghcr.io/cmu-delphi/forecast-eval:" for "-t, --tag" flag: invalid reference format
See 'docker build --help'.
make: *** [Makefile:45: build_dashboard] Error 125
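The invalid reference format error is what docker reports when nothing follows the colon in the image name, which suggests that whatever the Makefile interpolates as the tag was empty in this environment. A minimal shell sketch of a guard, assuming a hypothetical variable named tag and a fallback of "latest" (neither the real variable name nor the intended default is shown in this thread):

```shell
#!/bin/sh
set -eu

# Hypothetical variable name; the real Makefile may use a different one.
tag="${tag:-}"

# Print "name:tag", substituting "latest" (an assumed default) when the
# tag argument is empty, so the reference passed to docker is never "name:".
image_ref() {
  printf '%s:%s\n' "$1" "${2:-latest}"
}

ref="$(image_ref "ghcr.io/cmu-delphi/forecast-eval" "$tag")"
echo "$ref"
```

The resulting reference could then be passed to docker build -t in place of the string that failed above.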
Is it worth taking a closer look at these replication issues, or are you already convinced that it will deploy successfully?
This runs successfully in production and locally for me. I wouldn't bother trying to test it; it takes a fair amount of setup. Thanks for your feedback on trying, though. I obviously need to update the README and make a |
evalcast pulls each day/week of truth data from the COVIDcast API one by one. Since the per-request overhead dominates the pull time, this is slow compared to pulling all desired dates at once. We can pull all dates in a single request by using the caching feature in evalcast. This reduces the time to pull data from 3h 20m to 20m. Max memory usage decreases by ~7 GB.
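The idea can be illustrated with a small shell sketch. It does not use evalcast or the COVIDcast API: fetch_truth is a hypothetical stand-in for one API request, and warm_cache and get_truth are made-up helper names. All dates are requested once, the result is stored in a cache file, and each per-date lookup is then served from that file.

```shell
#!/bin/sh
set -eu

workdir="$(mktemp -d)"
calls_file="$workdir/api_calls"      # one line per simulated API request
cache_file="$workdir/truth_cache.csv"
: > "$calls_file"

# Hypothetical stand-in for one API request: records the call and emits one
# "date,value" row per requested date (NA is placeholder data).
fetch_truth() {
  echo call >> "$calls_file"
  for d in "$@"; do
    printf '%s,NA\n' "$d"
  done
}

# Fill the cache with a single bulk request covering every date.
warm_cache() {
  fetch_truth "$@" > "$cache_file"
}

# Serve one date from the cache without touching the API.
get_truth() {
  grep "^$1," "$cache_file"
}

dates="2021-01-01 2021-01-02 2021-01-03"
warm_cache $dates
for day in $dates; do
  get_truth "$day"
done

echo "API calls: $(wc -l < "$calls_file" | tr -d ' ')"
```

Here three dates cost one simulated request instead of three, so the fixed per-request overhead is paid once.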