Release covidcast-indicators 0.3.25 #1709

Merged
merged 195 commits into from
Oct 19, 2022
29e15e7
replace deprecated pandas functions
nmdefries Jan 26, 2022
3c25127
convert datetime type
nmdefries Jan 27, 2022
5dec58a
return to current pandas version
nmdefries Jan 27, 2022
dd36917
Merge branch 'main' into ndefries/gs-deprecated-pandas-fns
nmdefries Mar 8, 2022
c1ecc49
upload code
May 13, 2022
51c22e1
update params template
May 13, 2022
2ec5c04
add readme for the input data
May 13, 2022
57b70d3
add more instruction to the main
May 13, 2022
40133ce
add more instruction of required packages to the main
May 13, 2022
4cd3d45
remove incorrect files
May 13, 2022
0e37e2f
add details for the required support from the engineering side
May 13, 2022
28730c0
fixed import errors
May 16, 2022
a4cc2a0
Add missed definition of n_refds
jingjtang Jun 16, 2022
ca6d76e
Fix an error in the unit test
jingjtang Jun 16, 2022
91b7fc6
replace "differ by subq" language with NA
nmdefries Jun 23, 2022
16e78f6
list subq text and responses for matrix base q
nmdefries Jun 28, 2022
9bfaf08
libraries
nmdefries Jun 28, 2022
5d704ce
describe changed respondent group as display logic change
nmdefries Jul 13, 2022
336be6f
change eu_version field to eu_noneu
nmdefries Jul 18, 2022
2bc9462
remove likert/profile naming scheme
nmdefries Aug 10, 2022
aeb22f9
treat dropdown like matrix; null long response options
nmdefries Aug 11, 2022
d0735a2
change "matrix_base_name" to "originating_item_name"
nmdefries Aug 11, 2022
40cb527
change "matrix_subquestion" to "subquestion"
nmdefries Aug 11, 2022
1367689
change "originating_item_name" to "originating_question"
nmdefries Aug 11, 2022
b39af55
extract wave number from files if use "v" version
nmdefries Aug 16, 2022
9b946d0
Fix errors, add tooling scripts
jingjtang Aug 18, 2022
c8a310e
Add checks of the arguments
jingjtang Aug 18, 2022
4c3527c
Add explanation for the scripts
jingjtang Aug 18, 2022
86a9871
Fix a typo in the comment
jingjtang Aug 18, 2022
08572c6
read parquet data
nmdefries Aug 19, 2022
8e1220d
disambiguate tooling function names; factor out df validity checks
nmdefries Aug 22, 2022
456c52b
new main func to take over signal/geo looping; docs
nmdefries Aug 22, 2022
8ab512c
county filter; line wraps
nmdefries Aug 22, 2022
8282c53
factor out valid training days check
nmdefries Aug 22, 2022
f3e9baf
outline input filename-fetching funcs
nmdefries Aug 23, 2022
f073ef9
move data funcs to io.R
nmdefries Aug 23, 2022
3aec897
finish subset_valid_files logic
nmdefries Aug 23, 2022
a7f4165
create indicator-signal combos; loop over columns
nmdefries Aug 23, 2022
b1d57fa
make county filter funcs
nmdefries Aug 23, 2022
e2d1c1d
check existence input data
nmdefries Aug 23, 2022
fa0c107
formatting
nmdefries Aug 24, 2022
b1ed2fe
move local arg parsing and call out of package
nmdefries Aug 24, 2022
5c1de08
initial roxygen build
nmdefries Aug 24, 2022
b1e3e69
get tests working; convert dim to ncol/nrow
nmdefries Aug 24, 2022
8cfd785
allow any name for value field
nmdefries Aug 24, 2022
3039761
get R CMD check passing
nmdefries Aug 25, 2022
79ce886
add Makefile
nmdefries Aug 25, 2022
9d538e1
remove wd2 -- not defined
nmdefries Aug 25, 2022
e6ce2a7
rename evl to evaluate
nmdefries Aug 25, 2022
07177f1
mark environment vars in dplyr logic
nmdefries Aug 25, 2022
6953078
set default params values; change data_path to input_dir
nmdefries Aug 25, 2022
65a9e60
process all value_type combos in groups
nmdefries Aug 25, 2022
1206576
set parallel core params
nmdefries Aug 25, 2022
c262578
define test dates if not specified
nmdefries Aug 25, 2022
e52931e
get full path to input files
nmdefries Aug 25, 2022
54010cd
suppress tibble import note
nmdefries Aug 26, 2022
2b6dea7
formalize tooling funcs
nmdefries Aug 26, 2022
0eb4757
make validity checks work for multiple suffixes
nmdefries Aug 26, 2022
cd51126
include test_lag in output filename
nmdefries Aug 26, 2022
540a3a3
add more templating to docs
nmdefries Aug 26, 2022
788f3c9
more info in readme
nmdefries Aug 26, 2022
9487262
explicitly import covidcast for county_census data
nmdefries Aug 26, 2022
0078d87
move args with defaults to end of list
nmdefries Aug 26, 2022
d8254bf
extract wave number from qsf names with "v" version
nmdefries Aug 26, 2022
f0d32cd
expect at least one digit and _ after wave/version tag
nmdefries Aug 26, 2022
f466131
save modified tranlation files to new dir
nmdefries Aug 29, 2022
351b348
Merge branch 'main' into ndefries/final-qsf-fixes
nmdefries Aug 29, 2022
2a2e80e
undelete dsew cache gitignore
nmdefries Aug 29, 2022
999b9f4
local script docs
nmdefries Aug 30, 2022
2a04879
add arg parsing to main run script
nmdefries Aug 31, 2022
14509e3
add NOT to params setup check
nmdefries Aug 31, 2022
e7a2f28
validate train/predict flags; set in params file
nmdefries Aug 31, 2022
eedf690
add production params file with different defaults
nmdefries Aug 31, 2022
892c74e
exit if neither flag on
nmdefries Aug 31, 2022
42498b0
don't need __index_level__ drop; no longer in parquet files
nmdefries Sep 1, 2022
169a6da
move geo_level loop to run_backfill
nmdefries Sep 1, 2022
fbd2443
move value_type loop to run_backfill to avoid reading same files
nmdefries Sep 1, 2022
eba042d
Merge branch 'ndefries/bcorr_main_looping' into ndefries/bcorr-packag…
nmdefries Sep 1, 2022
b6bded7
create weekday ref and issue fields
nmdefries Sep 1, 2022
5d835a4
use value_col arg instead of fixed name
nmdefries Sep 1, 2022
34ade08
replace "ratio" with "fraction"
nmdefries Sep 2, 2022
606f453
rebuild
nmdefries Sep 2, 2022
f040c54
Merge branch 'ndefries/bcorr-package-org' into ndefries/bcorr-train-t…
nmdefries Sep 2, 2022
f843672
allow train/test flags to be set only via CL
nmdefries Sep 2, 2022
7d8eb13
add train_models flag logic
nmdefries Sep 2, 2022
1e6294e
doc files
nmdefries Sep 2, 2022
6bf446e
add make_predictions flag logic
nmdefries Sep 2, 2022
9bbe7d1
train model if cached file not found
nmdefries Sep 2, 2022
ac20f11
fix spelling of portuguese
nmdefries Sep 9, 2022
a4accf7
report barrier_reason_dontneed_* indicators and medical_care_none in …
nmdefries Sep 14, 2022
ed6e733
set fields NA by wave of addition/removal
nmdefries Sep 16, 2022
ce597c8
fixed errors and added unit tests
Sep 20, 2022
0a8837a
add files to support unit tests
Sep 20, 2022
45a9155
back to the previous version of data filteration
Sep 20, 2022
311e078
update main
Sep 20, 2022
410952b
add movel save dir to params
Sep 20, 2022
b2e2b36
add output dir
Sep 20, 2022
6ec8f3b
change the format of the model file names
Sep 20, 2022
7316c32
add folders
Sep 20, 2022
5a8d35a
update main and io
Sep 20, 2022
fd4d814
update data filteration to allow test-only mode
Sep 20, 2022
66dd670
add comments for TODOs
Sep 20, 2022
a3cffc3
Update Backfill_Correction/delphiBackfillCorrection/R/beta_prior_esti…
jingjtang Sep 20, 2022
ed07a23
Update Backfill_Correction/delphiBackfillCorrection/R/io.R
jingjtang Sep 20, 2022
b91830e
Update Backfill_Correction/delphiBackfillCorrection/R/receiving/.giti…
jingjtang Sep 20, 2022
4f5d84d
Update Backfill_Correction/delphiBackfillCorrection/R/main.R
jingjtang Sep 20, 2022
e1a06e0
Update Backfill_Correction/delphiBackfillCorrection/R/model/.gitignore
jingjtang Sep 20, 2022
dbb2041
Update Backfill_Correction/delphiBackfillCorrection/R/model.R
jingjtang Sep 20, 2022
a28ec44
Update Backfill_Correction/delphiBackfillCorrection/R/model.R
jingjtang Sep 20, 2022
b91f659
Update Backfill_Correction/delphiBackfillCorrection/unit-tests/testth…
jingjtang Sep 20, 2022
4e59b5f
Update Backfill_Correction/delphiBackfillCorrection/unit-tests/testth…
jingjtang Sep 20, 2022
a3b5f5e
add files that have been created or changed in man
Sep 21, 2022
239ae0d
delete duplicated add sqrtscale function
Sep 21, 2022
cc20e45
remove man files
Sep 21, 2022
84532ee
update the man file for add sqrtscale
Sep 21, 2022
d7fb083
add lag pad
Sep 21, 2022
68a68fd
remove model_save_dir and add lag_pad
Sep 21, 2022
81c82fa
remove model_save_dir
Sep 21, 2022
5981346
update the function to get model file names
Sep 21, 2022
305f86b
update the application on csv files
Sep 21, 2022
4dd982e
move the suffix loop after geo list loop
Sep 21, 2022
26573a6
fix an error
Sep 21, 2022
d7ca226
udpate man files
Sep 21, 2022
15df4da
add new man files
nmdefries Sep 21, 2022
39cf9c3
split geo values in one step instead of repeated filtering
nmdefries Sep 21, 2022
2058ced
drop counties directly; remove geo_list
nmdefries Sep 21, 2022
74cd946
remove filter_counties test
nmdefries Sep 21, 2022
7f9f794
fix errors and update docs
Sep 21, 2022
8448a97
small changes in unit tests to use pre defined constants
Sep 21, 2022
0b96c5e
update man file
Sep 21, 2022
4b33519
fix the error in the ojective function
Sep 21, 2022
709bc0e
Merge pull request #123 from cmu-delphi/ndefries/group-split-geo-filter
jingjtang Sep 21, 2022
c7ddd45
finish the unittests for io functions
Sep 22, 2022
9d324c1
update man files
Sep 22, 2022
c0d5fbc
fix an error in test-io
Sep 22, 2022
b1c0c1e
add training/prediction indicator arguments'
Sep 22, 2022
43de85f
update main
Sep 22, 2022
ae21823
use write_parquet in test-io
nmdefries Sep 22, 2022
161c44b
new test params to fix connection errors
nmdefries Sep 22, 2022
051d2fd
remove model, receiving dirs in R/
nmdefries Sep 22, 2022
87f69f1
remove test params during teardown
nmdefries Sep 22, 2022
aa796e4
Update Backfill_Correction/delphiBackfillCorrection/unit-tests/testth…
jingjtang Sep 23, 2022
460fc67
Update Backfill_Correction/delphiBackfillCorrection/unit-tests/testth…
jingjtang Sep 23, 2022
0fb394f
Update Backfill_Correction/delphiBackfillCorrection/unit-tests/testth…
jingjtang Sep 23, 2022
1988f6b
update the names for test_that cases
jingjtang Sep 23, 2022
8c17f07
Add test cases across the year boundary
jingjtang Sep 23, 2022
909243f
Update Backfill_Correction/delphiBackfillCorrection/R/main.R
jingjtang Sep 23, 2022
b99ffc6
Merge pull request #122 from cmu-delphi/jingjing/backfill_correction_…
jingjtang Sep 23, 2022
80e8190
Merge pull request #121 from cmu-delphi/ndefries/bcorr-train-test-sep
jingjtang Sep 23, 2022
58e4970
Merge pull request #120 from cmu-delphi/ndefries/bcorr-package-org
jingjtang Sep 23, 2022
a0537bb
Merge pull request #119 from cmu-delphi/ndefries/bcorr_main_looping
jingjtang Sep 23, 2022
f710a7a
Merge branch 'jingjing/backfill_correction' from covid-19 into ndefri…
nmdefries Sep 23, 2022
bcdd91a
line-ending comma in template params
nmdefries Sep 23, 2022
2ec8921
Merge pull request #1699 from cmu-delphi/bot/sync-prod-main
krivard Sep 23, 2022
2b0e5d7
Update Backfill_Correction/delphiBackfillCorrection/DESCRIPTION
jingjtang Sep 23, 2022
29f3355
Update Backfill_Correction/correct_local_signal.R
jingjtang Sep 23, 2022
3c62f80
update the get_weekofmonths functions
jingjtang Sep 24, 2022
3f26915
add test cases for updated get_weekofmonths
jingjtang Sep 24, 2022
814e754
fix a typo
jingjtang Sep 24, 2022
dfaa223
Add prediction example and model file name suffix
jingjtang Sep 24, 2022
873f3cc
fix an error in the makefile
jingjtang Sep 24, 2022
9f1db9e
remove bash-init.sh use
nmdefries Sep 23, 2022
06c3d4e
make named log file
nmdefries Sep 26, 2022
9ad8073
move covidcst import; rebuild package
nmdefries Sep 26, 2022
55dcad9
compare successes to # of taus
nmdefries Sep 26, 2022
c485807
update testpath in readme
nmdefries Sep 26, 2022
d37ae59
smallest system-derived # cores is 1
nmdefries Sep 26, 2022
2f724f9
fixed an error in the unittest
Sep 26, 2022
775e84a
remove the tooling script
Sep 26, 2022
5ee9860
remove tooling script runner
nmdefries Sep 26, 2022
46fdfd9
drop uppercase from dir name
nmdefries Sep 28, 2022
fb38e07
remove local test files
nmdefries Sep 29, 2022
4f8c446
remove local testing .Rd and imports
nmdefries Sep 29, 2022
c482b2e
fix "no visible binding" warnings
nmdefries Sep 29, 2022
08d2c2f
Merge pull request #1695 from cmu-delphi/ndefries/final-contingency-f…
krivard Sep 29, 2022
ee6d459
Update backfill_corrections/delphiBackfillCorrection/R/main.R
jingjtang Sep 29, 2022
c29665c
Update backfill_corrections/README.md
jingjtang Sep 29, 2022
94ed22b
remove tooling.R comments from readme
nmdefries Sep 30, 2022
88dd5d3
test no rows dropped during preprocess
nmdefries Sep 30, 2022
beb0f4c
Merge branch 'main' into ndefries/gs-deprecated-pandas-fns
nmdefries Sep 30, 2022
7d4d762
add new test file
nmdefries Sep 30, 2022
ab23148
explicitly set sum::numeric_only to suppress warning
nmdefries Sep 30, 2022
2e9a978
make new test file smaller
nmdefries Sep 30, 2022
5535a53
arg spacing
nmdefries Oct 3, 2022
dd5e300
arg spacing
nmdefries Oct 3, 2022
cf4c7b7
Merge pull request #1648 from cmu-delphi/ndefries/final-qsf-fixes
krivard Oct 3, 2022
052bdbc
preallocate output dfs list and concat outside loop for speed
nmdefries Oct 3, 2022
1872b81
use existing test data for test_null_rows
nmdefries Oct 3, 2022
3b16f42
remove facebook dir
nmdefries Oct 6, 2022
9bf17c1
remove facebook CI/CD setup; drop from Jenkinsfile
nmdefries Oct 6, 2022
90a8ce0
keep docker workflow, but drop facebook from package list
nmdefries Oct 6, 2022
ebff6c1
Merge pull request #1703 from cmu-delphi/ndefries/deprecate-ctis-pipe…
krivard Oct 17, 2022
58a57df
Merge pull request #1701 from cmu-delphi/ndefries/merge-backfill-corr…
krivard Oct 17, 2022
2d9df04
Merge pull request #1497 from cmu-delphi/ndefries/gs-deprecated-panda…
krivard Oct 17, 2022
7e744ce
chore: bump covidcast-indicators to 0.3.25
Oct 19, 2022
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 0.3.24
+current_version = 0.3.25
commit = True
message = chore: bump covidcast-indicators to {new_version}
tag = False
2 changes: 1 addition & 1 deletion .github/workflows/build-container-images.yml
@@ -9,7 +9,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-packages: [ facebook ]
+packages: [ ]
steps:
- name: Checkout code
uses: actions/checkout@v2
61 changes: 0 additions & 61 deletions .github/workflows/r-ci.yml

This file was deleted.

2 changes: 1 addition & 1 deletion Jenkinsfile
@@ -9,7 +9,7 @@
- Keep in sync with '.github/workflows/python-ci.yml'.
- TODO: #527 Get this list automatically from python-ci.yml at runtime.
*/
-def indicator_list = ["changehc", "claims_hosp", "facebook", "google_symptoms", "hhs_hosp", "jhu", "nchs_mortality", "quidel", "quidel_covidtest", "safegraph_patterns", "sir_complainsalot", "usafacts", "dsew_community_profile", "doctor_visits"]
+def indicator_list = ["changehc", "claims_hosp", "google_symptoms", "hhs_hosp", "jhu", "nchs_mortality", "quidel", "quidel_covidtest", "safegraph_patterns", "sir_complainsalot", "usafacts", "dsew_community_profile", "doctor_visits"]
def build_package = [:]
def deploy_staging = [:]
def deploy_production = [:]
63 changes: 0 additions & 63 deletions ansible/templates/facebook-params-prod.json.j2

This file was deleted.

34 changes: 34 additions & 0 deletions backfill_corrections/Makefile
@@ -0,0 +1,34 @@
SHELL:=/bin/bash

TODAY:=$(shell date -u +"%Y-%m-%d")
CURR_TIME:=$(shell date -u +"%Hh%Mm%Ss")
LOG_FILE:=$(TODAY)_$(CURR_TIME).log

default:
	@echo No default implemented yet

install: dev

dev: delphiBackfillCorrection_1.0.tar.gz
	R CMD INSTALL delphiBackfillCorrection_1.0.tar.gz

lib:
	R -e 'roxygen2::roxygenise("delphiBackfillCorrection")'

run-R:
	time Rscript run.R 2>&1 | tee $(LOG_FILE)
	grep "backfill correction completed successfully" $(LOG_FILE)
	grep "scheduled core" $(LOG_FILE) ; \
	[ "$$?" -eq 1 ]

coverage:
	Rscript -e 'covr::package_coverage("delphiBackfillCorrection")'

# best we can do
lint: coverage

test: delphiBackfillCorrection_1.0.tar.gz
	R CMD check --test-dir=unit-tests $<

delphiBackfillCorrection_1.0.tar.gz: $(wildcard delphiBackfillCorrection/R/*.R)
	R CMD build delphiBackfillCorrection
120 changes: 120 additions & 0 deletions backfill_corrections/README.md
@@ -0,0 +1,120 @@
# Backfill Correction

## Running the Pipeline

The indicator is run by installing the package `delphiBackfillCorrection` and
running the script "run.R". To install the package, run the following code
from this directory:

```
make install
```

All of the user-changeable parameters are stored in `params.json`. A basic
template is included as `params.json.template`. Default values are provided
for most parameters; `input_dir` is the only required parameter.
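
As a rough illustration of the "defaults plus one required parameter" behaviour described above, the following Python sketch merges user settings over defaults and rejects a file that lacks `input_dir`. It is not the package's R code: the key names `export_dir` and `cache_dir` and the function `load_params` are assumptions made for this example, and the default values are taken only from the directory names mentioned in this README.

```python
import json

# Hypothetical defaults for illustration only; the real defaults are set
# inside the R package and may use different key names.
DEFAULT_PARAMS = {"export_dir": "receiving", "cache_dir": "cache"}


def load_params(text: str) -> dict:
    """Parse params JSON, fill in defaults, and require `input_dir`."""
    params = {**DEFAULT_PARAMS, **json.loads(text)}
    if "input_dir" not in params:
        raise ValueError("`input_dir` must be set in params.json")
    return params


params = load_params('{"input_dir": "input"}')
print(params["input_dir"], params["export_dir"])
```

User-supplied values take precedence because they are unpacked after the defaults.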

To execute the module and produce the output datasets (by default, in
`receiving`), run the following:

```
Rscript run.R
```

Default values are provided for most parameters; `input_dir`,
`test_start_date`, and `test_end_date` must be provided as command line
arguments.

## Building and testing the code

The documentation for the package is written using the **roxygen2** package. To
(re)-create this documentation for the package, run the following from the package
directory:

```
make lib
```

Testing the package is done with the built-in R package checks (which include
both static and dynamic checks), as well as unit tests written with
**testthat**. To run all of these, use the following from within this
directory:

```
make test
```

None of the tests should fail, and notes and warnings should be manually
checked for issues. To see the code coverage from the tests and examples, run
the following:

```
make coverage
```

There should be good coverage of all the core functions in the package.

### Writing tests

Because the package tests involve reading and writing files, we must be
careful with working directories to ensure the tests are portable.

For reading and writing to files contained in the `unit-tests/testthat/` directory,
use the `testthat::test_path` function. It works much like `file.path` but
automatically provides paths relative to `unit-tests/testthat/`, so e.g.
`test_path("input")` becomes `unit-tests/testthat/input/` or whatever relative path
is needed to get there.

`params.json` files contain paths, so `unit-tests/testthat/helper-relativize.R`
contains `relativize_params`, which takes a `params` list and applies
`test_path` to all of its path components. This object can then be passed to
anything that needs it to read or write files.
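
The idea behind `relativize_params` can be sketched in Python as follows. This is only an analogy to the R helper, not its implementation: the set of path-like keys and the `base` argument are assumptions made for the example, and `lag_pad` stands in for any non-path parameter.

```python
from pathlib import Path


def relativize_params(params: dict, base: Path) -> dict:
    """Return a copy of `params` with path-like entries resolved against `base`."""
    path_keys = {"input_dir", "export_dir", "cache_dir"}  # hypothetical key names
    return {
        key: str(base / value) if key in path_keys else value
        for key, value in params.items()
    }


test_root = Path("unit-tests") / "testthat"
print(relativize_params({"input_dir": "input", "lag_pad": 2}, test_root))
```

Only the path components are rewritten; every other setting passes through unchanged, so the result can be handed to code that reads or writes files.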

### Testing during development

Repeatedly building the package and running the full check suite is tedious if
you are working on fixing a failing test. A faster workflow is this:

1. Set your R working directory to `delphiBackfillCorrection/unit-tests/testthat`.
2. Run `testthat::test_dir('.')`

This will test the live code without having to rebuild the package.

## Outline of the Indicator

TODO

### Data requirements

Required columns with fixed column names:

- geo_value: strings or floating numbers to indicate the location
- time_value: reference date
- lag: the number of days between issue date and the reference date
- issue_date: issue date/report, required if lag is not available
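
When only `issue_date` is available, `lag` follows from the definition above as the number of days between the issue date and the reference date. A minimal Python sketch, assuming ISO-formatted date strings (the helper name `compute_lag` is made up for this example):

```python
from datetime import date


def compute_lag(time_value: str, issue_date: str) -> int:
    """Days between the issue date and the reference date (YYYY-MM-DD strings)."""
    return (date.fromisoformat(issue_date) - date.fromisoformat(time_value)).days


row = {"geo_value": "pa", "time_value": "2022-01-01", "issue_date": "2022-01-02"}
row["lag"] = compute_lag(row["time_value"], row["issue_date"])
print(row["lag"])  # 1
```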

Required columns without fixed column names (column names must be specified in [TODO]):

- num_col: the column for the number of reported counts of the numerator. e.g.
the number of COVID claims counts according to insurance data.
- denom_col: the column for the number of reported counts of the denominator.
e.g. the number of total claims counts according to insurance data. Required
if correcting ratios.

## Output Files

The pipeline produces two output types:

1. Predictions

| geo_value | time_value | lag | value | predicted_tauX | ... | wis |
| --- | --- | --- | --- | --- | --- | --- |
| pa | 2022-01-01 | 1 | 0.1 | 0 | ... | 0.01 |

2. Model objects. In production, models are trained on the last year of
versions (as-of dates) and the last year of reference (report) dates. For
one signal at the state level, a model takes about 30 minutes to train. Due
to resource limitations in production, we only train models once a month
and save the model objects between runs. By default, these are saved to the
`cache` directory with the suffix `.model`.
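
The reuse of saved model objects between runs amounts to a load-or-train pattern. The Python sketch below illustrates that pattern only; it is not the package's R code, and the function `load_or_train`, the use of `pickle`, and the example file name are all assumptions.

```python
from pathlib import Path
import pickle
import tempfile


def load_or_train(model_path: Path, train):
    """Load a cached model if present; otherwise train one and cache it."""
    if model_path.exists():
        with model_path.open("rb") as f:
            return pickle.load(f)
    model = train()
    model_path.parent.mkdir(parents=True, exist_ok=True)
    with model_path.open("wb") as f:
        pickle.dump(model, f)
    return model


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; only the `.model` suffix comes from the README.
    path = Path(tmp) / "cache" / "example_signal_state.model"
    calls = []

    def train():
        calls.append(1)
        return {"coef": [0.1, 0.2]}  # stand-in for a fitted model

    first = load_or_train(path, train)
    second = load_or_train(path, train)
    print(len(calls))  # 1
```

The expensive training step runs only on the first call; the second call reads the cached object from disk.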

Binary file not shown.
36 changes: 36 additions & 0 deletions backfill_corrections/delphiBackfillCorrection/DESCRIPTION
@@ -0,0 +1,36 @@
Package: delphiBackfillCorrection
Type: Package
Title: Correct signal outliers
Version: 1.0
Date: 2022-08-24
Author: Jingjing Tang
Maintainer: Jingjing Tang <[email protected]>
Description: Takes auxiliary output from COVIDcast API data pipelines and
    adjusts unusual values using a lasso-penalized quantile regression.
    Output is used for research and model development.
License: file LICENSE
Depends:
    R (>= 3.5.0),
Imports:
    dplyr,
    readr,
    tibble,
    stringr,
    covidcast,
    quantgen,
    arrow,
    evalcast,
    jsonlite,
    lubridate,
    tidyr,
    zoo,
    utils,
    rlang,
    parallel
Suggests:
    knitr (>= 1.15),
    rmarkdown (>= 1.4),
    testthat (>= 1.0.1),
    covr (>= 2.2.2)
RoxygenNote: 7.2.0
Encoding: UTF-8