Commit 833e818

Merge pull request #1913 from cmu-delphi/nwss
Nwss: state+nation level
2 parents 56d2d25 + 6ae4d2d commit 833e818

23 files changed, +1496 -26 lines changed

.github/workflows/python-ci.yml

Lines changed: 34 additions & 22 deletions
@@ -5,37 +5,49 @@ name: Python package
 
 on:
   push:
-    branches: [ main, prod ]
+    branches: [main, prod]
   pull_request:
-    types: [ opened, synchronize, reopened, ready_for_review ]
-    branches: [ main, prod ]
+    types: [opened, synchronize, reopened, ready_for_review]
+    branches: [main, prod]
 
 jobs:
   build:
     runs-on: ubuntu-20.04
     if: github.event.pull_request.draft == false
     strategy:
       matrix:
-        packages: [_delphi_utils_python, changehc, claims_hosp, doctor_visits, google_symptoms, hhs_hosp, nchs_mortality, quidel_covidtest, sir_complainsalot]
+        packages:
+          [
+            _delphi_utils_python,
+            changehc,
+            claims_hosp,
+            doctor_visits,
+            google_symptoms,
+            hhs_hosp,
+            nchs_mortality,
+            nwss_wastewater,
+            quidel_covidtest,
+            sir_complainsalot,
+          ]
     defaults:
       run:
         working-directory: ${{ matrix.packages }}
     steps:
-    - uses: actions/checkout@v2
-    - name: Set up Python 3.8
-      uses: actions/setup-python@v2
-      with:
-        python-version: 3.8
-    - name: Install testing dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install pylint pytest pydocstyle wheel
-    - name: Install
-      run: |
-        make install-ci
-    - name: Lint
-      run: |
-        make lint
-    - name: Test
-      run: |
-        make test
+      - uses: actions/checkout@v2
+      - name: Set up Python 3.8
+        uses: actions/setup-python@v2
+        with:
+          python-version: 3.8
+      - name: Install testing dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install pylint pytest pydocstyle wheel
+      - name: Install
+        run: |
+          make install-ci
+      - name: Lint
+        run: |
+          make lint
+      - name: Test
+        run: |
+          make test

Jenkinsfile

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
    - TODO: #527 Get this list automatically from python-ci.yml at runtime.
  */
 
-def indicator_list = ["backfill_corrections", "changehc", "claims_hosp", "google_symptoms", "hhs_hosp", "nchs_mortality", "quidel_covidtest", "sir_complainsalot", "doctor_visits"]
+def indicator_list = ["backfill_corrections", "changehc", "claims_hosp", "google_symptoms", "hhs_hosp", "nchs_mortality", "quidel_covidtest", "sir_complainsalot", "doctor_visits", "nwss_wastewater"]
 def build_package_main = [:]
 def build_package_prod = [:]
 def deploy_staging = [:]

Lines changed: 29 additions & 1 deletion
@@ -1,13 +1,41 @@
 """Unified not-a-number codes for CMU Delphi codebase."""
 
 from enum import IntEnum
+import pandas as pd
+
 
 class Nans(IntEnum):
-    """An enum of not-a-number codes for the indicators."""
+    """An enum of not-a-number codes for the indicators.
+
+    See the descriptions here: https://cmu-delphi.github.io/delphi-epidata/api/missing_codes.html
+    """
 
     NOT_MISSING = 0
     NOT_APPLICABLE = 1
     REGION_EXCEPTION = 2
     CENSORED = 3
     DELETED = 4
     OTHER = 5
+
+
+def add_default_nancodes(df: pd.DataFrame):
+    """Add some default nancodes to the dataframe.
+
+    This method sets the `"missing_val"` column to NOT_MISSING whenever the
+    `"val"` column has `isnull()` as `False`; if `isnull()` is `True`, then it
+    sets `"missing_val"` to `OTHER`. It also sets both the `"missing_se"` and
+    `"missing_sample_size"` columns to `NOT_APPLICABLE`.
+
+    Returns
+    -------
+    pd.DataFrame
+    """
+    # Default missingness codes
+    df["missing_val"] = Nans.NOT_MISSING
+    df["missing_se"] = Nans.NOT_APPLICABLE
+    df["missing_sample_size"] = Nans.NOT_APPLICABLE
+
+    # Mark any remaining nans with unknown
+    remaining_nans_mask = df["val"].isnull()
+    df.loc[remaining_nans_mask, "missing_val"] = Nans.OTHER
+    return df

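As a quick illustration of the rule described in the `add_default_nancodes` docstring, the sketch below applies the same logic to a list of plain dict rows, without pandas. The `default_nancodes` helper and the sample rows are hypothetical; they only mirror the documented behavior and are not part of `delphi_utils`.

```python
import math
from enum import IntEnum


class Nans(IntEnum):
    """Local copy of the missingness codes shown in the diff."""

    NOT_MISSING = 0
    NOT_APPLICABLE = 1
    REGION_EXCEPTION = 2
    CENSORED = 3
    DELETED = 4
    OTHER = 5


def default_nancodes(rows):
    """Hypothetical stand-in for add_default_nancodes, operating on dicts."""
    for row in rows:
        val = row.get("val")
        # Treat both None and float NaN as "null", roughly like isnull().
        is_null = val is None or (isinstance(val, float) and math.isnan(val))
        row["missing_val"] = Nans.OTHER if is_null else Nans.NOT_MISSING
        row["missing_se"] = Nans.NOT_APPLICABLE
        row["missing_sample_size"] = Nans.NOT_APPLICABLE
    return rows


rows = default_nancodes([{"val": 1.5}, {"val": float("nan")}, {"val": None}])
codes = [int(r["missing_val"]) for r in rows]
print(codes)  # [0, 5, 5]
```

Like the function in the diff, this sketch mutates its input in place and also returns it, so callers can use either style.
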
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+{
+  "common": {
+    "export_dir": "./receiving",
+    "log_filename": "./nwss_wastewater.log",
+    "log_exceptions": false
+  },
+  "indicator": {
+    "wip_signal": true,
+    "export_start_date": "2020-02-01",
+    "static_file_dir": "./static",
+    "token": ""
+  }
+}
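The template leaves `token` empty, so a local `params.json` has to be filled in before the indicator can fetch data. A minimal sketch of how such a file could be checked before a run, using only the standard library; the `empty_settings` helper is hypothetical and not part of this commit:

```python
import json

# Contents of the parameter template added in this commit.
TEMPLATE = """
{
  "common": {
    "export_dir": "./receiving",
    "log_filename": "./nwss_wastewater.log",
    "log_exceptions": false
  },
  "indicator": {
    "wip_signal": true,
    "export_start_date": "2020-02-01",
    "static_file_dir": "./static",
    "token": ""
  }
}
"""


def empty_settings(params):
    """Return dotted names of settings whose value is an empty string."""
    return [
        f"{section}.{key}"
        for section, values in params.items()
        for key, value in values.items()
        if value == ""
    ]


params = json.loads(TEMPLATE)
unset = empty_settings(params)
print(unset)  # ['indicator.token']
```
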

nchs_mortality/README.md

Lines changed: 2 additions & 2 deletions
@@ -8,9 +8,9 @@ the state-level data as-is. For detailed information see the files
 `MyAppToken` is required when fetching data from SODA Consumer API
 (https://dev.socrata.com/foundry/data.cdc.gov/r8kw-7aab). Follow the
 steps below to create a MyAppToken.
-- Click the `Sign up for an app toekn` buttom in the linked website
+- Click the `Sign up for an app token` button in the linked website
 - Sign In or Sign Up with Socrata ID
-- Clck the `Create New App Token` button
+- Click the `Create New App Token` button
 - Fill in `Application Name` and `Description` (You can just use NCHS_Mortality
 for both) and click `Save`
 - Copy the `App Token`

nwss_wastewater/.pylintrc

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+
+[MESSAGES CONTROL]
+
+disable=logging-format-interpolation,
+        too-many-locals,
+        too-many-arguments,
+        # Allow pytest functions to be part of a class.
+        no-self-use,
+        # Allow pytest classes to have one test.
+        too-few-public-methods
+
+[BASIC]
+
+# Allow arbitrarily short-named variables.
+variable-rgx=[a-z_][a-z0-9_]*
+argument-rgx=[a-z_][a-z0-9_]*
+attr-rgx=[a-z_][a-z0-9_]*
+
+[DESIGN]
+
+# Don't complain about pytest "unused" arguments.
+ignored-argument-names=(_.*|run_as_module)

nwss_wastewater/DETAILS.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+# NWSS wastewater data
+
+We import the wastewater data, including percentile, raw counts, and smoothed data, from the CDC website, aggregate to the state level from the sub-county wastewater treatment plant level, and export the aggregated data.
+
+For the time being, we only export the state-level aggregations of the data. This includes aggregating cities into their respective states.
+Ideally we will export the state level, the county level, and the wastewater treatment plant level, and possibly an exact mirror that includes sample sites as well.
+## Geographical Levels
+* `state`: reported using two-letter postal code
+## Metrics
+* `percentile`: This metric shows whether SARS-CoV-2 virus levels at a site are currently higher or lower than past historical levels at the same site. 0% means levels are the lowest they have been at the site; 100% means levels are the highest they have been at the site. Public health officials watch for increasing levels of the virus in wastewater over time and use this data to help make public health decisions.
+* `ptc_15d`: The percent change in SARS-CoV-2 RNA levels over the 15-day interval defined by 'date_start' and 'date_end'.
+Percent change is calculated as the modeled change over the interval, based on linear regression of log-transformed SARS-CoV-2 levels.
+SARS-CoV-2 RNA levels are wastewater concentrations that have been normalized for wastewater composition.
+* `detect_prop_15d`: The proportion of tests with SARS-CoV-2 detected, meaning a cycle threshold (Ct) value <40 for RT-qPCR or at least 3 positive droplets/partitions for RT-ddPCR, by sewershed over the 15-day window defined by 'date_start' and 'date_end'. The detection proportion is the percent calculated by dividing the 15-day rolling sum of SARS-CoV-2 detections by the 15-day rolling sum of the number of tests for each sewershed and multiplying by 100.
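The two derived metrics above can be sketched in a few lines of standard-library Python. This is only an illustration of the definitions as worded in `DETAILS.md`, not the CDC's or this indicator's code: the function names are hypothetical, the use of the natural log and of the fitted slope times the interval length as the "modeled change" is an assumption, and so is the handling of partial windows at the start of a series.

```python
import math


def detect_prop(detections, tests, window=15):
    """Percent of tests with a detection over a trailing window of daily counts.

    Assumption: windows at the start of the series are simply shorter.
    """
    props = []
    for end in range(len(tests)):
        start = max(0, end - window + 1)
        n_tests = sum(tests[start:end + 1])
        n_detect = sum(detections[start:end + 1])
        props.append(100 * n_detect / n_tests if n_tests else None)
    return props


def percent_change(days, levels):
    """Percent change implied by a least-squares fit of ln(level) on day.

    Assumption: the "modeled change" is the fitted slope applied over the
    full span of `days`, converted back from the log scale.
    """
    logs = [math.log(v) for v in levels]
    mean_x = sum(days) / len(days)
    mean_y = sum(logs) / len(logs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs)) / sum(
        (x - mean_x) ** 2 for x in days
    )
    span = max(days) - min(days)
    return 100 * (math.exp(slope * span) - 1)


# Toy inputs: a 2-day window keeps the arithmetic easy to check by hand.
props = detect_prop([1, 0, 1, 1], [2, 2, 2, 2], window=2)
print(props)  # [50.0, 25.0, 25.0, 50.0]

# Levels that double every 7 days quadruple over 14 days, i.e. about +300%.
change = percent_change([0, 7, 14], [100, 200, 400])
```
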

nwss_wastewater/Makefile

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+.PHONY: venv lint test clean
+
+dir = $(shell find ./delphi_* -name __init__.py | grep -o 'delphi_[_[:alnum:]]*' | head -1)
+venv:
+	python3.8 -m venv env
+
+install: venv
+	. env/bin/activate; \
+	pip install wheel ; \
+	pip install -e ../_delphi_utils_python ;\
+	pip install -e .
+
+install-ci: venv
+	. env/bin/activate; \
+	pip install wheel ; \
+	pip install ../_delphi_utils_python ;\
+	pip install .
+
+lint:
+	. env/bin/activate; pylint $(dir)
+	. env/bin/activate; pydocstyle $(dir)
+
+test:
+	. env/bin/activate ;\
+	(cd tests && ../env/bin/pytest --cov=$(dir) --cov-report=term-missing)
+
+clean:
+	rm -rf env
+	rm -f params.json

nwss_wastewater/README.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+# NWSS wastewater data
+
+We import the wastewater data, currently only the smoothed concentration, from the CDC website, aggregate to the state and national level from the wastewater sample site level, and export the aggregated data.
+For details see the `DETAILS.md` file in this directory.
+
+## Create a MyAppToken
+`MyAppToken` is required when fetching data from SODA Consumer API
+(https://dev.socrata.com/foundry/data.cdc.gov/r8kw-7aab). Follow the
+steps below to create a MyAppToken.
+- Click the `Sign up for an app token` button in the linked website
+- Sign In or Sign Up with Socrata ID
+- Click the `Create New App Token` button
+- Fill in `Application Name` and `Description` (You can just use delphi_wastewater
+for both) and click `Save`
+- Copy the `App Token`
+
+
+## Running the Indicator
+
+The indicator is run by directly executing the Python module contained in this
+directory. The safest way to do this is to create a virtual environment,
+install the common DELPHI tools, and then install the module and its
+dependencies. To do this, run the following command from this directory:
+
+```
+make install
+```
+
+This command will install the package in editable mode, so you can make changes that
+will automatically propagate to the installed package.
+
+All of the user-changeable parameters are stored in `params.json`. To execute
+the module and produce the output datasets (by default, in `receiving`), run
+the following:
+
+```
+env/bin/python -m delphi_nwss
+```
+
+If you want to enter the virtual environment in your shell,
+you can run `source env/bin/activate`. Run `deactivate` to leave the virtual environment.
+
+Once you are finished, you can remove the virtual environment and
+params file with the following:
+
+```
+make clean
+```
+
+## Testing the code
+
+To run static tests of the code style, run the following command:
+
+```
+make lint
+```
+
+Unit tests are also included in the module. To execute these, run the following
+command from this directory:
+
+```
+make test
+```
+
+To run individual tests, run the following:
+
+```
+(cd tests && ../env/bin/pytest <your_test>.py --cov=delphi_nwss --cov-report=term-missing)
+```
+
+The output will show the number of unit tests that passed and failed, along
+with the percentage of code covered by the tests.
+
+None of the linting or unit tests should fail, and the code lines that are not covered by unit tests should be small and
+should not include critical sub-routines.

nwss_wastewater/REVIEW.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+## Code Review (Python)
+
+A code review of this module should include a careful look at the code and the
+output. To assist in the process, but certainly not in place of it, please
+check the following items.
+
+**Documentation**
+
+- [ ] the README.md file template is filled out and currently accurate; it is
+possible to load and test the code using only the instructions given
+- [ ] minimal docstrings (one line describing what the function does) are
+included for all functions; full docstrings describing the inputs and expected
+outputs should be given for non-trivial functions
+
+**Structure**
+
+- [ ] code should pass lint checks (`make lint`)
+- [ ] any required metadata files are checked into the repository and placed
+within the directory `static`
+- [ ] any intermediate files that are created and stored by the module should
+be placed in the directory `cache`
+- [ ] final expected output files to be uploaded to the API are placed in the
+`receiving` directory; output files should not be committed to the repository
+- [ ] all options and API keys are passed through the file `params.json`
+- [ ] template parameter file (`params.json.template`) is checked into the
+code; no personal (i.e., usernames) or private (i.e., API keys) information is
+included in this template file
+
+**Testing**
+
+- [ ] module can be installed in a new virtual environment (`make install`)
+- [ ] reasonably high level of unit test coverage covering all of the main logic
+of the code (e.g., missing coverage for raised errors that do not currently seem
+possible to reach are okay; missing coverage for options that will be needed are
+not)
+- [ ] all unit tests run without errors (`make test`)
+- [ ] indicator directory has been added to GitHub CI
+(`covidcast-indicators/.github/workflows/python-ci.yml`)
