Description
Zeros appear in JHU incidence signals when the cumulative number of cases or deaths remains the same between one day and the next. This does not necessarily mean that zero new cases or deaths occurred on that day: many regions have downgraded their reporting cadence, so that all the new cases or deaths that occurred in a week are collected together and reported in a batch on e.g. Thursdays.
Alas, a zero does not rule out that no new cases or deaths occurred on that day either: small regions often go many weeks between actual new cases or deaths, so many, and possibly all, of their zeros are true zeros.
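To illustrate the ambiguity, here is a minimal sketch that derives incidence from cumulative counts by day-over-day differencing. The function name and both series are made up for illustration and are not taken from the JHU indicator code.

```python
def incidence_from_cumulative(cumulative):
    """Return day-over-day differences of a cumulative series."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]


# Hypothetical large region that reports all new cases once a week in a batch.
weekly_batch = [100, 100, 100, 100, 170, 170, 170, 170]
# Hypothetical small region that genuinely has no new cases on most days.
small_region = [3, 3, 3, 3, 4, 4, 4, 4]

batch_incidence = incidence_from_cumulative(weekly_batch)
small_incidence = incidence_from_cumulative(small_region)

print(batch_incidence)  # [0, 0, 0, 70, 0, 0, 0]
print(small_incidence)  # [0, 0, 0, 1, 0, 0, 0]
```

The zeros fall on the same days in both series, so the incidence values alone cannot tell a "not reported" day from a true zero.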
It seems like a maintenance nightmare to try to keep accurate track of which regions are expected to report on which days so that we can suppress reporting of zeros on other days. It also seems like it would be difficult to decide consistently whether a region is "small enough" that its zeros are true zeros.
We should switch to never reporting zeros in incidence signals from JHU. It is much less harmful for a researcher to misinterpret the days before a small count in a small region as "not reported" days than for that researcher to misinterpret a batch report from a large region as a jump from zero cases to a large number in a single day.
This change will require:
- changing the jhu indicator code to drop the zero rows in incidence data frames before writing output
- updating the documentation to describe this behavior
- a one-off modification to the database to remove the zero rows in incidence signals for JHU
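The first item could look roughly like the sketch below. It assumes the incidence data is held in a pandas data frame with a value column named `val`; the column names, the sample rows, and the helper `drop_zero_incidence` are hypothetical and not taken from the indicator code.

```python
import pandas as pd


def drop_zero_incidence(df, value_col="val"):
    """Return a copy of df without rows whose incidence value is exactly zero.

    Missing (NaN) and negative values are kept; only exact zeros are dropped.
    """
    return df[df[value_col] != 0].reset_index(drop=True)


# Made-up incidence rows for two hypothetical regions.
incidence = pd.DataFrame({
    "geo_id": ["01001", "01001", "01001", "01003"],
    "timestamp": pd.to_datetime(
        ["2021-06-01", "2021-06-02", "2021-06-03", "2021-06-01"]
    ),
    "val": [0.0, 12.0, 0.0, 3.0],
})

filtered = drop_zero_incidence(incidence)
print(filtered)
```

The filter would apply only to incidence signals. Cumulative signals are unaffected by this proposal, so the call would sit just before the incidence output is written.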