Title: | Nowcasting by Bayesian Smoothing |
Version: | 1.1.0 |
Description: | A Bayesian approach to estimate the number of occurred-but-not-yet-reported cases from incomplete, time-stamped reporting data for disease outbreaks. 'NobBS' learns the reporting delay distribution and the time evolution of the epidemic curve to produce smoothed nowcasts in both stable and time-varying case reporting settings, as described in McGough et al. (2020) <doi:10.1371/journal.pcbi.1007735>. |
Depends: | R (≥ 3.3.0) |
SystemRequirements: | JAGS (http://mcmc-jags.sourceforge.net/) for analysis of Bayesian hierarchical models |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | dplyr, rlang, rjags, coda, magrittr |
RoxygenNote: | 7.3.2 |
Suggests: | knitr, rmarkdown, scoringutils (≥ 2.0.0), ggplot2 |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-04-29 18:57:54 UTC; ry2460 |
Author: | Rami Yaari [cre, aut], Rodrigo Zepeda Tello [aut, ctb], Sarah McGough [aut, ctb], Nicolas Menzies [aut], Marc Lipsitch [aut], Michael Johansson [aut], Teresa Yamana [ctb], Matteo Perini [ctb] |
Maintainer: | Rami Yaari <ry2460@cumc.columbia.edu> |
Repository: | CRAN |
Date/Publication: | 2025-05-07 12:30:25 UTC |
Produce smooth Bayesian nowcasts of incomplete, time-stamped reporting data.
Description
Nowcasting is useful to estimate the true number of cases when they are unknown or incomplete in the present because of reporting delays. 'NobBS' is a Bayesian nowcasting approach that learns from the reporting delay distribution as well as the temporal evolution of the epidemic curve to estimate the number of occurred but not yet reported cases for a given date.
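To make the idea of a reporting delay concrete, the following base R sketch tabulates a small line list into a "reporting triangle" (cases by onset week and delay). The data frame `linelist` and its values are hypothetical and are not part of the package; this only illustrates the kind of input structure the description refers to, not how 'NobBS' is implemented internally.

```r
# Hypothetical line list: one row per case, weekly dates (not package data)
linelist <- data.frame(
  onset_week  = as.Date(c("2024-01-01", "2024-01-01", "2024-01-08",
                          "2024-01-08", "2024-01-15", "2024-01-15")),
  report_week = as.Date(c("2024-01-01", "2024-01-15", "2024-01-08",
                          "2024-01-15", "2024-01-15", "2024-01-22"))
)
now <- as.Date("2024-01-15")

# Keep only the cases that have already been reported by `now`
seen <- linelist[linelist$report_week <= now, ]

# Delay in whole weeks between onset and report
seen$delay <- as.integer(as.numeric(seen$report_week - seen$onset_week) / 7)

# Reporting triangle: rows are onset weeks, columns are delays
triangle <- table(onset_week = seen$onset_week, delay = seen$delay)
print(triangle)
```

The most recent onset weeks can only show short delays, which is why their observed counts are incomplete at the time of the nowcast.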
Usage
NobBS(
data,
now,
units,
onset_date,
report_date,
moving_window = NULL,
max_D = NULL,
cutoff_D = NULL,
add_dow_cov = FALSE,
proportion_reported = 1,
quiet = TRUE,
specs = list(dist = c("Poisson", "NB"), alpha1.mean.prior = 0, alpha1.prec.prior =
0.001, alphat.shape.prior = 0.001, alphat.rate.prior = 0.001, beta.priors = NULL,
gamma.mean.prior = rep(0, 6), gamma.prec.prior = rep(0.25, 6), param_names = NULL,
conf = 0.95, quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975), dispersion.prior = NULL,
nAdapt = 1000, nChains = 1, nBurnin = 1000, nThin = 1, nSamp = 10000)
)
Arguments
data |
A time series of reporting data in line list format (one row per case), with a column |
now |
An object of datatype |
units |
Time scale of reporting. Options: "1 day", "1 week". |
onset_date |
In quotations, the name of the column of datatype |
report_date |
In quotations, the name of the column of datatype |
moving_window |
Size of moving window for estimation of cases (numeric). The moving window size should be specified in the same date units as the reporting data (e.g. specify 7 to indicate 7 days or 7 weeks). Default: NULL, i.e. takes all historical dates into consideration. |
max_D |
Maximum possible delay observed or considered for estimation of the delay distribution (numeric). Default: (number of unique dates in the time series) - 1; or, if a moving window is specified, (size of moving window) - 1. |
cutoff_D |
Consider only delays d<= |
add_dow_cov |
Whether to add day-of-week covariates to the model (logical). Default: FALSE. |
proportion_reported |
A decimal greater than 0 and less than or equal to 1 representing the proportion of all cases expected to be reported. Default: 1, i.e. 100 percent of all cases will eventually be reported. For asymptomatic diseases where not all cases will ever be reported, or for outbreaks in which severe under-reporting is expected, change this to less than 1. |
quiet |
Suppress all output and progress bars from the JAGS process. Default: TRUE. |
specs |
A list with arguments specifying the Bayesian model used: |
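As an illustration of how a `specs` list might be customised, the sketch below copies the default values from the Usage signature into a list and overrides a few entries with `modifyList()`. The names `default_specs` and `nb_specs` are hypothetical, and taking "Poisson" as the default `dist` is an assumption based on the example comment that the default call assumes a Poisson distribution.

```r
# Defaults copied from the Usage signature; dist = "Poisson" is assumed to be
# the effective default of c("Poisson", "NB")
default_specs <- list(
  dist = "Poisson",
  alpha1.mean.prior = 0, alpha1.prec.prior = 0.001,
  alphat.shape.prior = 0.001, alphat.rate.prior = 0.001,
  beta.priors = NULL,
  gamma.mean.prior = rep(0, 6), gamma.prec.prior = rep(0.25, 6),
  param_names = NULL,
  conf = 0.95, quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975),
  dispersion.prior = NULL,
  nAdapt = 1000, nChains = 1, nBurnin = 1000, nThin = 1, nSamp = 10000
)

# Override only the entries of interest: negative binomial model,
# 90 percent prediction interval, fewer posterior samples
nb_specs <- modifyList(default_specs, list(dist = "NB", conf = 0.9, nSamp = 5000))

# The list would then be passed on, for example (requires JAGS, not run here):
# nowcast_nb <- NobBS(denguedat, as.Date("1990-04-09"), units = "1 week",
#                     onset_date = "onset_week", report_date = "report_week",
#                     specs = nb_specs)
str(nb_specs[c("dist", "conf", "nSamp", "nBurnin")])
```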
Value
The function returns a list with the following elements:
estimates: a 5-column data frame containing estimates for each date in the window of predictions (up to "now") with corresponding date of case onset, lower and upper bounds of the prediction interval, and the number of cases for that onset date reported up to "now". If quantiles is not NULL, added columns will report the estimates for the requested quantiles.
estimates.inflated: a Tx4 data frame containing estimates inflated by the proportion_reported for each date in the time series (up to "now") with corresponding date of case onset, lower and upper bounds of the prediction interval, and the number of cases for that onset date reported up to "now". If quantiles is not NULL, added columns will report the inflated estimates for the requested quantiles.
nowcast.post.samples: a vector of 10,000 samples from the posterior predictive distribution of the nowcast.
params.post: a 10,000xN data frame containing 10,000 posterior samples for the "N" parameters specified in specs[["param_names"]].
See McGough et al. 2019 (https://www.biorxiv.org/content/10.1101/663823v1) for a detailed explanation of parameters.
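The relationship between posterior samples, a point estimate with prediction interval, and an "inflated" estimate can be sketched in base R as follows. The vector `post_samples` is simulated here as a hypothetical stand-in for `nowcast.post.samples`, and both the use of the median and the reading of "inflated" as division by `proportion_reported` are assumptions for illustration, not a statement of how 'NobBS' computes its output.

```r
set.seed(1)

# Hypothetical stand-in for nowcast$nowcast.post.samples (not real NobBS output)
post_samples <- rpois(10000, lambda = 40)

# Equal-tailed interval for a given confidence level, as in specs$conf
conf <- 0.95
tail_prob <- (1 - conf) / 2
summary_now <- c(
  estimate = median(post_samples),
  lower    = unname(quantile(post_samples, tail_prob)),
  upper    = unname(quantile(post_samples, 1 - tail_prob))
)

# Assumed meaning of "inflated": scale up by the expected reporting proportion
proportion_reported <- 0.8
summary_inflated <- summary_now / proportion_reported

print(rbind(summary_now, summary_inflated))
```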
Notes
'NobBS' requires that JAGS (Just Another Gibbs Sampler) is installed on the system. JAGS can be downloaded at <http://mcmc-jags.sourceforge.net/>.
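One way to check this requirement before running a nowcast is to test whether the 'rjags' package (listed under Imports) can be loaded. This sketch assumes that 'rjags' fails to load when JAGS itself is missing; the variable name `jags_ok` is hypothetical.

```r
# requireNamespace() returns FALSE instead of raising an error
# when the package (or something it needs at load time) is unavailable
jags_ok <- requireNamespace("rjags", quietly = TRUE)

if (jags_ok) {
  message("rjags loaded; JAGS appears to be available.")
} else {
  message("rjags could not be loaded; install JAGS, then install.packages('rjags').")
}
```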
Examples
# Load the data
data(denguedat)
# Perform default 'NobBS' assuming Poisson distribution, vague priors, and default specifications.
nowcast <- NobBS(denguedat, as.Date("1990-04-09"), units = "1 week",
                 onset_date = "onset_week", report_date = "report_week")
nowcast$estimates
Stratified nowcasts of incomplete, time-stamped reporting data.
Description
Produces nowcasts stratified by a single variable of interest, e.g. by geographic unit (province/state/region) or by age group.
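To show what a stratifying variable looks like in a line list, the base R sketch below counts reported cases by onset week and stratum. The data frame `linelist` and its values are hypothetical; this is a data-inspection step only and does not reproduce the model fitted by NobBS.strat.

```r
# Hypothetical line list with a stratifying column (not package data)
linelist <- data.frame(
  onset_week  = as.Date(c("2024-01-01", "2024-01-01", "2024-01-08",
                          "2024-01-08", "2024-01-08")),
  report_week = as.Date(c("2024-01-01", "2024-01-08", "2024-01-08",
                          "2024-01-08", "2024-01-15")),
  gender      = c("Female", "Male", "Female", "Female", "Male")
)
now <- as.Date("2024-01-08")

# Cases already reported by `now`
seen <- linelist[linelist$report_week <= now, ]

# Reported counts per onset week, split by stratum
reported_by_stratum <- table(onset_week = seen$onset_week, gender = seen$gender)
print(reported_by_stratum)
```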
Usage
NobBS.strat(
data,
now,
units,
onset_date,
report_date,
strata,
moving_window = NULL,
max_D = NULL,
cutoff_D = NULL,
add_dow_cov = FALSE,
quiet = TRUE,
proportion_reported = 1,
specs = list(dist = c("Poisson", "NB"), alpha1.mean.prior = 0, alpha1.prec.prior =
0.001, alphat.shape.prior = 0.001, alphat.rate.prior = 0.001, beta.priors = NULL,
gamma.mean.prior = rep(0, 6), gamma.prec.prior = rep(0.25, 6), param_names = NULL,
conf = 0.95, quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975), dispersion.prior = NULL,
nAdapt = 1000, nChains = 1, nBurnin = 1000, nThin = 1, nSamp = 10000)
)
Arguments
data |
A time series of reporting data in line list format (one row per case), with a column |
now |
An object of datatype |
units |
Time scale of reporting. Options: "1 day", "1 week". |
onset_date |
In quotations, the name of the column of datatype |
report_date |
In quotations, the name of the column of datatype |
strata |
In quotations, the name of the column indicating the stratifying variable. |
moving_window |
Size of moving window for estimation of cases (numeric). The moving window size should be specified in the same date units as the reporting data (e.g. specify 7 to indicate 7 days or 7 weeks). Default: NULL, i.e. takes all historical dates into consideration. |
max_D |
Maximum possible delay observed or considered for estimation of the delay distribution (numeric). Default: (number of unique dates in the time series) - 1; or, if a moving window is specified, (size of moving window) - 1. |
cutoff_D |
Consider only delays d<= |
add_dow_cov |
Whether to add day-of-week covariates to the model (logical). Default: FALSE. |
quiet |
Suppress all output and progress bars from the JAGS process. Default: TRUE. |
proportion_reported |
A decimal greater than 0 and less than or equal to 1 representing the proportion of all cases expected to be reported. Default: 1, i.e. 100 percent of all cases will eventually be reported. For asymptomatic diseases where not all cases will ever be reported, or for outbreaks in which severe under-reporting is expected, change this to less than 1. |
specs |
A list with arguments specifying the Bayesian model used: |
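The interaction between `now`, `units`, `moving_window` and the default `max_D` described in the arguments above can be sketched in base R. The assumption here is that the window consists of the `moving_window` most recent time units ending at `now`; the names `step_days` and `window_dates` are hypothetical and the package's internal computation may differ.

```r
now <- as.Date("2024-03-04")
units <- "1 week"
moving_window <- 4

# Length of one time unit in days ("1 day" or "1 week")
step_days <- if (units == "1 week") 7 else 1

# Assumed window: the `moving_window` most recent units, ending at `now`
window_dates <- seq(now - (moving_window - 1) * step_days, now, by = units)

# Documented default for max_D when a moving window is specified
max_D <- moving_window - 1

print(window_dates)
print(max_D)
```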
Value
The function returns a list with the following elements:
estimates: a 5-column data frame containing estimates for each date in the window of predictions (up to "now") with corresponding date of case onset, lower and upper bounds of the prediction interval, and the number of cases for that onset date reported up to "now". If quantiles is not NULL, added columns will report the estimates for the requested quantiles.
estimates.inflated: a Tx4 data frame containing estimates inflated by the proportion_reported for each date in the time series (up to "now") with corresponding date of case onset, lower and upper bounds of the prediction interval, and the number of cases for that onset date reported up to "now". If quantiles is not NULL, added columns will report the inflated estimates for the requested quantiles.
nowcast.post.samples: a vector of 10,000 samples from the posterior predictive distribution of the nowcast.
params.post: a 10,000xN data frame containing 10,000 posterior samples for the "N" parameters specified in specs[["param_names"]].
See McGough et al. 2019 (https://www.biorxiv.org/content/10.1101/663823v1) for a detailed explanation of parameters.
Notes
'NobBS' requires that JAGS (Just Another Gibbs Sampler) is installed on the system. JAGS can be downloaded at <http://mcmc-jags.sourceforge.net/>.
Examples
# Load the data
data(denguedat)
# Perform stratified 'NobBS' assuming Poisson distribution, vague priors, and default
# specifications.
nowcast <- NobBS.strat(denguedat, as.Date("1990-02-05"), units = "1 week",
                       onset_date = "onset_week", report_date = "report_week",
                       strata = "gender")
nowcast$estimates
denguedat: Dengue fever reporting data from Puerto Rico
Description
Surveillance data from CDC Division of Vector-Borne Diseases.
1990-2010 case reporting data included.
The first column, onset_week, indicates the week of symptom onset.
The second column, report_week, indicates the week of case report.
The third column, gender, indicates the gender of the infected individual (randomly assigned with 0.5:0.5 probability of "Male"/"Female"). This column may be used to produce stratified nowcasts using the function NobBS.strat.
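A quick way to inspect reporting delays in data shaped like this is to compute the delay in weeks for each case and tabulate it. The data frame `toy` below is hypothetical and only borrows the column names onset_week and report_week; with the package installed, the same steps could presumably be applied to denguedat itself.

```r
# Hypothetical rows using the same column names as denguedat
toy <- data.frame(
  onset_week  = as.Date(c("1990-01-01", "1990-01-01", "1990-01-08", "1990-01-08")),
  report_week = as.Date(c("1990-01-01", "1990-01-15", "1990-01-15", "1990-01-08"))
)

# Delay between onset and report, in whole weeks
delay_weeks <- as.integer(as.numeric(toy$report_week - toy$onset_week) / 7)

# Empirical share of cases at each delay
delay_dist <- prop.table(table(delay_weeks))
print(delay_dist)
```

Such a raw tabulation understates long delays for recent onset weeks, because those reports have not yet arrived.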
Usage
data(denguedat)
Format
A data frame.
Examples
data(denguedat)
nowcast <- NobBS(denguedat, as.Date("1990-04-09"), units = "1 week",
                 onset_date = "onset_week", report_date = "report_week")
nowcast$estimates
mpoxdat: Mpox reporting data from the 2022 New York City Outbreak
Description
Surveillance line list data provided by the New York City (NYC) Health Department at https://github.com/nychealth/mpox_nowcast_eval, to accompany a nowcasting performance evaluation (doi: 10.2196/56495).
Patients with a confirmed or probable mpox diagnosis or illness onset from July 8 through September 30, 2022 were included.
The dataset contains 3323 rows and 4 columns.
The first column, dx_date, is the specimen collection date of the first positive mpox laboratory result.
The second column, dx_report_date, is the date the report of the first positive mpox laboratory result was received by the NYC Health Department.
The third column, onset_date, is the mpox symptom onset date.
The fourth column, onset_report_date, is the date the symptom onset date was received by the NYC Health Department.
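The four columns form two event/report pairs, each of which gives a reporting delay in days. The sketch below computes both on a hypothetical data frame `toy_mpox` that only borrows the column names; pairing onset_date with onset_report_date is a reading of the column descriptions, not something the package documentation states.

```r
# Hypothetical rows using the same column names as mpoxdat
toy_mpox <- data.frame(
  dx_date           = as.Date(c("2022-08-01", "2022-08-02")),
  dx_report_date    = as.Date(c("2022-08-03", "2022-08-02")),
  onset_date        = as.Date(c("2022-07-28", "2022-07-30")),
  onset_report_date = as.Date(c("2022-08-04", "2022-08-05"))
)

# Delay in days from diagnosis (specimen collection) to report
dx_delay <- as.integer(toy_mpox$dx_report_date - toy_mpox$dx_date)

# Delay in days from symptom onset to receipt of the onset date
onset_delay <- as.integer(toy_mpox$onset_report_date - toy_mpox$onset_date)

print(data.frame(dx_delay, onset_delay))
```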
Usage
data(mpoxdat)
Format
A data frame.
Examples
data(mpoxdat)
nowcast <- NobBS(mpoxdat, as.Date("2022-08-31"), units = "1 day",
                 onset_date = "dx_date", report_date = "dx_report_date",
                 moving_window = 14)
nowcast$estimates