Type: | Package |
Title: | Generate Postestimation Quantities for Bayesian MCMC Estimation |
Version: | 0.3.2 |
Date: | 2021-11-10 |
Description: | An implementation of functions to generate and plot postestimation quantities after estimating Bayesian regression models using Markov chain Monte Carlo (MCMC). Functionality includes the estimation of the Precision-Recall curves (see Beger, 2016 <doi:10.2139/ssrn.2765419>), the implementation of the observed values method of calculating predicted probabilities by Hanmer and Kalkan (2013) <doi:10.1111/j.1540-5907.2012.00602.x>, the implementation of the average value method of calculating predicted probabilities (see King, Tomz, and Wittenberg, 2000 <doi:10.2307/2669316>), and the generation and plotting of first differences to summarize typical effects across covariates (see Long 1997, ISBN:9780803973749; King, Tomz, and Wittenberg, 2000 <doi:10.2307/2669316>). This package can be used with MCMC output generated by any Bayesian estimation tool including 'JAGS', 'BUGS', 'MCMCpack', and 'Stan'. |
URL: | https://github.com/ShanaScogin/BayesPostEst |
BugReports: | https://github.com/ShanaScogin/BayesPostEst/issues |
License: | GPL-3 |
Imports: | carData, caTools, coda (≥ 0.13), dplyr (≥ 0.5.0), ggplot2, ggridges, reshape2, rlang, stats, texreg, tidyr (≥ 0.5.1), HDInterval, ROCR, graphics, grDevices, R2jags, runjags, rstanarm, rjags, MCMCpack, R2WinBUGS, brms |
Depends: | R (≥ 3.5.0) |
Encoding: | UTF-8 |
LazyData: | TRUE |
LazyLoad: | TRUE |
Suggests: | datasets, knitr, rmarkdown, rstan (≥ 2.10.1), testthat, covr |
VignetteBuilder: | knitr |
RoxygenNote: | 7.1.2 |
SystemRequirements: | JAGS (http://mcmc-jags.sourceforge.io) |
NeedsCompilation: | no |
Packaged: | 2021-11-10 20:56:50 UTC; shanascogin |
Author: | Johannes Karreth |
Maintainer: | Shana Scogin <shanarscogin@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2021-11-11 08:10:05 UTC |
BayesPostEst Overview
Description
This package currently has nine main functions that can be used to generate and plot postestimation quantities after estimating Bayesian regression models using MCMC. The package combines functions written originally for Johannes Karreth's workshop on Bayesian modeling at the ICPSR Summer program. Currently BayesPostEst focuses mostly on generalized linear regression models for binary outcomes (logistic and probit regression). The vignette for this package has a walk-through of each function in action. Please refer to that to get an overview of all the functions, or visit the documentation for a specific function of your choice. Johannes Karreth's website (http://www.jkarreth.net) also has resources for getting started with Bayesian analysis, fitting models, and presenting results.
Main Functions
-
mcmcAveProb()
-
mcmcObsProb()
-
mcmcFD()
-
mcmcMargEff()
-
mcmcRocPrc()
-
mcmcRocPrcGen()
-
mcmcTab()
-
mcmcReg()
-
plot.mcmcFD()
Deprecated functions in package BayesPostEst.
Description
The functions listed below are deprecated and will be defunct in
the near future. When possible, alternative functions with similar
functionality are also mentioned. Help pages for deprecated functions are
available at help("-deprecated")
.
R function to plot first differences generated from MCMC output.
For more on this method, see the documentation for mcmcFD()
, Long (1997,
Sage Publications), and King, Tomz, and Wittenberg (2000, American Journal
of Political Science 44(2): 347-361). For a description of this type of plot,
see Figure 1 in Karreth (2018, International Interactions 44(3): 463-90).
Usage
mcmcFDplot(fdfull, ROPE = NULL)
Arguments
fdfull |
Output generated from |
ROPE |
defaults to NULL. If not NULL, a numeric vector of length two, defining the Region of Practical Equivalence around 0. See Kruschke (2013, Journal of Experimental Psychology 143(2): 573-603) for more on the ROPE. |
Value
a density plot of the differences in probabilities. The plot is made with ggplot2 and can be
passed on as an object to customize. Annotated numbers show the percent of posterior draws with
the same sign as the median estimate (if ROPE = NULL
) or on the same side of the
ROPE as the median estimate (if ROPE
is specified).
mcmcFDplot
For mcmcFDplot
, use plot.mcmcFD
.
References
Karreth, Johannes. 2018. “The Economic Leverage of International Organizations in Interstate Disputes.” International Interactions 44 (3): 463-90. https://doi.org/10.1080/03050629.2018.1389728.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316.
Kruschke, John K. 2013. “Bayesian Estimation Supersedes the T-Test.” Journal of Experimental Psychology: General 142 (2): 573–603. https://doi.org/10.1037/a0029146.
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications.
See Also
Examples
if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
## preparing data for mcmcFD()
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
## plotting with mcmcFDplot()
full <- mcmcFD(modelmatrix = xmat,
mcmcout = mcmc_mat,
fullsims = TRUE)
# suppress deprecated warning for R check
suppressWarnings(mcmcFDplot(full))
}
Compute ROC and PR curve points
Description
Faster replacements for calculating ROC and PR curve data than with
ROCR::prediction()
and ROCR::performance()
Usage
compute_roc(yvec, pvec)
compute_pr(yvec, pvec)
Details
Replacements to use instead of a combination of ROCR::prediction()
and ROCR::performance()
to calculate ROC and PR curves. These functions are
about 10 to 20 times faster when using mcmcRocPrc()
with curves = TRUE
and/or fullsims = TRUE
.
See this issue on GH (ShanaScogin/BayesPostEst#25) for more general details.
And here is a note with specific performance benchmarks, compared to the old approach relying on ROCR.
Try to identify link function
Description
Try to identify link function
Usage
identify_link_function(obj)
Arguments
obj |
stanfit object |
Value
Either "logit" or "probit"; if neither can be identified the function will return 'NA_character_'.
Try to identify if a stanfit model is a binary choice model
Description
Try to identify if a stanfit model is a binary choice model
Usage
is_binary_model(obj)
Arguments
obj |
stanfit object |
Fitted JAGS interactive linear model
Description
A fitted JAGS linear model with interaction term generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.
Usage
jags_interactive
Format
A class "rjags" object created by [R2jags::jags()]
Examples
if (interactive()) {
data("sim_data_interactive")
## formatting the data for jags
datjags <- as.list(sim_data_interactive)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dnorm(mu[i], sigma) ## Bernoulli distribution of y_i
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i] +
b[4] * X1[i] * X2[i]
}
for(j in 1:4){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
sigma ~ dexp(1)
}
params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
jags_interactive <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2,
n.iter = 2000, n.burnin = 1000,
model.file = model)
}
Fitted JAGS interactive linear model with categorical moderator
Description
A fitted JAGS linear model with interaction term generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.
Usage
jags_interactive_cat
Format
A class "rjags" object created by [R2jags::jags()]
Examples
if (interactive()) {
data("sim_data_interactive_cat")
## formatting the data for jags
datjags <- as.list(sim_data_interactive_cat)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dnorm(mu[i], sigma) ## Bernoulli distribution of y_i
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X3[i] +
b[4] * X1[i] * X3[i]
}
for(j in 1:4){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
sigma ~ dexp(1)
}
params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
jags_interactive_cat <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2,
n.iter = 2000, n.burnin = 1000,
model.file = model)
}
Fitted JAGS logit model
Description
A fitted JAGS logit model generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.
Usage
jags_logit
Format
A class "rjags" object created by [R2jags::jags()]
Examples
if (interactive()) {
data("sim_data")
## formatting the data for jags
datjags <- as.list(sim_data)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
jags_logit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2,
n.iter = 2000, n.burnin = 1000, model.file = model)
}
Fitted JAGS probit model
Description
A fitted JAGS probit model generated with [R2jags::jags()]. See the example code below for how it was created. Used in examples and for testing.
Usage
jags_probit
Format
A class "rjags" object created by [R2jags::jags()]
Examples
if (interactive()) {
data("sim_data")
## formatting the data for jags
datjags <- as.list(sim_data)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
probit(p[i]) <- mu[i] ## Update with probit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
jags_probit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2,
n.iter = 2000, n.burnin = 1000, model.file = model)
}
Predicted Probabilities using Bayesian MCMC estimates for the "Average" Case
Description
This function calculates predicted probabilities for "average" cases after a Bayesian logit or probit model. As "average" cases, this function calculates the median value of each predictor. For an explanation of predicted probabilities for "average" cases, see e.g. King, Tomz & Wittenberg (2000, American Journal of Political Science 44(2): 347-361).
Usage
mcmcAveProb(
modelmatrix,
mcmcout,
xcol,
xrange,
xinterest,
link = "logit",
ci = c(0.025, 0.975),
fullsims = FALSE
)
Arguments
modelmatrix |
model matrix, including intercept (if the intercept is among the
parameters estimated in the model). Create with model.matrix(formula, data).
Note: the order of columns in the model matrix must correspond to the order of columns
in the matrix of posterior draws in the |
mcmcout |
posterior distributions of all logit coefficients,
in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed
into a matrix using the function as.mcmc() from the coda package for |
xcol |
column number of the posterior draws ( |
xrange |
name of the vector with the range of relevant values of the explanatory variable for which to calculate associated Pr(y = 1). |
xinterest |
semi-optional argument. Name of the explanatory variable for which
to calculate associated Pr(y = 1). If |
link |
type of generalized linear model; a character vector set to |
ci |
the bounds of the credible interval. Default is |
fullsims |
logical indicator of whether full object (based on all MCMC draws
rather than their average) will be returned. Default is |
Details
This function calculates predicted probabilities for "average" cases after a Bayesian logit or probit model. For an explanation of predicted probabilities for "average" cases, see e.g. King, Tomz & Wittenberg (2000, American Journal of Political Science 44(2): 347-361)
Value
if fullsims = FALSE
(default), a tibble with 4 columns:
x: value of variable of interest, drawn from
xrange
median_pp: median predicted Pr(y = 1) when variable of interest is set to x, holding all other predictors to average (median) values
lower_pp: lower bound of credible interval of predicted probability at given x
upper_pp: upper bound of credible interval of predicted probability at given x
if fullsims = TRUE
, a tibble with 3 columns:
Iteration: number of the posterior draw
x: value of variable of interest, drawn from
xrange
pp: average predicted Pr(y = 1) when variable of interest is set to x, holding all other predictors to average (median) values
References
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316
Examples
if (interactive()) {
## simulating data
set.seed(123)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
### average value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
to = max(datjags$X1),
length.out = 10)
ave_prob <- mcmcAveProb(modelmatrix = xmat,
mcmcout = mcmc_mat,
xrange = X1_sim,
xcol = 2)
}
Coefficient Plots for MCMC Output
Description
Coefficient plots for MCMC output using ggplot2
Usage
mcmcCoefPlot(
mod,
pars = NULL,
pointest = "mean",
ci = 0.95,
hpdi = FALSE,
sort = FALSE,
plot = TRUE,
regex = FALSE
)
Arguments
mod |
Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, rstanarm, and brms. |
pars |
a scalar or vector of the parameters you wish to include in the table.
By default, |
pointest |
a character indicating whether to use the mean or median for point estimates in the table. |
ci |
a scalar indicating the confidence level of the uncertainty intervals. |
hpdi |
a logical indicating whether to use highest posterior density intervals
or equal tailed credible intervals to capture uncertainty; default |
sort |
logical indicating whether to sort the point estimates to produce
a caterpillar or dot plot; default |
plot |
logical indicating whether to return a |
regex |
use regular expression matching with |
Value
a ggplot
object or a tidy DataFrame.
Author(s)
Rob Williams, jayrobwilliams@gmail.com
Examples
if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
## generating coefficient plot with all non-auxiliary parameters
mcmcCoefPlot(fit)
}
First Differences of a Bayesian Logit or Probit model
Description
R function to calculate first differences after a Bayesian logit or probit model. First differences are a method to summarize effects across covariates. This quantity represents the difference in predicted probabilities for each covariate for cases with low and high values of the respective covariate. For each of these differences, all other variables are held constant at their median. For more, see Long (1997, Sage Publications) and King, Tomz, and Wittenberg (2000, American Journal of Political Science 44(2): 347-361).
Usage
mcmcFD(
modelmatrix,
mcmcout,
link = "logit",
ci = c(0.025, 0.975),
percentiles = c(0.25, 0.75),
fullsims = FALSE
)
Arguments
modelmatrix |
model matrix, including intercept (if the intercept is among the
parameters estimated in the model). Create with model.matrix(formula, data).
Note: the order of columns in the model matrix must correspond to the order of columns
in the matrix of posterior draws in the |
mcmcout |
posterior distributions of all logit coefficients,
in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed
into a matrix using the function as.mcmc() from the coda package for |
link |
type of generalized linear model; a character vector set to |
ci |
the bounds of the credible interval. Default is |
percentiles |
values of each predictor for which the difference in Pr(y = 1)
is to be calculated. Default is |
fullsims |
logical indicator of whether full object (based on all MCMC draws
rather than their average) will be returned. Default is |
Value
An object of class mcmcFD
. If fullsims = FALSE
(default),
a data frame with five columns:
median_fd: median first difference
lower_fd: lower bound of credible interval of the first difference
upper_fd: upper bound of credible interval of the first difference
VarName: name of the variable as found in
modelmatrix
VarID: identifier of the variable, based on the order of columns in
modelmatrix
andmcmcout
. Can be adjusted for plotting
If fullsims = TRUE
, a matrix with as many columns as predictors in the model. Each row
is the first difference for that variable based on one set of posterior draws. Column names are taken
from the column names of modelmatrix
.
References
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications
Examples
if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
## running function with logit
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
object <- mcmcFD(modelmatrix = xmat,
mcmcout = mcmc_mat)
object
}
Marginal Effects Plots for MCMC Output
Description
Marginal effects plots for MCMC output using ggplot2
Usage
mcmcMargEff(
mod,
main,
int,
moderator,
pointest = "mean",
seq = 100,
ci = 0.95,
hpdi = FALSE,
plot = TRUE,
xlab = "Moderator",
ylab = "Marginal Effect"
)
Arguments
mod |
Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, rstanarm, and brms. |
main |
a character with the name of the parameter of interest in the interaction term. |
int |
a character with the name of the moderating parameter in the interaction term. |
moderator |
a vector of values that the moderating parameter takes on in the data. |
pointest |
a character indicating whether to use the mean or median for point estimates in the plot. |
seq |
a numeric giving the number of moderator values used to generate the marginal effects plot. |
ci |
a scalar indicating the confidence level of the uncertainty intervals. |
hpdi |
a logical indicating whether to use highest posterior density intervals or equal tailed credible intervals to capture uncertainty. |
plot |
logical indicating whether to return a |
xlab |
character giving x axis label if |
ylab |
character giving y axis label if |
Value
a ggplot
object or a tidy DataFrame.
Author(s)
Rob Williams, jayrobwilliams@gmail.com
Examples
if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
## linear model data
Y_linear <- rnorm(n, Z, 1)
df <- data.frame(cbind(X1, X2, Y = Y_linear))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dnorm(mu[i], sigma) ## Bernoulli distribution of y_i
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i] +
b[4] * X1[i] * X2[i]
}
for(j in 1:4){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
sigma ~ dexp(1)
}
params <- c("b")
inits1 <- list("b" = rep(0, 4))
inits2 <- list("b" = rep(0, 4))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
mcmcMargEff(mod = fit,
main = 'b[2]',
int = 'b[4]',
moderator = sim_data_interactive$X2,
plot = TRUE)
}
Predicted Probabilities using Bayesian MCMC estimates for the Average of Observed Cases
Description
Implements R function to calculate the predicted probabilities for "observed" cases after a Bayesian logit or probit model, following Hanmer & Kalkan (2013) (2013, American Journal of Political Science 57(1): 263-277).
Usage
mcmcObsProb(
modelmatrix,
mcmcout,
xcol,
xrange,
xinterest,
link = "logit",
ci = c(0.025, 0.975),
fullsims = FALSE
)
Arguments
modelmatrix |
model matrix, including intercept (if the intercept is among the
parameters estimated in the model). Create with model.matrix(formula, data).
Note: the order of columns in the model matrix must correspond to the order of columns
in the matrix of posterior draws in the |
mcmcout |
posterior distributions of all logit coefficients,
in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed
into a matrix using the function as.mcmc() from the coda package for |
xcol |
column number of the posterior draws ( |
xrange |
name of the vector with the range of relevant values of the explanatory variable for which to calculate associated Pr(y = 1). |
xinterest |
semi-optional argument. Name of the explanatory variable for which
to calculate associated Pr(y = 1). If |
link |
type of generalized linear model; a character vector set to |
ci |
the bounds of the credible interval. Default is |
fullsims |
logical indicator of whether full object (based on all MCMC draws
rather than their average) will be returned. Default is |
Details
This function calculates predicted probabilities for "observed" cases after a Bayesian logit or probit model following Hanmer and Kalkan (2013, American Journal of Political Science 57(1): 263-277)
Value
if fullsims = FALSE
(default), a tibble with 4 columns:
x: value of variable of interest, drawn from
xrange
median_pp: median predicted Pr(y = 1) when variable of interest is set to x
lower_pp: lower bound of credible interval of predicted probability at given x
upper_pp: upper bound of credible interval of predicted probability at given x
if fullsims = TRUE
, a tibble with 3 columns:
Iteration: number of the posterior draw
x: value of variable of interest, drawn from
xrange
pp: average predicted Pr(y = 1) of all observed cases when variable of interest is set to x
References
Hanmer, Michael J., & Ozan Kalkan, K. (2013). Behind the curve: Clarifying the best approach to calculating predicted probabilities and marginal effects from limited dependent variable models. American Journal of Political Science, 57(1), 263-277. https://doi.org/10.1111/j.1540-5907.2012.00602.x
Examples
if (interactive()) {
## simulating data
set.seed(12345)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
library(R2jags)
set.seed(123)
fit <- jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
### observed value approach
library(coda)
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
X1_sim <- seq(from = min(datjags$X1),
to = max(datjags$X1),
length.out = 10)
obs_prob <- mcmcObsProb(modelmatrix = xmat,
mcmcout = mcmc_mat,
xrange = X1_sim,
xcol = 2)
}
LaTeX or HTML regression tables for MCMC Output
Description
This function creates LaTeX or HTML regression tables for MCMC Output using
the texreg
function from the texreg
R package.
Usage
mcmcReg(
mod,
pars = NULL,
pointest = "mean",
ci = 0.95,
hpdi = FALSE,
sd = FALSE,
pr = FALSE,
coefnames = NULL,
gof = numeric(0),
gofnames = character(0),
format = "latex",
file,
regex = FALSE,
...
)
Arguments
mod |
Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, rstanarm, and brms, or a list of model objects of the same class. |
pars |
a scalar or vector of the parameters you wish to include in the table.
By default, |
pointest |
a character indicating whether to use the mean or median for point estimates in the table. |
ci |
a scalar indicating the confidence level of the uncertainty intervals. |
hpdi |
a logical indicating whether to use highest posterior density
intervals instead of equal tailed credible intervals to capture uncertainty
(default |
sd |
a logical indicating whether to report the standard deviation of
posterior distributions instead of an uncertainty interval
(default |
pr |
a logical indicating whether to report the probability that a
coefficient is in the same direction as the point estimate for that
coefficient (default |
coefnames |
an optional vector or list of vectors containing parameter
names for each model. If there are multiple models, the list must have the same
number of elements as there are models, and the vector of names in each list
element must match the number of parameters. If not supplied, the function
will use the parameter names in the model object(s). Note that this replaces
the standard |
gof |
a named list of goodness of fit statistics, or a list of such lists. |
gofnames |
an optional vector or list of vectors containing
goodness of fit statistic names for each model. Like |
format |
a character indicating |
file |
optional file name to write table to file instead of printing to console. |
regex |
use regular expression matching with |
... |
optional arguments to |
Details
If using custom.coef.map
with more than one model, you should rename
the parameters in the model objects to ensure that different parameters with the
same subscript are not conflated by texreg
e.g. beta[1]
could represent age
in one model and income in another, and texreg
would combine the two if you
do not rename beta[1]
to more informative names in the model objects.
If mod
is a brmsfit
object or list of brmsfit
objects, note that the
default brms
names for coefficients are b_Intercept
and b
, so both of
these should be included in par
if you wish to include the intercept in the
table.
Value
A formatted regression table in LaTeX or HTML format.
Author(s)
Rob Williams, jayrobwilliams@gmail.com
Examples
if (interactive()) {
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params,
n.chains = 2,
n.iter = 2000, n.burnin = 1000,
model.file = model)
## generating regression table with all parameters
mcmcReg(fit)
## generating regression table with only betas and custom coefficent names
mcmcReg(fit, pars = c('b'), coefnames = c('Variable 1',
'Variable 2',
'Variable 3'),
regex = TRUE)
## generating regression tables with all betas and custom names
mcmcReg(fit, coefnames = c('Variable 1', 'Variable 2',
'Variable 3', 'deviance'))
}
ROC and Precision-Recall Curves using Bayesian MCMC estimates generalized
Description
This function generates ROC and Precision-Recall curves
after fitting a Bayesian logit or probit regression. For fast calculation for
from an "rjags" object use mcmcRocPrc
Usage
mcmcRocPrcGen(
modelmatrix,
mcmcout,
modelframe,
curves = FALSE,
link = "logit",
fullsims = FALSE
)
Arguments
modelmatrix |
model matrix, including intercept (if the intercept is among the
parameters estimated in the model). Create with model.matrix(formula, data).
Note: the order of columns in the model matrix must correspond to the order of columns
in the matrix of posterior draws in the |
mcmcout |
posterior distributions of all logit coefficients,
in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed
into a matrix using the function as.mcmc() from the coda package for |
modelframe |
model frame in matrix form. Can be created using as.matrix(model.frame(formula, data)) |
curves |
logical indicator of whether or not to return values to plot the ROC or Precision-Recall
curves. If set to |
link |
type of generalized linear model; a character vector set to |
fullsims |
logical indicator of whether full object (based on all MCMC draws
rather than their average) will be returned. Default is |
Details
This function generates ROC and precision-recall curves after fitting a Bayesian logit or probit model.
Value
This function returns a list with 4 elements:
area_under_roc: area under ROC curve (scalar)
area_under_prc: area under precision-recall curve (scalar)
prc_dat: data to plot precision-recall curve (data frame)
roc_dat: data to plot ROC curve (data frame)
References
Beger, Andreas. 2016. “Precision-Recall Curves.” Available at SSRN: https://ssrn.com/Abstract=2765419. http://dx.doi.org/10.2139/ssrn.2765419.
Examples
if (interactive()) {
# simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
# formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
# creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
# processing the data
mm <- model.matrix(Y ~ X1 + X2, data = df)
xframe <- as.matrix(model.frame(Y ~ X1 + X2, data = df))
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xframe)]
# using mcmcRocPrcGen
fit_sum <- mcmcRocPrcGen(modelmatrix = mm,
modelframe = xframe,
mcmcout = mcmc_mat,
curves = TRUE,
fullsims = FALSE)
}
Summarize Bayesian MCMC Output R function for summarizing MCMC output in a regression-style table.
Description
Summarize Bayesian MCMC Output
R function for summarizing MCMC output in a regression-style table.
Usage
mcmcTab(
sims,
ci = c(0.025, 0.975),
pars = NULL,
Pr = FALSE,
ROPE = NULL,
regex = FALSE
)
Arguments
sims |
Bayesian model object generated by R2jags, rjags, R2WinBUGS, R2OpenBUGS, MCMCpack, rstan, and rstanarm. |
ci |
desired level for credible intervals; defaults to c(0.025, 0.975). |
pars |
character vector of parameters to be printed; defaults to |
Pr |
print percent of posterior draws with same sign as median; defaults to |
ROPE |
defaults to |
regex |
use regular expression matching with |
Value
a data frame containing MCMC summary statistics.
References
Kruschke, John K. 2013. “Bayesian Estimation Supersedes the T-Test.” Journal of Experimental Psychology: General 142 (2): 573–603. https://doi.org/10.1037/a0029146.
Examples
if (interactive()) {
data("jags_logit")
## printing out table
object <- mcmcTab(jags_logit,
ci = c(0.025, 0.975),
pars = NULL,
Pr = FALSE,
ROPE = NULL)
object
}
Constructor for mcmcRocPrc objects
Description
This function actually does the heavy lifting once we have a matrix of predicted probabilities from a model, plus the vector of observed outcomes. The reason to have it here in a single function is that we don't replicate it in each function that accomodates a JAGS, BUGS, RStan, etc. object.
Usage
new_mcmcRocPrc(pred_prob, yvec, curves, fullsims)
Arguments
pred_prob |
a |
yvec |
a |
curves |
include curve data in output? |
fullsims |
collapse posterior samples into single summary? |
Plot Method for First Differences from MCMC output
Description
The plot
method for first differences generated from MCMC
output by mcmcFD
. For more on this method, see Long
(1997, Sage Publications), and King, Tomz, and Wittenberg (2000, American
Journal of Political Science 44(2): 347-361). For a description of this type
of plot, see Figure 1 in Karreth (2018, International Interactions 44(3): 463-90).
Usage
## S3 method for class 'mcmcFD'
plot(x, ROPE = NULL, ...)
Arguments
x |
Output generated from |
ROPE |
defaults to NULL. If not NULL, a numeric vector of length two, defining the Region of Practical Equivalence around 0. See Kruschke (2013, Journal of Experimental Psychology 143(2): 573-603) for more on the ROPE. |
... |
Value
a density plot of the differences in probabilities. The plot is made with ggplot2 and can be
passed on as an object to customize. Annotated numbers show the percent of posterior draws with
the same sign as the median estimate (if ROPE = NULL
) or on the same side of the
ROPE as the median estimate (if ROPE
is specified).
References
Karreth, Johannes. 2018. “The Economic Leverage of International Organizations in Interstate Disputes.” International Interactions 44 (3): 463-90. https://doi.org/10.1080/03050629.2018.1389728.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 347–61. http://www.jstor.org/stable/2669316.
Kruschke, John K. 2013. “Bayesian Estimation Supersedes the T-Test.” Journal of Experimental Psychology: General 142 (2): 573–603. https://doi.org/10.1037/a0029146.
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications.
See Also
Examples
if (interactive()) {
## simulating data
set.seed(1234)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
df <- data.frame(cbind(X1, X2, Y))
## formatting the data for jags
datjags <- as.list(df)
datjags$N <- length(datjags$Y)
## creating jags model
model <- function() {
for(i in 1:N){
Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
logit(p[i]) <- mu[i] ## Logit link function
mu[i] <- b[1] +
b[2] * X1[i] +
b[3] * X2[i]
}
for(j in 1:3){
b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
}
}
params <- c("b")
inits1 <- list("b" = rep(0, 3))
inits2 <- list("b" = rep(0, 3))
inits <- list(inits1, inits2)
## fitting the model with R2jags
set.seed(123)
fit <- R2jags::jags(data = datjags, inits = inits,
parameters.to.save = params, n.chains = 2, n.iter = 2000,
n.burnin = 1000, model.file = model)
## preparing data for mcmcFD()
xmat <- model.matrix(Y ~ X1 + X2, data = df)
mcmc <- coda::as.mcmc(fit)
mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xmat)]
## plotting with mcmcFDplot()
full <- mcmcFD(modelmatrix = xmat,
mcmcout = mcmc_mat,
fullsims = TRUE)
plot(full)
}
ROC and Precision-Recall Curves using Bayesian MCMC estimates
Description
Generate ROC and Precision-Recall curves after fitting a Bayesian logit or
probit regression using rstan::stan()
, rstanarm::stan_glm()
,
R2jags::jags()
, R2WinBUGS::bugs()
, MCMCpack::MCMClogit()
, or other
functions that provide samples from a posterior density.
Usage
## S3 method for class 'mcmcRocPrc'
print(x, ...)
## S3 method for class 'mcmcRocPrc'
plot(x, n = 40, alpha = 0.5, ...)
## S3 method for class 'mcmcRocPrc'
as.data.frame(
x,
row.names = NULL,
optional = FALSE,
what = c("auc", "roc", "prc"),
...
)
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)
## Default S3 method:
mcmcRocPrc(object, curves, fullsims, yvec, ...)
## S3 method for class 'jags'
mcmcRocPrc(
object,
curves = FALSE,
fullsims = FALSE,
yname,
xnames,
posterior_samples,
...
)
## S3 method for class 'rjags'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, yname, xnames, ...)
## S3 method for class 'runjags'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, yname, xnames, ...)
## S3 method for class 'stanfit'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, data, xnames, yname, ...)
## S3 method for class 'stanreg'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)
## S3 method for class 'brmsfit'
mcmcRocPrc(object, curves = FALSE, fullsims = FALSE, ...)
## S3 method for class 'bugs'
mcmcRocPrc(
object,
curves = FALSE,
fullsims = FALSE,
data,
xnames,
yname,
type = c("logit", "probit"),
...
)
## S3 method for class 'mcmc'
mcmcRocPrc(
object,
curves = FALSE,
fullsims = FALSE,
data,
xnames,
yname,
type = c("logit", "probit"),
force = FALSE,
...
)
Arguments
x |
a |
... |
Used by methods |
n |
plot method: if 'fullsims = TRUE', how many sample curves to draw? |
alpha |
plot method: alpha value for plotting sampled curves; between 0 and 1 |
row.names |
see [base::as.data.frame()] |
optional |
see [base::as.data.frame()] |
what |
which information to extract and convert to a data frame? |
object |
A fitted binary choice model, e.g. "rjags" object
(see |
curves |
logical indicator of whether or not to return values to plot
the ROC or Precision-Recall curves. If set to |
fullsims |
logical indicator of whether full object (based on all MCMC
draws rather than their average) will be returned. Default is |
yvec |
A |
yname |
( |
xnames |
( |
posterior_samples |
a "mcmc" object with the posterior samples |
data |
the data that was used in the 'stan(data = ?, ...)' call |
type |
"logit" or "probit" |
force |
for MCMCpack models, suppress warning if the model does not appear to be a binary choice model? |
Details
If only the average AUC-ROC and PR are of interest, setting
curves = FALSE
and fullsims = FALSE
can greatly speed up calculation
time. The curve data (curves = TRUE
) is needed for plotting. The plot
method will always plot both the ROC and PR curves, but the underlying
data can easily be extracted from the output for your own plotting;
see the documentation of the value returned below.
The default method works with a matrix of predicted probabilities and the vector of observed incomes as input. Other methods accommodate some of the common Bayesian modeling packages like rstan (which returns class "stanfit"), rstanarm ("stanreg"), R2jags ("jags"), R2WinBUGS ("bugs"), and MCMCpack ("mcmc"). Even if a package-specific method is not implemented, the default method can always be used as a fallback by manually calculating the matrix of predicted probabilities for each posterior sample.
Note that MCMCpack returns generic "mcmc" output that is annotated with
some additional information as attributes, including the original function
call. There is no inherent way to distinguish any other kind of "mcmc"
object from one generated by a proper MCMCpack modeling function, but as a
basic precaution, mcmcRocPrc()
will check the saved call and return an
error if the function called was not MCMClogit()
or MCMCprobit()
.
This behavior can be suppressed by setting force = TRUE
.
Value
Returns a list with length 2 or 4, depending on the on the "curves" and "fullsims" argument values:
"area_under_roc":
numeric()
; either length 1 iffullsims = FALSE
, or one value for each posterior sample otherwise"area_under_prc":
numeric()
; either length 1 iffullsims = FALSE
, or one value for each posterior sample otherwise"prc_dat": only if
curves = TRUE
; a list with length 1 iffullsims = FALSE
, longer otherwise"roc_dat": only if
curves = TRUE
; a list with length 1 iffullsims = FALSE
, longer otherwise
References
Beger, Andreas. 2016. “Precision-Recall Curves.” Available at doi: 10.2139/ssrn.2765419
Examples
if (interactive()) {
# load simulated data and fitted model (see ?sim_data and ?jags_logit)
data("jags_logit")
# using mcmcRocPrc
fit_sum <- mcmcRocPrc(jags_logit,
yname = "Y",
xnames = c("X1", "X2"),
curves = TRUE,
fullsims = FALSE)
fit_sum
plot(fit_sum)
# Equivalently, we can calculate the matrix of predicted probabilities
# ourselves; using the example from ?jags_logit:
library(R2jags)
data("sim_data")
yvec <- sim_data$Y
xmat <- sim_data[, c("X1", "X2")]
# add intercept to the X data
xmat <- as.matrix(cbind(Intercept = 1L, xmat))
beta <- as.matrix(as.mcmc(jags_logit))[, c("b[1]", "b[2]", "b[3]")]
pred_mat <- plogis(xmat %*% t(beta))
# the matrix of predictions has rows matching the number of rows in the data;
# the column are the predictions for each of the 2,000 posterior samples
nrow(sim_data)
dim(pred_mat)
# now we can call mcmcRocPrc; the default method works with the matrix
# of predictions and vector of outcomes as input
mcmcRocPrc(object = pred_mat, curves = TRUE, fullsims = FALSE, yvec = yvec)
}
Simulated data for examples
Description
Simulated data to fit example models against
Usage
sim_data
Format
a data.frame
Examples
## simulating data
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z <- b0 + b1 * X1 + b2 * X2
pr <- 1 / (1 + exp(-Z)) # inv logit function
Y <- rbinom(n, 1, pr)
sim_data <- data.frame(cbind(X1, X2, Y))
Simulated data for examples
Description
Simulated data to fit example models against
Usage
sim_data_interactive
Format
a data.frame
Examples
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
b3 <- -0.3 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X2 <- runif(n, -1, 1)
Z_interactive <- b0 + b1 * X1 + b2 * X2 + b3 * (X1 * X2)
Y_interactive <- rnorm(n, Z_interactive, 1)
sim_data_interactive <- data.frame(cbind(X1, X2, Y = Y_interactive))
Simulated data for examples
Description
Simulated data to fit example models against
Usage
sim_data_interactive_cat
Format
a data.frame
Examples
set.seed(123456)
b0 <- 0.2 # true value for the intercept
b1 <- 0.5 # true value for first beta
b2 <- 0.7 # true value for second beta
b3 <- -0.3 # true value for second beta
n <- 500 # sample size
X1 <- runif(n, -1, 1)
X3 <- rbinom(n, 5, .23)
Z_interactive_cat <- b0 + b1 * X1 + b2 * X3 + b3 * (X1 * X3)
Y_interactive_cat <- rnorm(n, Z_interactive_cat, 1)
sim_data_interactive_cat <- data.frame(cbind(X1, X3, Y = Y_interactive_cat))