Title: | Bayesian Marginal Effects for 'brms' Models |
Version: | 0.2.0 |
URL: | https://joshuawiley.com/brmsmargins/, https://github.com/JWiley/brmsmargins |
BugReports: | https://github.com/JWiley/brmsmargins/issues |
Description: | Calculate Bayesian marginal effects, average marginal effects, and marginal coefficients (also called population averaged coefficients) for models fit using the 'brms' package including fixed effects, mixed effects, and location scale models. These are based on marginal predictions that integrate out random effects if necessary (see for example <doi:10.1186/s12874-015-0046-6> and <doi:10.1111/biom.12707>). |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.0 |
Depends: | R (≥ 4.0.0) |
Imports: | methods, stats, data.table (≥ 1.12.0), extraoperators (≥ 0.1.1), brms, bayestestR, Rcpp, posterior |
Suggests: | testthat (≥ 3.0.0), covr, withr, knitr, rmarkdown, margins, betareg |
Config/testthat/edition: | 3 |
Config/testthat/parallel: | true |
LinkingTo: | RcppArmadillo, Rcpp |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2022-05-18 19:46:16 UTC; jwile |
Author: | Joshua F. Wiley |
Maintainer: | Joshua F. Wiley <jwiley.psych@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2022-05-18 22:10:03 UTC |
Average Over Posterior Predictions
Description
Internal function that averages over posterior predictions
using either rowMeans()
or rowBootMeans()
, the latter
being useful to incorporate uncertainty from the
inputs being used to generate predictions.
Usage
.averagePosterior(posterior, resample = 0L, seed = FALSE)
Arguments
posterior |
A posterior matrix type object. It is assumed that different predictions to be averaged over are on different columns. Different posterior draws are on different rows. |
resample |
An integer indicating the number of
bootstrap resamples of the posterior predictions to
use when calculating summaries. Defaults to |
seed |
A seed for random number generation. Defaults to |
Value
A vector of the averaged posterior.
Check Object Class is a Table
Description
Internal utility function confirm that an object
has the attributes needed to be used as data.
Currently it should be a tbl
,
data.frame
, or data.table
.
Usage
.checktab(x, requireNames = TRUE)
Arguments
x |
An object to be evaluated. |
requireNames |
A logical, whether names are
required. Defaults to |
Value
An empty string if no issues. Otherwise, a non zero string with warning/error messages.
Extract the Link from a brms
Model
Description
Internal utility function to take a brmsfit
object
and extract the link for a specific dpar
.
Usage
.extractlink(object, dpar)
Arguments
object |
A |
dpar |
The dpar for which the link should be extracted. |
Value
A character string, the link.
Convert a Link Function Name to a List
Description
Internal utility function used in prediction()
.
Takes a link function name as a character string,
the type of effect to be used, and the desired back transformation
and returns a list with all the options needed to execute the desired
options in prediction()
.
Usage
.links(
link,
effects = c("fixedonly", "includeRE", "integrateoutRE"),
backtrans = c("response", "linear", "identity", "invlogit", "exp", "square",
"inverse")
)
Arguments
link |
The link named in a |
effects |
A character string, the type of effect desired |
backtrans |
A character string, the type of back transformation |
Value
A list with eight elements.
- scale
A character string giving the argument to be passed to
fitted()
.- ilink
A character string giving the name of the inverse link function.
- ifun
Inverse link function as an
R
function.- ilinknum
An integer giving the inverse link / transformation to be applied in
integratere()
, needed as this is a C++ function and cannot use theR
based inverse link function.
Calculate Percent of Observations Within or Without a Window
Description
This is an internal helper function to calculate and label
the percentage of a posterior distribution that falls within
the Region of Practical Equivalence (ROPE) or
at or beyond a Minimally Important Difference (MID).
It is designed to fail gracefully if no window given, and to
give some useful labels about the windows / range used.
Intended for use internally as part of brmsmargins
.
Usage
.percent(x, window = NULL, within = TRUE)
Arguments
x |
A vector of values to evaluate. Required. |
window |
An optional numeric vector giving a window. |
within |
A logical value indicating whether to calculate the
percentage within the window (if |
Value
A list with the Window
, if specified else NULL
,
the Percent
of observations, and a Label
specifying the
exact window used in human readable format.
Check Assertions about a brmsfit
Model Object
Description
These are a set of internal utility functions.
They are not intended for general use.
Instead, they are intended to be called in circumstances
where the expected result is TRUE
.
All of them are designed to try to give informative error
messages if the assertion is not met.
All of them result in a stop()
error if the assertion is not met.
Usage
.assertbrmsfit(object)
.assertgaussian(object)
.assertfamily(object)
.assertdpar(object, dpar)
.assertlink(object, dpar)
Arguments
object |
A |
dpar |
Required for |
Details
.assertbrmsfit
asserts that the object should be of classbrmsfit
..assertgaussian
asserts that all random effects are Gaussian..assertfamily
asserts that the distribution (family) of the outcome is a currently supported family. Only applies when integrating out random effects..assertlink
asserts that the link function is a currently supported link function. Only applies when integrating out random effects.
Value
An invisible, logical TRUE
if the assertion is met.
An (informative) error message if the assertion is not met.
Calculate Marginal Effects from 'brms' Models
Description
This function is designed to help calculate marginal effects
including average marginal effects (AMEs) from brms
models.
Arguments are labeled as required when it is required that the
user directly specify the argument. Arguments are labeled as
optional when either the argument is optional or there are
sensible default values so that users do not typically need to specify
the argument.
Usage
brmsmargins(
object,
at = NULL,
wat = NULL,
add = NULL,
newdata = model.frame(object),
CI = 0.99,
CIType = "HDI",
contrasts = NULL,
ROPE = NULL,
MID = NULL,
subset = NULL,
dpar = NULL,
seed,
verbose = FALSE,
...
)
Arguments
object |
A required argument specifying a fitted |
at |
An optional argument (but note, either |
wat |
An optional list with named elements including one element named,
“ID” with a single character string, the name of the variable
in the model frame that is the ID variable. Additionally,
there should be one or more named elements, named after variables
in the model (and specified in the |
add |
An optional argument (but note, either |
newdata |
An optional argument specifying an object inheriting
from data frame indicating the baseline values to use for predictions and AMEs.
It uses a sensible default: the model frame from the |
CI |
An optional argument with a numeric value specifying the width
of the credible interval. Defaults to |
CIType |
An optional argument, a character string specifying the
type of credible interval (e.g., highest density interval). It is passed down to
|
contrasts |
An optional argument specifying a contrast matrix.
The posterior predictions matrix
is post multiplied by the contrast matrix, so they must be conformable.
The posterior predictions matrix has a separate column for each row in the
|
ROPE |
An optional argument, that can either be left as |
MID |
An optional argument, that can either left as |
subset |
An optional argument, a character string that is a
valid |
dpar |
An optional argument giving the parameter passed on to the |
seed |
An optional argument that controls whether (and if so what) random seed
to use. This does not matter when using fixed effects only. However,
when using Monte Carlo integration to integrate out random effects from
mixed effects models, it is critical if you are looking at a continuous
marginal effect with some small offset value as otherwise the
Monte Carlo error from one set of predictions to another may exceed
the true predicted difference.
If |
verbose |
An optional argument, a logical value whether to print
more verbose messages. Defaults to |
... |
An optional argument, additional arguments passed on to
|
Details
The main parts required for the function are a fitted model object,
(via the object
argument) a dataset to be used for prediction,
(via the newdata
argument which defaults to the model frame),
and a dataset passed to either at
or add
.
The steps are as follows:
Check that the function inputs (model object, data, etc.) are valid.
Take the dataset from the
newdata
argument and either add the values from the first row ofadd
or replace the values using the first row ofat
. Only variables specified inat
oradd
are modified. Other variables are left as is.Use the
fitted()
function to generate predictions based on this modified dataset. Ifeffects
is set to “fixedonly” (meaning only generate predictions using fixed effects) or to “includeRE” (meaning generate predictions using fixed and random effects), then predictions are generated entirely using thefitted()
function and are, typically back transformed to the response scale. For mixed effects models with fixed and random effects whereeffects
is set to “integrateoutRE”, thenfitted()
is only used to generate predictions using the fixed effects on the linear scale. For each prediction generated, the random effects are integrated out by drawingk
random samples from the model assumed random effect(s) distribution. These are added to the fixed effects predictions, back transformed, and then averaged over allk
random samples to perform numerical Monte Carlo integration.All the predictions for each posterior draw, after any back transformation has been applied, are averaged, resulting in one, marginal value for each posterior draw. These are marginal predictions. They are average marginal predictions if averaging over the sample dataset, or may be marginal predictions at the means, if the initial input dataset used mean values, etc.
Steps two to four are repeated for each row of
at
oradd
. Results are combined into a matrix where the columns are different rows fromat
oradd
and the rows are different posterior draws.If contrasts were specified, using a contrast matrix, the marginal prediction matrix is post multiplied by the contrast matrix. Depending on the choice(s) of
add
orat
and the values in the contrast matrix, these can then be average marginal effects (AMEs) by using numerical integration (add
with 0 and a very close to 0 value) or discrete difference (at
with say 0 and 1 as values) for a given predictor(s).The marginal predictions and the contrasts, if specified are summarized.
Although brmsmargins()
is focused on helping to calculate
marginal effects, it can also be used to generate marginal predictions,
and indeed these marginal predictions are the foundation of any
marginal effect estimates. Through manipulating the input data,
at
or add
and the contrast matrix, other types of estimates
averaged or weighting results in specific ways are also possible.
Value
A list with four elements.
Posterior
Posterior distribution of all predictions. These predictions default to fixed effects only, but by specifying options toprediction()
they can include random effects or be predictions integrating out random effects.Summary
A summary of the predictions.Contrasts
Posterior distribution of all contrasts, if a contrast matrix was specified.ContrastSummary
A summary of the posterior distribution of all contrasts, if specified
References
Pavlou, M., Ambler, G., Seaman, S., & Omar, R. Z. (2015) doi:10.1186/s12874-015-0046-6 “A note on obtaining correct marginal predictions from a random intercepts model for binary outcomes” and Skrondal, A., & Rabe-Hesketh, S. (2009) doi:10.1111/j.1467-985X.2009.00587.x “Prediction in multilevel generalized linear models” and Norton EC, Dowd BE, Maciejewski ML. (2019) doi:10.1001/jama.2019.1954 “Marginal Effects—Quantifying the Effect of Changes in Risk Factors in Logistic Regression Models”
Examples
## Not run:
#### Testing ####
## sample data and logistic model with brms
set.seed(1234)
Tx <- rep(0:1, each = 50)
ybin <- c(rep(0:1, c(40,10)), rep(0:1, c(10,40)))
logitd <- data.frame(Tx = Tx, ybin = ybin)
logitd$x <- rnorm(100, mean = logitd$ybin, sd = 2)
mbin <- brms::brm(ybin ~ Tx + x, data = logitd, family = brms::bernoulli())
summary(mbin)
## now check AME for Tx
tmp <- brmsmargins(
object = mbin,
at = data.table::data.table(Tx = 0:1),
contrasts = matrix(c(-1, 1), nrow = 2),
ROPE = c(-.05, +.05),
MID = c(-.10, +.10))
tmp$Summary
tmp$ContrastSummary ## Tx AME
## now check AME for Tx with bootstrapping the AME population
tmpalt <- brmsmargins(
object = mbin,
at = data.table::data.table(Tx = 0:1),
contrasts = matrix(c(-1, 1), nrow = 2),
ROPE = c(-.05, +.05),
MID = c(-.10, +.10),
resample = 100L)
tmpalt$Summary
tmpalt$ContrastSummary ## Tx AME
## now check AME for continuous predictor, x
## use .01 as an approximation for first derivative
## 1 / .01 in the contrast matrix to get back to a one unit change metric
tmp2 <- brmsmargins(
object = mbin,
add = data.table::data.table(x = c(0, .01)),
contrasts = matrix(c(-1/.01, 1/.01), nrow = 2),
ROPE = c(-.05, +.05),
MID = c(-.10, +.10))
tmp2$ContrastSummary ## x AME
if (FALSE) {
library(lme4)
data(sleepstudy)
fit <- brms::brm(Reaction ~ 1 + Days + (1 + Days | Subject),
data = sleepstudy,
cores = 4)
summary(fit, prob = 0.99)
tmp <- brmsmargins(
object = fit,
at = data.table::data.table(Days = 0:1),
contrasts = matrix(c(-1, 1), nrow = 2),
ROPE = c(-.05, +.05),
MID = c(-.10, +.10), CIType = "ETI", effects = "integrateoutRE", k = 5L)
tmp$Summary
tmp$ContrastSummary
}
## End(Not run)
Personal Preference Based Bayesian Summary
Description
Returns a summary of a posterior distribution for a single
parameter / value. It is based on personal preference. Notably, it does not
only use bayestestR::describe_posterior
, an excellent function,
because of the desire to also describe the percentage of the full posterior
distribution that is at or exceeding the value of a
Minimally Important Difference (MID). MIDs are used in clinical studies with outcome
measures where there are pre-defined differences that are considered clinically
important, which is distinct from the ROPE or general credible intervals capturing
uncertainty.
Usage
bsummary(x, CI = 0.99, CIType = "HDI", ROPE = NULL, MID = NULL)
Arguments
x |
The posterior distribution of a parameter |
CI |
A numeric value indicating the desired width of the credible interval.
Defaults to |
CIType |
A character string indicating the type of credible interval, passed on
to the |
ROPE |
Either left as |
MID |
Either left as |
Value
A data.table
with the mean, M
- M
the mean of the posterior samples
- Mdn
the median of the posterior samples
- LL
the lower limit of the credible interval
- UL
the upper limit of the credible interval
- PercentROPE
the percentage of posterior samples falling into the ROPE
- PercentMID
the percentage of posterior samples falling at or beyond the MID
- CI
the width of the credible interval used
- CIType
the type of credible interval used (e.g., highest density interval)
- ROPE
a label describing the values included in the ROPE
- MID
a label describing the values included in the MID
References
Kruschke, J. K. (2018). doi:10.1177/2515245918771304 “Rejecting or accepting parameter values in Bayesian estimation”
Examples
bsummary(rnorm(1000))
bsummary(rnorm(1000), ROPE = c(-.5, .5), MID = c(-1, 1))
Build the Variable Names or Data Objects for Estimation
Description
These are a set of internal utility functions. They are not intended for general use.
Usage
.namesL(block, number)
.buildL(data, block, number, dpar)
.namesSD(ranef, block, dpar)
.buildSD(data, ranef, block, dpar)
.namesZ(block, number, dpar)
.buildZ(data, block, number, dpar)
Arguments
block |
Which random effect block to use. An integer. |
number |
The number of elements in that random effect block. An integer. |
data |
A data object. For example the result of |
dpar |
Which dpar to use. Does not apply to the L matrix. |
ranef |
A data set with information about the model object random effects.
Only used for |
Details
.namesL
Generate names of an L matrix frombrms
. Create the variable names for the Cholesky decomposition of the random effects correlation matrix inbrms
. Note thatbrms
returns the lower triangular matrix and we want the upper triangular matrix, so the names are transposed. The results can then be passed to thetab2mat
function to convert the row vector into a matrix..buildL
Returns the L matrix object. Rows are posterior draws..namesSD
Create the names of random effect standard deviation estimates..buildSD
Return matrix of random effect standard deviation estimates. Rows are posterior draws..namesZ
Create the names of random effects data for predictions..buildZ
Return matrix of data for random effect predictions.
Value
A character vector for all .names
functions or a matrix
for all .build
functions.
Integrate over Multivariate Normal Random Effects
Description
Used in the process of Monte Carlo integration over multivariate normal random effects. This generates the random draws from the multivariate normal distribution and multiplies these by the data. Not intended to be called directly by most users.
Usage
integratemvn(X, k, sd, chol)
integratemvnR(X, k, sd, chol)
Arguments
X |
A numeric matrix of the data to be multiplied by the random effects |
k |
An integer, the number of random samples to be used for numerical integration |
sd |
A numeric vector of the standard deviations |
chol |
A numeric matrix, which should be the Cholesky decomposition of the correlation matrix of the multivariate normal distribution. |
Value
A numeric matrix with random values
Functions
-
integratemvnR
: PureR
implementation ofintegratemvn
Examples
integratemvn(
X = matrix(1, 1, 2),
k = 100L,
sd = c(10, 5),
chol = chol(matrix(c(1, .5, .5, 1), 2)))
integratemvn(matrix(1, 1, 1), 100L, c(5), matrix(1))
Integrate over Random Effects
Description
Used to conduct Monte Carlo integration over Gaussian random effects. Not intended to be called directly by most users.
Usage
integratere(d, sd, L, k, yhat, backtrans)
integratereR(d, sd, L, k, yhat, backtrans)
Arguments
d |
A list with model matrices for each random effect block. |
sd |
A list with standard deviation matrices for each random effect block where rows are different posterior draws. |
L |
A list with matrices for each random effect block containing the parts of the L matrix, the Cholesky decomposition of the random effect correlation matrix. |
k |
An integer, the number of samples for Monte Carlo integration. |
yhat |
A matrix of the fixed effects predictions |
backtrans |
An integer, indicating the type of back transformation. 0 indicates inverse logit (e.g., for logistic regression). 1 indicates exponential (e.g., for poisson or negative binomial regression or if outcome was natural log transformed). 2 indicates square (e.g., if outcome was square root transformed). 3 indicates inverse (e.g., if outcome was inverse transformed such as Gamma regression) Any other integer results in no transformation. -9 is recommended as the option for no transformation as any future transformations supported will be other, positive integers. |
Value
A numeric matrix with the Monte Carlo integral calculated.
Functions
-
integratereR
: PureR
implementation ofintegratere
Examples
integratere(
d = list(matrix(1, 1, 1)),
sd = list(matrix(1, 2, 1)),
L = list(matrix(1, 2, 1)),
k = 10L,
yhat = matrix(0, 2, 1),
backtrans = 0L)
Check a brmsfit
Object has Random Effects
Description
Internal utility function to check whether a brmsfit
object has any random effects or not.
Usage
is.random(object)
Arguments
object |
An object to be evaluated. |
Value
TRUE
if any random effects present.
FALSE
if no random effects present.
Fast Linear Regression
Description
Used to get marginal coefficients off of a generalized linear mixed model.
Usage
lmcpp(X, y)
Arguments
X |
A numeric model matrix. If intercept is desired, it must already have been added as a column. |
y |
A numeric matrix. A single column if one response variable or multiple columns where each column is a different response, such as a for marginal coefficients where each column is a different MCMC sample. |
Value
A numeric matrix with the coefficient.
Examples
lmcpp(cbind(1, mtcars$hp, mtcars$am), as.matrix(mtcars[, c("mpg", "qsec")]))
Marginal Coefficients from a 'brms' Model
Description
Calculate marginal coefficients from a brms
generalized linear mixed model using the method proposed by Hedeker (2018).
Usage
marginalcoef(
object,
summarize = TRUE,
posterior = FALSE,
index,
backtrans = c("response", "linear", "identity", "invlogit", "exp", "square",
"inverse"),
k = 100L,
...
)
Arguments
object |
A fitted brms model object that includes random effects. Required. |
summarize |
A logical value, whether or not to
calculate summaries of the posterior predictions.
Defaults to |
posterior |
A logical value whether or not to
save and return the posterior samples. Defaults
to |
index |
An optional integer vector, giving the posterior draws to be used in the calculations. If omitted, defaults to all posterior draws. |
backtrans |
A character string indicating the type of back transformation to be applied. Can be one of “response” meaning to use the response scale, “linear” or “identity” meaning to use the linear predictor scale, or a specific back transformation desired, from a possible list of “invlogit”, “exp”, “square”, or “inverse”. Custom back transformations should only be needed if, for example, the outcome variable was transformed prior to fitting the model. |
k |
An integer providing the number of random draws to use for
integrating out the random effects. Only relevant when |
... |
Additional arguments passed to |
Value
A list with Summary
and Posterior
.
Some of these may be NULL
depending on the arguments used.
References
Hedeker, D., du Toit, S. H., Demirtas, H. & Gibbons, R. D. (2018) doi:10.1111/biom.12707 “A note on marginalization of regression parameters from mixed models of binary outcomes”
Marginal Posterior Predictions from a 'brms' Model
Description
Calculate marginal predictions from a brms
model.
Marginal predictions average over the input data for each posterior draw.
Marginal predictions for models with random effects will integrate
over random effects.
Arguments are labeled as required when it is required that the
user directly specify the argument. Arguments are labeled as
optional when either the argument is optional or there are
sensible default values so that users do not typically need to specify
the argument.
Usage
prediction(
object,
data,
summarize = TRUE,
posterior = FALSE,
index,
dpar = NULL,
resample = 0L,
resampleseed = FALSE,
effects = c("fixedonly", "includeRE", "integrateoutRE"),
backtrans = c("response", "linear", "identity", "invlogit", "exp", "square",
"inverse"),
k = 100L,
raw = FALSE,
...
)
Arguments
object |
A required argument specifying a fitted
|
data |
A required argument specifying a data frame or
data table passed to |
summarize |
An optional argument, a logical value, whether
or not to calculate summaries of the posterior predictions.
Defaults to |
posterior |
An optional argument, a logical value whether
or not to save and return the posterior samples. Defaults
to |
index |
An optional argument, an integer vector, giving the posterior draws to be used in the calculations. If omitted, defaults to all posterior draws. |
dpar |
An optional argument, the parameter passed on to the
|
resample |
An optional argument, an integer indicating the
number of bootstrap resamples of the posterior predictions to
use when calculating summaries. Defaults to |
resampleseed |
An optional argument, a seed for random number
generation. Defaults to |
effects |
An optional argument, a character string indicating the type of prediction to be made. Can be one of “fixedonly” meaning only use fixed effects, “includeRE” meaning that random effects should be included in the predictions, or “integrateoutRE” meaning that random effects should be integrated out / over in the predictions. It defaults to “fixedonly” so is not typically required for a user to specify it. |
backtrans |
An optional argument, a character string indicating the type of back transformation to be applied. Can be one of “response” meaning to use the response scale, “linear” or “identity” meaning to use the linear predictor scale, or a specific back transformation desired, from a possible list of “invlogit”, “exp”, “square”, or “inverse”. Custom back transformations should only be needed if, for example, the outcome variable was transformed prior to fitting the model. It defaults to “response” so is not typically required for a user to specify it. |
k |
An optional argument, an integer providing the number of
random draws to use for integrating out the random effects.
Only relevant when |
raw |
An optional argument, a logical value indicating whether to
return the raw output or to average over the Monte Carlo samples.
Defaults to |
... |
An optional argument, additional arguments passed
to |
Value
A list with Summary
and Posterior
.
Some of these may be NULL
depending on the arguments used.
References
Pavlou, M., Ambler, G., Seaman, S., & Omar, R. Z. (2015) doi:10.1186/s12874-015-0046-6 “A note on obtaining correct marginal predictions from a random intercepts model for binary outcomes” and Skrondal, A., & Rabe-Hesketh, S. (2009) doi:10.1111/j.1467-985X.2009.00587.x “Prediction in multilevel generalized linear models”
Bootstrap Row Means
Description
This takes a numeric matrix, bootstrap resamples each row, and then calculates the mean. The intended use case is for Bayesian posterior predictions from sample data. Instead of directly calculating the average marginal effect (AME) across all observed values, these can be bootstrapped, so that uncertainty in the target population, and thus the AME in the target population, can be incorporated. Model uncertainty is already assumed to be handled by the different posterior samples, which are assumed to be across rows.
Usage
rowBootMeans(x)
Arguments
x |
A numeric matrix |
Value
A numeric vector with the simple bootstrapped row means of the matrix
Examples
x <- matrix(1:9, byrow = TRUE, 3)
replicate(10, rowBootMeans(x))
Convert a Row of a Table to a Square Matrix
Description
Utility function to convert a row matrix to a square matrix.
Used as the brms
package returns things like the Cholesky
decomposition matrix as separate columns where rows are posterior draws.
Not intended to be called directly by most users.
Usage
tab2mat(X)
tab2matR(X)
Arguments
X |
a matrix |
Value
A numeric matrix with one row.
Functions
-
tab2matR
: PureR
implementation oftab2mat
Examples
tab2mat(matrix(1:4, 1))
tab2mat(matrix(1:9, 1))