Type: | Package |
Title: | Pipeline for Debiased Target Trial Emulation |
Version: | 0.1.0 |
Description: | Supports propensity score-based methods—including matching, stratification, and weighting—for estimating causal treatment effects. It also implements calibration using negative control outcomes to enhance robustness. 'debiasedTrialEmulation' facilitates effect estimation for both binary and time-to-event outcomes, supporting risk ratio (RR), odds ratio (OR), and hazard ratio (HR) as effect measures. It integrates statistical modeling and visualization tools to assess covariate balance, equipoise, and bias calibration. Additional methods—including approaches to address immortal time bias, information bias, selection bias, and informative censoring—are under development. Users interested in these extended features are encouraged to contact the package authors. |
Imports: | dplyr,janitor,cobalt,MatchIt,geex,glmnet,survival,ggplot2,ParallelLogger,EmpiricalCalibration,purrr |
Depends: | R (≥ 3.5.0) |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Author: | Bingyu Zhang [aut, cre], Yiwen Lu [aut], Dazheng Zhang [aut], Yuqing Lei [aut], Tingyin Wang [aut], Siqi Chen [aut], Yong Chen [aut] |
Maintainer: | Bingyu Zhang <bingyuz7@sas.upenn.edu> |
Packaged: | 2025-05-20 16:36:38 UTC; yiwenlu |
Repository: | CRAN |
Date/Publication: | 2025-05-23 18:22:05 UTC |
Compute Weights for Stratification
Description
Computes inverse probability weights for stratification-based adjustment.
Usage
Compute_weight(data)
Arguments
data |
A data frame containing strata IDs and treatment assignments. |
Value
A data frame with row IDs and computed weights.
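The help page does not give the weighting formula. As an illustration only, the sketch below uses one well-known scheme, marginal mean weighting through stratification (MMWS), in which a unit in stratum s with treatment z gets weight n_s * Pr(Z = z) / n_sz. The function name mmws_weights and the vector arguments are hypothetical, and this is not necessarily what Compute_weight does internally.

```r
# Hypothetical helper: MMWS weights from stratum IDs and a 0/1 treatment vector.
mmws_weights <- function(stratumId, treatment) {
  ones <- rep(1, length(treatment))
  n_s  <- ave(ones, stratumId, FUN = sum)             # units per stratum
  n_sz <- ave(ones, stratumId, treatment, FUN = sum)  # units per stratum and arm
  p_z  <- ifelse(treatment == 1, mean(treatment), 1 - mean(treatment))
  n_s * p_z / n_sz
}

# Toy data (made up for illustration)
stratumId <- c(1, 1, 1, 1, 2, 2, 2, 2)
treatment <- c(1, 0, 0, 0, 1, 1, 1, 0)
w <- mmws_weights(stratumId, treatment)

# After weighting, the treated share is the same in every stratum
weighted_share <- tapply(w * treatment, stratumId, sum) / tapply(w, stratumId, sum)
```

With these weights the treated fraction within each stratum equals the overall treated fraction, which is the balancing property the scheme is built for.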
Compute Standardized Mean Differences (SMD)
Description
Computes the standardized mean differences for covariates before and after adjustment.
Usage
GetSMD(data, treat, weights = NULL, std = TRUE)
Arguments
data |
A data frame containing covariates. |
treat |
A binary variable indicating treatment assignment. |
weights |
Optional weight vector. |
std |
Logical; whether to standardize. |
Value
A data frame with covariate names and their standardized mean differences.
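For orientation, a standardized mean difference for a single covariate can be sketched in base R as below. This uses the common definition (difference in group means divided by the pooled standard deviation of the unadjusted groups). The name smd_sketch is hypothetical, and the package's own GetSMD may differ in details such as how weights enter the denominator.

```r
# Hypothetical helper: SMD for one numeric covariate x, given a 0/1 treatment vector.
smd_sketch <- function(x, treat, weights = NULL, std = TRUE) {
  if (is.null(weights)) weights <- rep(1, length(x))
  w_mean <- function(v, w) sum(w * v) / sum(w)
  m1 <- w_mean(x[treat == 1], weights[treat == 1])
  m0 <- w_mean(x[treat == 0], weights[treat == 0])
  if (!std) return(m1 - m0)  # raw mean difference
  # Pool the unweighted group variances so the scale is the same before and after adjustment
  s_pooled <- sqrt((var(x[treat == 1]) + var(x[treat == 0])) / 2)
  (m1 - m0) / s_pooled
}

x     <- c(1, 2, 3, 4, 5, 6)
treat <- c(1, 1, 1, 0, 0, 0)
smd_unadjusted <- smd_sketch(x, treat)
```

An absolute SMD below 0.1 is a frequently used rule of thumb for acceptable balance.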
Target Trial Emulation (TTE) Pipeline
Description
Implements a Target Trial Emulation pipeline using propensity score methods, including matching, weighting, and stratification.
Usage
TTE_pipeline(data, xvars, yvars, ncovars = NULL, ps_type, outcome_measure)
Arguments
data |
A dataset containing treatment assignment, covariates, and outcomes. |
xvars |
A character vector of covariate names used for propensity score estimation. |
yvars |
A character vector of primary outcome variable names. |
ncovars |
Optional. A character vector of negative control outcome variable names. |
ps_type |
The propensity score method: "Matching", "Stratification", or "Weighting". |
outcome_measure |
The outcome measure to estimate: "RR" (Risk Ratio), "OR" (Odds Ratio), or "HR" (Hazard Ratio). |
Value
An object of class "TTE" containing the propensity score analysis results.
Examples
library("dplyr")
library("debiasedTrialEmulation")
data(demo_data)
# Covariates used for propensity score estimation
xvars <- c("eth_cat", "age_cat", "sex", "cohort_entry_month",
           "obese", "pmca_index", "n_ed", "n_inpatient",
           "n_tests", "imm_date_diff_grp", "medical_1", "medical_2",
           "medical_3", "medical_4", "medical_5")
# Outcome columns are selected by name prefix
yvars1 <- colnames(demo_data %>% select(starts_with("visits_")))
yvars2 <- colnames(demo_data %>% select(starts_with("event_")))
# without negative controls
TTE_pipeline(demo_data, xvars=xvars, yvars=yvars1, ps_type="Matching", outcome_measure="RR")
# with negative controls
ncovars1 <- colnames(demo_data %>% select(starts_with("nco_visits_")))
TTE_pipeline(
  demo_data,
  xvars = xvars,
  yvars = yvars1,
  ncovars = ncovars1,
  ps_type = "Matching",
  outcome_measure = "RR"
)
Negative Control Calibration for Target Trial Emulation
Description
Performs calibration with negative control outcomes to further reduce confounding bias in Target Trial Emulation results.
Usage
calibrate_TTE(tte_obj = NULL, custom_results = NULL, custom_nco_results = NULL)
Arguments
tte_obj |
Optional. An object of class "TTE" from the TTE_pipeline function. If provided, the function will use the negative control results from this object. Either this or both custom_results and custom_nco_results must be provided. |
custom_results |
Optional. A data frame containing the primary outcome results if no TTE object is available. Must contain columns 'names_outcome', 'logEst', and 'seLogEst'. |
custom_nco_results |
Optional. A data frame containing the negative control outcome results if no TTE object is available. Must contain columns 'names_outcome', 'logEst', and 'seLogEst'. |
Value
An object of class "dTTE" containing both the TTE results and calibration results.
Examples
library("dplyr")
library("debiasedTrialEmulation")
data(demo_data)
# First run TTE pipeline
xvars <- c("eth_cat", "age_cat", "sex", "cohort_entry_month",
           "obese", "pmca_index", "n_ed", "n_inpatient",
           "n_tests", "imm_date_diff_grp", "medical_1", "medical_2",
           "medical_3", "medical_4", "medical_5")
yvars1 <- colnames(demo_data %>% select(starts_with("visits_")))
ncovars1 <- colnames(demo_data %>% select(starts_with("nco_visits_")))
tte_result <- TTE_pipeline(demo_data, xvars=xvars, yvars=yvars1, ncovars=ncovars1,
                           ps_type="Matching", outcome_measure="RR")
# Then calibrate results
dtte_result <- calibrate_TTE(tte_obj = tte_result)
# Alternatively, provide custom results (log-scale estimates and standard errors)
df_results <- data.frame(
  names_outcome = c("outcome1", "outcome2"),
  logEst = c(0.1, -0.2),
  seLogEst = c(0.05, 0.08)
)
df_nco_results <- data.frame(
  names_outcome = c("nco1", "nco2", "nco3"),
  logEst = c(0.02, -0.03, 0.01),
  seLogEst = c(0.03, 0.04, 0.02)
)
dtte_result <- calibrate_TTE(custom_results = df_results, custom_nco_results = df_nco_results)
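To give a feel for what negative control calibration does, here is a base R sketch of one common approach: fit an "empirical null" in which each negative control estimate is drawn from N(mu, sigma^2 + se^2), then test the primary estimates against that null instead of against zero. The helper names fit_null_sketch and calibrate_p_sketch are hypothetical, and this is an assumption about the general technique rather than a description of what calibrate_TTE does internally.

```r
# Hypothetical helper: maximum-likelihood fit of the systematic error distribution.
fit_null_sketch <- function(logEst, seLogEst) {
  nll <- function(par) {
    mu <- par[1]
    sigma <- exp(par[2])  # log-parameterised so sigma stays positive
    -sum(dnorm(logEst, mean = mu, sd = sqrt(sigma^2 + seLogEst^2), log = TRUE))
  }
  fit <- optim(c(0, log(0.1)), nll)
  c(mu = fit$par[1], sigma = exp(fit$par[2]))
}

# Hypothetical helper: two-sided p-value relative to the fitted null.
calibrate_p_sketch <- function(logEst, seLogEst, null) {
  z <- (logEst - null[["mu"]]) / sqrt(null[["sigma"]]^2 + seLogEst^2)
  2 * pnorm(-abs(z))
}

# Same illustrative numbers as the custom-results example above
df_results <- data.frame(
  names_outcome = c("outcome1", "outcome2"),
  logEst = c(0.1, -0.2),
  seLogEst = c(0.05, 0.08)
)
df_nco_results <- data.frame(
  names_outcome = c("nco1", "nco2", "nco3"),
  logEst = c(0.02, -0.03, 0.01),
  seLogEst = c(0.03, 0.04, 0.02)
)
null_fit <- fit_null_sketch(df_nco_results$logEst, df_nco_results$seLogEst)
p_cal <- calibrate_p_sketch(df_results$logEst, df_results$seLogEst, null_fit)
```

If the negative controls show a systematic shift away from zero, the fitted mu absorbs it, so a primary estimate of similar size no longer looks significant.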
Compute Preference Score
Description
Computes preference scores based on the propensity scores.
Usage
computePreferenceScore(data, unfilteredData = NULL)
Arguments
data |
A data frame containing propensity scores. |
unfilteredData |
Optional dataset to compute proportions from. |
Value
A data frame with added preference scores.
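The preference score rescales the propensity score by the overall treatment prevalence, so that a value of 0.5 means "as likely to be treated as the average subject". A minimal sketch of the usual definition, logit(F) = logit(S) - logit(P) with S the propensity score and P the treated proportion, follows. The function name and vector arguments are hypothetical and not the package's interface.

```r
# Hypothetical helper: preference score from propensity scores and a 0/1 treatment vector.
preference_score_sketch <- function(propensityScore, treatment) {
  p <- mean(treatment == 1)                 # proportion treated
  logit <- function(q) log(q / (1 - q))
  plogis(logit(propensityScore) - logit(p)) # back-transform to the 0-1 scale
}

# With 50% treated, the preference score equals the propensity score
ps <- c(0.2, 0.4, 0.6, 0.8)
pref_balanced <- preference_score_sketch(ps, treatment = c(1, 0, 1, 0))

# With 25% treated, a propensity score of 0.25 maps to a preference score of 0.5
pref_rare <- preference_score_sketch(0.25, treatment = c(1, 0, 0, 0))
```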
Compute Propensity Score Weights
Description
Computes inverse probability treatment weights (IPTW) for ATE or ATT estimation.
Usage
computeWeights(population, estimator = "ate")
Arguments
population |
A data frame containing treatment assignments and propensity scores. |
estimator |
Type of estimator, either "ate" (average treatment effect) or "att" (average treatment effect on the treated). |
Value
A vector of computed weights.
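The textbook forms of these weights can be sketched as follows: for the ATE, treated subjects get 1/ps and comparators 1/(1 - ps); for the ATT, treated subjects get 1 and comparators ps/(1 - ps). The function iptw_sketch is hypothetical, and computeWeights may additionally stabilize or truncate weights, which this sketch does not do.

```r
# Hypothetical helper: unstabilized IPTW from a 0/1 treatment vector and propensity scores.
iptw_sketch <- function(treatment, propensityScore, estimator = c("ate", "att")) {
  estimator <- match.arg(estimator)
  if (estimator == "ate") {
    ifelse(treatment == 1, 1 / propensityScore, 1 / (1 - propensityScore))
  } else {
    ifelse(treatment == 1, 1, propensityScore / (1 - propensityScore))
  }
}

treatment <- c(1, 0)
ps <- c(0.25, 0.25)
w_ate <- iptw_sketch(treatment, ps, "ate")  # 4 and 4/3
w_att <- iptw_sketch(treatment, ps, "att")  # 1 and 1/3
```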
Example Dataset for Debiased Trial Emulation
Description
'demo_data' is a simulated dataset used to demonstrate the functionality of the 'debiasedTrialEmulation' package. It includes patient demographic information, treatment assignment, covariates, clinical outcomes, and negative control outcomes for evaluating treatment effects using propensity score methods.
The dataset contains 50,000 observations and 93 variables, including:
- **Demographic variables**: Ethnicity, age, sex, and cohort entry month.
- **Treatment assignment**: Binary treatment indicator.
- **Covariates**: Baseline health conditions and healthcare utilization variables.
- **Primary outcomes**: Binary and time-to-event outcomes related to cardiovascular health.
- **Negative control outcomes (NCOs)**: Outcomes used for bias calibration.
Usage
data(demo_data)
Format
A data frame with 50,000 rows and 93 variables.
Estimate Treatment Effects using Propensity Score Matching
Description
estEffect_matching: Computes effect estimates using propensity score matching to reduce confounding.
estEffect_stratification: Computes effect estimates using propensity score stratification to adjust for confounding.
estEffect_weighting: Computes effect estimates using propensity score weighting to balance covariates between treatment groups.
Usage
estEffect_matching(form, data, yvars, ncovars, distance, outcome_measure)
estEffect_stratification(form, data, yvars, ncovars, distance, outcome_measure)
estEffect_weighting(form, data, yvars, ncovars, distance, outcome_measure)
Arguments
form |
A formula specifying the treatment assignment model. |
data |
A dataset containing covariates and treatment assignment. |
yvars |
A character vector of outcome variable names. |
ncovars |
A character vector of negative control outcome variable names. |
distance |
The method for estimating propensity scores ("glm"). |
outcome_measure |
The outcome measure to estimate: "RR" (Risk Ratio), "OR" (Odds Ratio), or "HR" (Hazard Ratio). |
Value
List of components
Estimate Odds Ratio (OR) after Propensity Score Matching
Description
These functions compute the odds ratio (OR) or risk ratio (RR) for a binary outcome, or the hazard ratio (HR) for a time-to-event outcome, after propensity score matching, stratification, or weighting.
The function name identifies both the effect measure and the adjustment method; for example, get_RR_weighting computes the risk ratio after propensity score weighting.
Usage
get_OR_matching(data, names_outcome)
get_RR_matching(data, names_outcome)
get_HR_matching(data, names_outcome)
get_OR_stratification(data, names_outcome)
get_RR_stratification(data, names_outcome)
get_HR_stratification(data, names_outcome)
get_OR_weighting(data, names_outcome, IPTW)
get_RR_weighting(data, names_outcome, IPTW)
get_HR_weighting(data, names_outcome, IPTW)
Arguments
data |
A dataset containing treatment assignment and outcome variables. |
names_outcome |
A character vector of outcome variable names. |
IPTW |
A numeric vector of inverse probability of treatment weights. |
Value
A data frame with the estimated log effect measure (log odds ratio, log risk ratio, or log hazard ratio, depending on the function), standard error, and p-value.
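As a rough illustration of the kind of output described here, the sketch below estimates a log odds ratio for one binary outcome with an ordinary logistic regression and returns it in the 'names_outcome', 'logEst', 'seLogEst' layout that calibrate_TTE expects. The function log_or_sketch, the column name "treatment", and the toy data are assumptions. The package functions operate on matched, stratified, or weighted data and may use different models.

```r
# Hypothetical helper: unadjusted log OR for one binary outcome via logistic regression.
log_or_sketch <- function(data, outcome, treatment = "treatment") {
  fit <- glm(reformulate(treatment, response = outcome),
             data = data, family = binomial())
  est <- summary(fit)$coefficients[treatment, ]
  data.frame(
    names_outcome = outcome,
    logEst = unname(est["Estimate"]),
    seLogEst = unname(est["Std. Error"]),
    p = unname(est["Pr(>|z|)"])
  )
}

# Toy data: 30/100 events among treated, 20/100 among comparators
toy <- data.frame(
  treatment = rep(c(1, 0), each = 100),
  event = c(rep(c(1, 0), c(30, 70)), rep(c(1, 0), c(20, 80)))
)
or_result <- log_or_sketch(toy, "event")
```

For a single binary treatment the fitted coefficient equals the log of the cross-product odds ratio from the 2x2 table, here log((30/70)/(20/80)).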
Plot Method for TTE Objects
Description
Plots diagnostic graphs for the TTE pipeline output.
Usage
## S3 method for class 'TTE'
plot(x, which = c("SMD", "Equipoise"), ...)
Arguments
x |
An object of class "TTE". |
which |
A character vector specifying which plot(s) to display. Options are "SMD" (Standardized Mean Differences) and "Equipoise" (Equipoise plot). Default shows all plots. |
... |
Additional arguments passed to plotting functions. |
Value
No return value, printed to the console.
Plot Method for dTTE Objects (Calibration Only)
Description
Plots only the calibration graph for the dTTE pipeline output.
Usage
## S3 method for class 'dTTE'
plot(x, ...)
Arguments
x |
An object of class "dTTE". |
... |
Additional arguments passed to plotting functions. |
Value
No return value, printed to the console.
Plot Standardized Mean Differences (SMD) for Matching
Description
plot_SMD_matching: Generates a plot of standardized mean differences before and after propensity score matching.
plot_Equipoise_matching: Generates a plot showing the distribution of preference scores to assess equipoise after matching.
plot_SMD_stratification: Generates an SMD plot to compare balance before and after stratification.
plot_Equipoise_stratification: Generates a plot showing the distribution of preference scores to assess equipoise after stratification.
plot_SMD_weighting: Generates an SMD plot to compare balance before and after propensity score weighting.
plot_Equipoise_weighting: Generates a plot showing the distribution of preference scores to assess equipoise after weighting.
plotPs: Creates a plot of propensity score distributions using density or histogram visualization.
plotCovariateBalanceOfTopVariables: Creates a plot showing the top covariates with the largest standardized mean differences before and after matching.
Usage
plot_SMD_matching(m.out)
plot_Equipoise_matching(data, m.out)
plot_SMD_stratification(stratifiedPop, xvars)
plot_Equipoise_stratification(data)
plot_SMD_weighting(data)
plot_Equipoise_weighting(data)
plotPs(
  data,
  unfilteredData = NULL,
  scale = "preference",
  type = "density",
  binWidth = 0.05,
  targetLabel = "Target",
  comparatorLabel = "Comparator",
  showCountsLabel = FALSE,
  showAucLabel = FALSE,
  showEquiposeLabel = FALSE,
  equipoiseBounds = c(0.3, 0.7),
  unitOfAnalysis = "subjects",
  title = NULL,
  fileName = NULL
)
plotCovariateBalanceOfTopVariables(
  balance,
  n = 20,
  maxNameWidth = 100,
  title = NULL,
  fileName = NULL,
  beforeLabel = "before matching",
  afterLabel = "after matching"
)
Arguments
unfilteredData |
A logical indicating whether to include unfiltered data in the plot. |
maxNameWidth |
An integer specifying the maximum width for variable names. |
m.out |
The output from 'MatchIt', containing matched data. |
data |
A dataset containing treatment and propensity scores. |
stratifiedPop |
The dataset containing stratified propensity scores. |
xvars |
The covariate names to assess balance. |
scale |
The scale to use: "preference" or "propensity". |
type |
The type of plot: "density", "histogramCount", or "histogramProportion". |
binWidth |
The bin width for histograms (default = 0.05). |
targetLabel |
Label for the treated group. |
comparatorLabel |
Label for the control group. |
showCountsLabel |
Logical; whether to show sample counts. |
showAucLabel |
Logical; whether to show AUC. |
showEquiposeLabel |
Logical; whether to indicate equipoise range. |
equipoiseBounds |
A numeric vector of two values defining the equipoise range (default = c(0.3, 0.7)). |
unitOfAnalysis |
Unit label for counts (e.g., "subjects"). |
title |
Optional title for the plot. |
fileName |
Optional file name to save the plot. |
balance |
A data frame containing standardized mean differences. |
n |
Number of top covariates to display. |
beforeLabel |
Label for pre-matching imbalance. |
afterLabel |
Label for post-matching balance. |
Value
A 'ggplot2' object showing balance improvement after matching.
Print Method for TTE Objects
Description
Prints a concise summary of the TTE pipeline output.
Usage
## S3 method for class 'TTE'
print(x, ...)
Arguments
x |
An object of class "TTE". |
... |
Additional arguments (currently ignored). |
Value
No return value, printed to the console.
Print Method for dTTE Objects
Description
Prints a concise summary of the dTTE pipeline output.
Usage
## S3 method for class 'dTTE'
print(x, ...)
Arguments
x |
An object of class "dTTE". |
... |
Additional arguments (currently ignored). |
Value
No return value, printed to the console.
Stratify Population by Propensity Score
Description
Assigns individuals to strata based on their propensity scores.
Usage
stratifyByPs(
  population,
  numberOfStrata = 5,
  stratificationColumns = c(),
  baseSelection = "all"
)
Arguments
population |
A data frame containing row IDs, treatment assignments, and propensity scores. |
numberOfStrata |
Number of strata to create. |
stratificationColumns |
Additional columns to use for stratification. |
baseSelection |
Defines which group is used to determine strata cutoffs ("all", "target", or "comparator"). |
Value
A data frame with stratum assignments.
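The core idea, cutting the propensity score at its quantiles, can be sketched in base R as below. The function stratify_sketch is hypothetical; it derives cutoffs from all subjects (analogous to baseSelection = "all") and ignores stratificationColumns, so it is a simplification rather than the package's implementation.

```r
# Hypothetical helper: assign each subject to a quantile-based propensity score stratum.
stratify_sketch <- function(propensityScore, numberOfStrata = 5) {
  breaks <- quantile(propensityScore,
                     probs = seq(0, 1, length.out = numberOfStrata + 1),
                     names = FALSE)
  # include.lowest keeps the minimum score in stratum 1; labels = FALSE returns integer IDs
  cut(propensityScore, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

ps <- (1:100) / 100
stratumId <- stratify_sketch(ps, numberOfStrata = 5)
```

Note that cut() stops with an error if heavy ties make two quantile cutoffs identical, so real data may need the breaks de-duplicated first.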
Summary Method for TTE Objects
Description
Provides a detailed summary of the TTE pipeline output.
Usage
## S3 method for class 'TTE'
summary(x, ...)
Arguments
x |
An object of class "TTE". |
... |
Additional arguments (currently ignored). |
Value
No return value, printed to the console.
Summary Method for dTTE Objects
Description
Provides a detailed summary of the dTTE pipeline output.
Usage
## S3 method for class 'dTTE'
summary(x, ...)
Arguments
x |
An object of class "dTTE". |
... |
Additional arguments (currently ignored). |
Value
No return value, printed to the console.
Trim Propensity Scores
Description
Trims propensity scores by removing extreme values at both ends of the distribution.
Usage
trimByPsQuantile(propensityScore, trimFraction = 0.05)
Arguments
propensityScore |
A numeric vector of propensity scores. |
trimFraction |
Fraction of extreme values to trim (default 0.05). |
Value
A vector of indices indicating which scores to keep.
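A minimal sketch of quantile trimming follows. It assumes that trimFraction is removed from each tail, which the help page does not state explicitly, and the function name trim_sketch is hypothetical.

```r
# Hypothetical helper: indices of scores lying between the lower and upper trimming quantiles.
trim_sketch <- function(propensityScore, trimFraction = 0.05) {
  bounds <- quantile(propensityScore,
                     probs = c(trimFraction, 1 - trimFraction),
                     names = FALSE)
  which(propensityScore >= bounds[1] & propensityScore <= bounds[2])
}

ps <- (1:100) / 100
keep <- trim_sketch(ps, trimFraction = 0.05)
ps_trimmed <- ps[keep]  # scores 0.06 through 0.95 remain
```

The returned indices can be used to subset the original data frame so that covariates and outcomes stay aligned with the retained scores.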