Type: | Package |
Title: | Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information |
Version: | 3.2 |
Author: | Layla Parast |
Maintainer: | Layla Parast <parast@austin.utexas.edu> |
Description: | Provides functions to estimate the proportion of treatment effect on the primary outcome that is explained by the treatment effect on the surrogate marker. |
License: | GPL-2 | GPL-3 [expanded from: GPL] |
Imports: | stats, survival, Matrix |
NeedsCompilation: | no |
Packaged: | 2024-01-23 18:30:29 UTC; parastlm |
Depends: | R (≥ 3.5.0) |
Repository: | CRAN |
Date/Publication: | 2024-01-23 19:52:49 UTC |
Calculates the augmented estimator of the proportion of treatment effect explained by the surrogate marker information measured at a specified time and primary outcome information up to that specified time
Description
This function calculates the augmented version of the proportion of treatment effect on the primary outcome explained by the surrogate marker information measured at t_0
and primary outcome information up to t_0
. Variance estimates and 95 % confidence intervals for the augmented estimates are provided automatically; three versions of the confidence interval are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. The user can also request an estimate of the incremental value of surrogate marker information.
Usage
Aug.R.s.surv.estimate(xone, xzero, deltaone, deltazero, sone, szero, t,
weight.perturb = NULL, landmark, extrapolate = FALSE, transform = FALSE,
basis.delta.one, basis.delta.zero, basis.delta.s.one = NULL,
basis.delta.s.zero = NULL, incremental.value = FALSE, approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
sone |
numeric vector; surrogate marker measurement at |
szero |
numeric vector; surrogate marker measurement at |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
basis.delta.one |
either a vector of length |
basis.delta.zero |
either a vector of length |
basis.delta.s.one |
either a vector of length |
basis.delta.s.zero |
either a vector of length |
incremental.value |
TRUE or FALSE; indicates whether the user would like to see the incremental value of the surrogate marker information, default is FALSE. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Please see R.s.surv.estimate documention for details about the estimates before augmentation is performed. Recent work has shown that augmentation can lead to improvements in efficiency by taking advantage of the association between baseline information, denoted here as Z
, and the primary outcome. This function calculates the augmented estimates of the quantities of interest. For example, the augmented version of \hat{\Delta}(t)
is defined as:
\hat{\Delta}(t)^{AUG} = \hat{\Delta}(t) + \gamma \{n_1^{-1}\sum_{i=1}^{n_1}h(Z_{1i})-n_0^{-1}\sum_{i=1}^{n_0}h(Z_{0i}) \}
where Z_{gi}, i=1, 2, \cdots, n_g
are i.i.d. random vectors of baseline covariates from treatment group g
and h(\cdot)
is a basis transformation given a priori. Due to treatment randomization, \{n_1^{-1}\sum_{i=1}^{n_1}h(Z_{1i})-n_0^{-1}\sum_{i=1}^{n_0}h(Z_{0i}) \}
converges to zero in probability as the sample size goes to infinity and thus the augmented estimator converges to the same limit as the original counterparts. The quantity \gamma
is selected such that the variance of \hat{\Delta}(t)^{AUG}
is minimized. That is, \gamma = (\Xi_{12}) ( \Xi_{22} ) ^{-1}
where
\Xi_{12} = \mbox{cov} \{ \hat{\Delta}(t), n_1^{-1}\sum_{i=1}^{n_1}h(Z_{1i})-n_0^{-1}\sum_{i=1}^{n_0}h(Z_{0i}) \},
\Xi_{22} = \mbox{var} \{n_1^{-1}\sum_{i=1}^{n_1}h(Z_{1i})-n_0^{-1}\sum_{i=1}^{n_0}h(Z_{0i})\}
and thus we can obtain \hat{\Delta}(t)^{AUG}
by replacing \gamma
with a consistent estimator, \hat{\gamma}
obtained using perturbation-resampling. A similar approach is used to obtain \hat{\Delta}_S(t)^{AUG}
and thus construct
\hat{R}_S(t,t_0)^{AUG}=1-\frac{\hat{\Delta}_S(t,t_0)^{AUG}}{\hat{\Delta}(t)^{AUG}}.
When only a single Z_{gi}
is provided in the basis argument, the following basis is used in this function: h(Z_{gi}) = (1, Z_{gi}, Z_{gi}^2)'.
Value
A list is returned:
aug.delta |
the estimate, |
aug.delta.s |
the estimate, |
aug.R.s |
the estimate, |
aug.delta.var |
the variance estimate of |
aug.delta.s.var |
the variance estimate of |
aug.R.s.var |
the variance estimate of |
conf.int.normal.aug.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.aug.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.aug.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.aug.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.aug.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.aug.R.s |
a vector of size 2; the 95% confidence interval for |
aug.delta.t |
the estimate, |
aug.R.t |
the estimate, |
aug.incremental.value |
the estimate, |
aug.delta.t.var |
the variance estimate of |
aug.R.t.var |
the variance estimate of |
aug.incremental.value.var |
the variance estimate of |
aug.conf.int.normal.delta.t |
a vector of size 2; the 95% confidence interval for |
aug.conf.int.quantile.delta.t |
a vector of size 2; the 95% confidence interval for |
aug.conf.int.normal.R.t |
a vector of size 2; the 95% confidence interval for |
aug.conf.int.quantile.R.t |
a vector of size 2; the 95% confidence interval for |
aug.conf.int.fieller.R.t |
a vector of size 2; the 95% confidence interval for |
aug.conf.int.normal.iv |
a vector of size 2; the 95% confidence interval for |
aug.conf.int.quantile.iv |
a vector of size 2; the 95% confidence interval for |
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting". If the observed support of the surrogate marker for the control group is outside the observed support of the surrogate marker for the treatment group, the user will receive the following message: "Warning: observed supports do not appear equal, may need to consider a transformation or extrapolation".
Author(s)
Layla Parast
References
Tian L, Cai T, Zhao L,Wei L. On the covariate-adjusted estimation for an overall treatment difference with data from a randomized comparative clinical trial. Biostatistics 2012; 13(2): 256-273.
Garcia TP, Ma Y, Yin G. Efficiency improvement in a class of survival models through model-free covariate incorporation. Lifetime Data Analysis 2011; 17(4): 552-565.
Zhang M, Tsiatis AA, Davidian M. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates. Biometrics 2008; 64(3): 707-715.
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Examples
#computationally intensive
#Aug.R.s.surv.estimate(xone = d_example_surv$x1, xzero = d_example_surv$x0,
#deltaone = d_example_surv$delta1, deltazero = d_example_surv$delta0,
#sone = d_example_surv$s1, szero = d_example_surv$s0, t=3, landmark = 1,
#basis.delta.one = d_example_surv$z1 , basis.delta.zero = d_example_surv$z0)
Calculates kernel matrix
Description
Helper function; this calculates the kernel matrix
Usage
Kern.FUN(zz, zi, bw)
Arguments
zz |
zz |
zi |
zi |
bw |
bandwidth |
Value
the kernel matrix
Author(s)
Layla Parast
Calculates the proportion of treatment effect explained by multiple surrogate markers measured at a specified time and primary outcome information up to that specified time
Description
This function calculates the proportion of treatment effect on the primary outcome explained by multiple surrogate markers measured at t_0
and primary outcome information up to t_0
. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. The user can also request an estimate of the incremental value of the surrogate marker information.
Usage
R.multiple.surv(xone, xzero, deltaone, deltazero, sone, szero, type =1, t,
weight.perturb = NULL, landmark, extrapolate = FALSE, transform = FALSE,
conf.int = FALSE, var = FALSE, incremental.value = FALSE, approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
sone |
matrix of numeric values; surrogate marker measurements at |
szero |
matrix of numeric values; surrogate marker measurements at |
type |
type of estimate; options are 1 = two-stage robust estimator, 2 = weighted two-stage robust estimator, 3 = double-robust estimator, 4 = two-stage model-based estimator, 5 = weighted estimator, 6 = double-robust model-based estimator; default is 1. |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker pseudo-score. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE. |
var |
TRUE or FALSE; indicates whether a variance estimate is requested, default is FALSE. |
incremental.value |
TRUE or FALSE; indicates whether the user would like to see the incremental value of the surrogate marker information, default is FALSE. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Let G \in \{A, B\}
be the binary treatment indicator and we assume that subjects are randomly assigned to either treatment group A
or B
at baseline. Let T
denote the time to the occurrence of the primary outcome, death for example, and S = (S_1,S_2,...,S_k)^{T}
denote the vector of k
surrogate markers measured at a given time t_0
. Let T^{(g)}
and S^{(g)}
denote the counterfactual event time and surrogate marker measurements under treatment G = g
for g \in \{A, B\}
. In practice, we only observe (T, S)=(T^{(A)}, S^{(A)})
or (T^{(B)}, S^{(B)})
depending on whether G=A
or B.
The treatment effect, \Delta(t)
, is the treatment difference in survival rates at time t > t_0
,
\Delta(t)=E\{ I(T^{(A)}>t)\} - E\{I(T^{(B)}>t)\} = P(T^{(A)}>t) - P(T^{(B)}>t)
where I(\cdot)
is the indicator function. For individuals who are censored or experience the primary outcome before t_0
, we assume that their S
information is not available.
The surrogate marker information at time t_0
is defined as a combination of the observed information on I(T >t_0)
and the observed S
at t_0
, denoted by Q_{t_0} = \{Q_{t_0}^{(g)}, g = A, B\}
, where
Q_{t_0} ^{(g)} = \{ T ^{(g)} \wedge t_0, S ^{(g)} I(T ^{(g)} > t_0)\}
. With information on Q_{t_0}
, the residual treatment effect is defined as:
\Delta_{S}(t,t_0) = E\{ I(T ^{(A)} > t) - I(T ^{(B)}>t) \mid Q_{t_0}^{(A)} = Q_{t_0}^{(B)} \}
=P(T^{(B)} > t_0) \int_{S} \psi_A(t \mid S, t_0) dF_{B}(S \mid t_0) -P(T^{(B)}> t)
where S = (s_1, ..., s_k)^{T}
,
F_{g}(S \mid t_0) = P(S^{(g)}\le S \mid T^{(g)}>t_0),
\psi_g(t\mid S,t_0) = P(T^{(g)}> t\mid S^{(g)}=S, T^{(g)}> t_0).
The proportion of the treatment effect on the primary outcome that is explained by the treatment effect on Q_{t_0}
is R_{S}(t,t_0)=1-\Delta_{S}(t,t_0)/\Delta
. This function provides 6 different estimators for R_{S}(t,t_0)
using censored data.
Due to censoring, the observed data consist of n
observations \{(G_i, X_{i}, \delta_{i}, S_{i}I(X_{i} > t_0)), i=1,...,n\}
from the two treatment groups, where X_{i} = \min(T_{i}, C_{i})
, \delta_{i} = I(T_{i} < C_{i})
, and C_{i}
denotes the censoring time for the i
th subject. We assume independent censoring i.e., (T_i, S_i) \perp C_i \mid G_i
. For ease of notation, we also let \{(X_{gi}, \delta_{gi}, S_{gi}I(X_{gi} > t_0)), i=1,...,n_g\}
denote n_g = \sum_{i=1}^n I(G_i = g)
observations from treatment group g \in \{A,B\},
where X_{gi}=\min(T_{gi}, C_{gi})
and \delta_{gi}=I(T_{gi}<C_{gi}).
Without loss of generality, we assume that \bar{\pi}_g = n_g/n \to \pi_g \in (0,1)
as n\to \infty
. Throughout, we estimate the treatment effect \Delta(t) =P(T^{(A)}>t) - P(T^{(B)}>t)
as
\hat{\Delta}(t) = n_A^{-1} \sum_{i=1}^{n_A} \frac{I(X_{Ai}>t)}{\hat{W}^C_A(t)} - n_B^{-1} \sum_{i=1}^{n_B} \frac{I(X_{Bi}>t)}{\hat{W}^C_B(t)}
where \hat{W}^C_g(t)
is the Kaplan-Meier estimator of W^C_g(t) = P(C_{g} > t)
.
We first describe the two-stage robust estimator which involves a two-stage procedure combining the use of a working model and a nonparametric estimation procedure for \Delta_{S}(t,t_0)
. The idea is simply to summarize S
into a univariate score U
and then construct a nonparametric estimator for R_S(t,t_0)
treating U
as S
. To construct U
, we approximate the conditional distribution of T^{(A)} \mid S^{(A)}, T^{(A)} > t_0
by using a working semiparametric model such as the landmark proportional hazards model q_A(S) = P(T^{(A)} >t \mid T^{(A)} >t_0, S^{(A)}) = \exp\{-\Lambda_0(t|t_0) \exp(\beta ^{T} S^{(A)}) \}, t>t_0,
where \Lambda_0(t|t_0)
is the unspecified baseline cumulative hazard function for T^{(A)}
conditional on \{T^{(A)} > t_0\}
and \beta
is an unknown vector of coefficients. Let \hat{\beta}
be the maximizer of the corresponding log partial likelihood function and \hat{\Lambda}_0(t|t_0)
be the Breslow-type estimate of baseline hazard. If one were to assume that this working model is correctly specified, then a consistent estimate of \Delta_{S}(t,t_0)
would simply be: \hat{\Delta}_{S}^M=n_B^{-1} \sum_{i=1}^{n_B} [ \exp \{ -\hat{\Lambda}_0(t|t_0)\exp(\hat{\beta} ^{T} S_{Bi}) \} \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}].
We refer to this estimate as the two-stage model-based estimator (option 4 for type).
Instead of relying on correct specification of this model, we use the resulting score U = \beta_0^{T}S
as a univariate “pseudo-marker" to summarize the k
surrogates. In the second stage, to estimate \Delta_{S}(t,t_0)
, we apply a nonparametric approach with S
represented by the univariate marker U
. Specifically, we use a kernel Nelson-Aalen estimator to nonparametrically estimate \phi_A(t|u, t_0)=P(T^{(A)}> t\mid U^{(A)}=u, T^{(A)}> t_0) = \exp\{-\Lambda_A(t|u, t_0 )\}
as \hat \phi_A(t|u, t_0) = \exp\{-\hat{\Lambda}_A(t|u, t_0) \}
, where
\hat{\Lambda}_A(t|u, t_0) = \int_{t_0}^t \frac{\sum_{i=1}^{n_A} I(X_{Ai}>t_0) K_h\{\gamma(\hat{U}_{Ai}) - \gamma(u)\}dN_{Ai}(z)}{\sum_{i=1}^{n_A} I(X_{Ai}>t_0) K_h\{\gamma(\hat{U}_{Ai}) - \gamma(u)\} Y_{Ai}(z)},
\hat{U}_{Ai} = \hat{\beta} ^{T} S_{Ai}
, \hat{U}_{Bi} = \hat{\beta} ^{T} S_{Bi}
, Y_{Ai} = I(X_{Ai} \geq t)
, N_{Ai}(t) = I(X_{Ai} \leq t) \delta_{Ai}, K(\cdot)
is a smooth symmetric density function, K_h(x) = K(x/h)/h,
and \gamma(\cdot)
is a given monotone transformation function. We then estimate \Delta_{S}(t,t_0)
as
\hat{\Delta}_{S}(t,t_0) = n_B^{-1} \sum_{i=1}^{n_B} [\hat{\phi}_A(t|\hat{U}_{Bi},t_0) \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}]
and \hat{R}_{S}(t,t_0) =1- \hat{\Delta}_{S}(t,t_0)/\hat{\Delta}(t).
We refer to this estimate as the two-stage robust estimator (option 1 for type).
The next estimator borrows ideas from the extensive causal inference literature focusing on double robust estimators two-stage weighted estimator with a propensity score weight explicitly balancing the two treatment groups with respect to the distribution of S
. The weighting enables us to “adjust" the distribution of S^{(A)}
before constructing the conditional survival estimate \hat \phi_A(t|u, t_0).
This approach results in a double-robust estimator of \Delta_{S}(t, t_0)
, which is consistent when either U^{(A)}
captures all the information about the relationship between I(T^{(A)}\ge t)
and S^{(A)}
or the propensity score model for \pi(S,t_0)=P(G_i=B|S_i=S, T_{i} > t_0)
is correctly specified. While \pi(S,t_0)
depends on t_0
, for simplicity, we drop t_0
from our notation and simply use \pi(S)
.
Regression models can be imposed to obtain estimates for \pi(S)
. For example, a simple logistic regression model can be imposed for
\tilde{\pi}(S)=P(G_i=B|S_i=S, X_{i}>t_0)
with
logit\{\tilde{\pi}(S)\} = \alpha_0 + \alpha_1 ^{T} S,
where \alpha_0
and \alpha_1
are estimated only among those with X_{gi} > t_0
to account for censoring. The propensity score of interest, \pi(S),
can be derived from \tilde{\pi}(S)
directly since
logit\{\pi(S)\}=logit\{\tilde{\pi}(S)\} + \log\{W_A^C(t_0)/W_B^C(t_0)\},
which follows from the assumption that (T_{gi}, S_{gi}) \perp C_{gi}.
We then modify the above expression by weighting observations with the estimated L(S_{Ai})=\pi(S_{Ai})/\{1-\pi(S_{Ai})\}
and obtain
\hat{\Lambda}^w_A(t|u, t_0) = \int_{t_0}^t \frac{\sum_{i=1}^{n_A} \hat{L}(S_{Ai}) I(X_{Ai}>t_0) K_h\{\gamma(\hat{U}_{Ai}) - \gamma(u)\}dN_{Ai}(z)}{\sum_{i=1}^{n_A} \hat{L}(S_{Ai}) I(X_{Ai}>t_0) K_h\{\gamma(\hat{U}_{Ai}) - \gamma(u)\} Y_{Ai}(z)},
,
where \hat{L}(S_{gi}) = \exp(\hat{\alpha}_0+\hat{\alpha}_1^{T} S_{gi})\hat{W}^C_B(t_0)/\hat{W}^C_A(t_0).
Subsequently, we define
\hat{\Delta}^w_{S}(t,t_0) = n_B^{-1} \sum_{i=1}^{n_B} [\hat{\phi}^w_A(t|\hat{U}_{Bi},t_0) \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}]
and \hat{R}^w_{S}(t,t_0) =1- \hat{\Delta}^w_{S}(t,t_0)/\hat{\Delta}(t)
where \hat \phi^w_A(t|t_0,u) = \exp\{-\hat{\Lambda}^w_A(t|t_0,u) \}.
We refer to this estimate as the weighted two-stage robust estimator (option 2 for type).
While the two-stage weighted estimator reflects one way to enhance the robustness of an initial estimator, the idea of combining a propensity-score type model and a regression-type model has certainly been extensively studied in the causal inference literature and a more familiar double-robust estimator can be constructed as: \hat{\Delta}_{S}^{DR}(t,t_0) = n^{-1} [\sum_{i=1}^{n_A}\frac{I(X_{Ai}>t)}{\hat{W}_{A}^C(t)\bar{\pi}_B} \hat{L}(S_{Ai}) - \sum_{i=1}^{n_B} \frac{I(X_{Bi}>t)}{\hat{W}_{B}^C(t)\bar{\pi}_B} ] - n^{-1} [ \sum_{i=1}^{n_A} \frac{ \hat{\phi}_A(t\mid\hat{U}_{Ai},t_0) I(X_{Ai} > t_0) }{\hat{W}_{A}^C(t_0)\bar{\pi}_B} \hat{L}(S_{Ai}) - \sum_{i=1}^{n_B}\frac{ \hat{\phi}_A(t\mid \hat{U}_{Bi},t_0) I(X_{Bi} > t_0) }{\hat{W}_{B}^C(t_0)\bar{\pi}_B} ]
and \hat{R}_{S}^{DR}(t,t_0) =1- \hat{\Delta}_{S}^{DR}(t,t_0)/\hat{\Delta}(t)
, where \hat{\phi}_A(t|\hat{U}_{gi},t_0)
is the (unweighted) estimate of \phi_A(t\mid u,t_0)
used in \hat{\Delta}_{S}(t,t_0)
. We refer to this estimate as the double robust estimator (option 3 for type).
The weighted estimator (option type 5) is defined as:
\hat{\Delta}^{PS}_{S}(t,t_0) = n^{-1} \sum_{i=1}^n \{\frac{I(X_{i}>t)}{\hat{W}_{G_i}^C(t)\bar{\pi}_B} [ I(G_i = A)\hat{L}(S_{Ai}) - I(G_i = B ) ] \}
and \hat{R}^{PS}_{S}(t,t_0) =1- \hat{\Delta}^{PS}_{S}(t,t_0)/\hat{\Delta}(t).
This estimator completely relies on the correct specification of \pi(S)
. The double-robust model-based estimator (option 6 for type) is defined as \hat{\Delta}_{S}^{DR2}(t,t_0)
and \hat{R}_{S}^{DR2}(t,t_0) =1- \hat{\Delta}_{S}^{DR2}(t,t_0)/\hat{\Delta}(t)
which are constructed parallel to the construction of \hat{\Delta}_{S}^{DR}(t,t_0)
i.e., a combination of \hat{\Delta}_{S}^M(t,t_0)
and \hat{R}^{PS}_{S}(t,t_0)
.
Variance estimates are obtained using perturbation resampling. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. An estimate of the incremental value of the surrogate marker information can also be requested; this essentially compared the proportion explained by the surrogate information vs. the proportion explained by T
alone up to t_0
. Details can be found in Parast, L., Cai, T., & Tian, L. (2021). Evaluating multiple surrogate markers with censored data. Biometrics, 77(4), 1315-1327.
Value
A list is returned:
delta |
the estimate, |
delta.s |
the residual treatment effect estimate, |
R.s |
the estimated proportion of treatment effect explained by the set of markers, |
delta.var |
the variance estimate of |
delta.s.var |
the variance estimate of |
R.s.var |
the variance estimate of |
conf.int.normal.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.R.s |
a vector of size 2; the 95% confidence interval for |
delta.t |
the estimate, |
R.t |
the estimated proportion of treatment effect explained by survival only, |
incremental.value |
the estimate of the incremental value of the surrogate markers, |
delta.t.var |
the variance estimate of |
R.t.var |
the variance estimate of |
incremental.value.var |
the variance estimate of |
conf.int.normal.delta.t |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta.t |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.iv |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.iv |
a vector of size 2; the 95% confidence interval for |
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting".
Author(s)
Layla Parast
References
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Parast, L., Cai, T., & Tian, L. (2021). Evaluating multiple surrogate markers with censored data. Biometrics, 77(4), 1315-1327.
Examples
data(d_example_multiple)
names(d_example_multiple)
## Not run:
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =1, t = 1, landmark=0.5)
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =1, t = 1, landmark=0.5, conf.int = T)
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =3, t = 1, landmark=0.5)
## End(Not run)
Calculates the proportion of treatment effect explained
Description
This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on the surrogate marker(s). This function is intended to be used for a fully observed continuous outcome. The user can also request a variance estimate and a 95% confidence interval, both estimated using perturbating-resampling. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval, and Fieller's confidence interval.
Usage
R.s.estimate(sone, szero, yone, yzero, var = FALSE, conf.int = FALSE,
weight.perturb = NULL, number = "single", type = "robust",extrapolate = FALSE,
transform = FALSE,warn.te = FALSE, warn.support = FALSE)
Arguments
sone |
numeric vector or matrix; surrogate marker for treated observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
szero |
numeric vector; surrogate marker for control observations, assumed to be continuous.If there are multiple surrogates then this should be a matrix with |
yone |
numeric vector; primary outcome for treated observations, assumed to be continuous. |
yzero |
numeric vector; primary outcome for control observations, assumed to be continuous. |
var |
TRUE or FALSE; indicates whether a variance estimate is requested, default is FALSE. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval is requested, default is FALSE |
weight.perturb |
a |
number |
specifies the number of surrogate markers; choices are "multiple" or "single", default is "single" |
type |
specifies the type of estimation; choices are "robust" or "model" or "freedman", default is "robust" |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
warn.te |
value to control warnings; user does not need to specify. |
warn.support |
value to control warnings; user does not need to specify. |
Details
Let Y^{(1)}
and Y^{(0)}
denote the primary outcome under the treatment and primary outcome under the control,respectively. Let S^{(1)}
and S^{(0)}
denote the surrogate marker under the treatment and the surrogate marker under the control,respectively. The residual treatment effect is defined as
\Delta_S=\int_{-\infty}^{\infty} E(Y^{(1)}|S^{(1)}=s) dF_0(s) - \int_{-\infty}^{\infty} E(Y^{(0)}|S^{(0)}=s) dF_0(s),
where \Delta_S(s)= E(Y^{(1)}|S^{(1)}=s)-E(Y^{(0)}|S^{(0)}=s)
and F_0(\cdot)
is the marginal cumulative distribution function of S^{(0)}
, the surrogate marker measure under the control. The proportion of treatment effect explained by the surrogate marker, which we denote by R_S
, can be expressed using a contrast between \Delta_S
and \Delta
:
R_S=\{\Delta-\Delta_S\}/\Delta=1-\Delta_S/\Delta.
The definition and estimation of \Delta
is described in the delta.estimate documentation.
A flexible model-based approach to estimate \Delta_S
in the single marker setting is to specify:
E(S^{(0)})=\alpha_0 \quad\mbox{and}\quad E(S^{(1)})-E(S^{(0)}) = \alpha_1,
E(Y^{(0)} | S^{(0)}) = \beta_0 + \beta_1 S^{(0)} \quad \mbox{and} \quad E(Y^{(1)} | S^{(1)}) = (\beta_0 +\beta_2)+ (\beta_1+\beta_3) S^{(1)}.
It can be shown that when these models hold, \Delta_S = \beta_2 + \beta_3 \alpha_0
. Thus, reasonable estimates for \Delta_S
and R_S
using this approach would be \hat{\Delta}_S = \hat{\beta}_2 + \hat{\beta}_3 \hat{\alpha}_0
and \hat{R}_S = 1-\hat{\Delta}_S / \hat{\Delta}.
For robust estimation of \Delta_S
in the single marker setting, we estimate \mu_1(s) = E(Y^{(1)}|S^{(1)}=s)
nonparametrically using kernel smoothing:
\hat{\mu}_1(s) = \frac{\sum_{i=1}^{n_1} K_h\left (S_{1i}-s \right ) Y_{1i} }{\sum_{i=1}^{n_1} K_h\left (S_{1i}-s \right )}
where S_{1i}
is the observed S^{(1)}
for person i
, Y_{1i}
is the observed Y^{(1)}
for person i
, K(\cdot)
is a smooth symmetric density function with finite support, K_h(\cdot)=K(\cdot/h)/h
and h
is a specified bandwidth. As in most nonparametric functional estimation procedures, the choice of the smoothing parameter h
is critical. To eliminate the impact of the bias of the conditional mean function on the resulting estimator, we require the standard undersmoothing assumption of h=O(n_1^{-\delta})
with \delta \in (1/4,1/3).
To obtain an appropriate h
we first use bw.nrd to obtain h_{opt}
; and then we let h = h_{opt}n_1^{-c_0}
with c_0 = 0.25
. We then estimate \Delta_S
as
\hat{\Delta}_S= \sum_{i=1}^{n_0} \frac{\hat{\mu}_1(S_{0i})- Y_{0i}}{n_0}
where S_{0i}
is the observed S^{(0)}
for person i
and Y_{0i}
is the observed Y^{(0)}
for person i
. Lastly, we estimate R_S
as \hat{R}_S = 1-\hat{\Delta}_S/\hat{\Delta}
.
This function also allows for estimation of R_S
using Freedman's approach. Let Y
denote the primary outcome, S
denote the surrogate marker, and G
denote the treatment group (0 for control, 1 for treatment). Freedman's approach to calculating the proportion of treatment effect explained by the surrogate marker is to fit the following two regression models:
E(Y|G) = \gamma_0 + \gamma_1 I(G=1) \quad \mbox{and} \quad E(Y|G, S) = \gamma_{0S} + \gamma_{1S}I(G=1) + \gamma_{2S} S
and estimating the proportion of treatment effect explained, denoted by R_S
, as 1-\hat{\gamma}_{1S}/\hat{\gamma}_1
.
This function also estimates R_S
in a multiple marking setting. A flexible model-based approach to estimate \Delta_S
in the multiple marker setting is to specify models for E(Y|G, S)
and E(S_j | G)
for each S_j
in S = \{S_1,...S_p\}
(where p is the number of surrogate markers). Without loss of generality, consider the case where there are three surrogate markers, S = \{S_1, S_2, S_3\}
and one specifies the following linear models:
E(Y^{(0)} | S^{(0)}) = \beta_0 + \beta_1 S_1^{(0)} + \beta_2 S_2^{(0)} + \beta_3 S_3^{(0)}
E(Y^{(1)} | S^{(1)}) = (\beta_0+\beta_4) + (\beta_1+\beta_5) S_1^{(1)} + (\beta_2+\beta_6) S_2^{(1)} + (\beta_3+\beta_7) S_3^{(1)}
E(S_j^{(0)}) = \alpha_j, ~~~~j=1,2,3.
It can be shown that when these models hold
\Delta_{S} = \beta_4 + \beta_5\alpha_1 + \beta_6 \alpha_2 + \beta_7 \alpha_3.
Thus, reasonable estimates for \Delta_{S}
and R_{S}
here would be easily obtained by replacing the unknown regression coefficients in the models above by their consistent estimators.
For robust estimation of S \Delta_S
in the multiple marker setting, we use a two-stage procedure combining the model-based approach and the nonparametric estimation procedure from the single marker setting. Specifically, we use a working semiparametric model:
E(Y^{(1)}|S^{(1)}=S)=\beta_0 + \beta_1 S_1^{(1)} + \beta_2 S_2^{(1)} + \beta_3 S_3^{(1)}
and define Q^{(1)} = \hat{\beta}_0 + \hat{\beta}_1 S_1^{(1)} + \hat{\beta}_2 S_2^{(1)} + \hat{\beta}_3 S_3^{(1)}
and Q^{(0)} = \hat{\beta}_0 + \hat{\beta}_1 S_1^{(0)} + \hat{\beta}_2 S_2^{(0)} + \hat{\beta}_3 S_3^{(0)}
to reduce the dimension of S
in the first stage and in the second stage, we apply the robust approach used in the single marker setting to estimate its surrogacy.
To use Freedman's approach in the presence of multiple markers, the markers are simply additively entered into the second regression model.
Variance estimation and confidence interval construction are performed using perturbation-resampling. Specifically, let \left \{ V^{(b)} = (V_{11}^{(b)}, ...V_{1n_1}^{(b)}, V_{01}^{(b)}, ...V_{0n_0}^{(b)})^T, b=1,....,D \right \}
be n \times D
independent copies of a positive random variables V
from a known distribution with unit mean and unit variance. Let
\hat{\Delta}^{(b)} = \frac{ \sum_{i=1}^{n_1} V_{1i}^{(b)} Y_{1i}}{ \sum_{i=1}^{n_1} V_{1i}^{(b)}} - \frac{ \sum_{i=1}^{n_0} V_{0i}^{(b)} Y_{0i}}{ \sum_{i=1}^{n_0} V_{0i}^{(b)}}.
The variance of \hat{\Delta}
is obtained as the empirical variance of \{\hat{\Delta}^{(b)}, b = 1,...,D\}.
In this package, we use weights generated from an Exponential(1) distribution and use D=500
. Variance estimates for \hat{\Delta}_S
and \hat{R}_S
are calculated similarly. We construct two versions of the 95\%
confidence interval for each estimate: one based on a normal approximation confidence interval using the estimated variance and another taking the 2.5th and 97.5th empirical percentile of the perturbed quantities. In addition, we use Fieller's method to obtain a third confidence interval for R_S
as
\left\{1-r: \frac{(\hat{\Delta}_S-r\hat{\Delta})^2}{\hat{\sigma}_{11}-2r\hat\sigma_{12}+r^2\hat\sigma_{22}} \le c_{\alpha}\right\},
where \hat{\Sigma}=(\hat\sigma_{ij})_{1\le i,j\le 2}
and c_\alpha
is the (1-\alpha)
th percentile of
\left\{\frac{\{\hat{\Delta}^{(b)}_S-(1-\hat R_S)\hat{\Delta}^{(b)}\}^2}{\hat{\sigma}_{11}-2(1-\hat R_S)\hat\sigma_{12}+(1-\hat R_S)^2\hat\sigma_{22}}, b=1, \cdots, C\right\}
where \alpha=0.05
.
Note that if the observed supports for S are not the same, then \hat{\mu}_1(s)
for S_{0i} = s
outside the support of S_{1i}
may return NA (depending on the bandwidth). If extrapolation = TRUE, then the \hat{\mu}_1(s)
values for these surrogate values are set to the closest non-NA value. If transform = TRUE, then S_{1i}
and S_{0i}
are transformed such that the new transformed values, S^{tr}_{1i}
and S^{tr}_{0i}
are defined as: S^{tr}_{gi} = F([S_{gi} - \mu]/\sigma)
for g=0,1
where F(\cdot)
is the cumulative distribution function for a standard normal random variable, and \mu
and \sigma
are the sample mean and standard deviation, respectively, of (S_{1i}, S_{0i})^T
.
Value
A list is returned:
R.s |
the estimate, |
R.s.var |
the variance estimate of |
conf.int.normal.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.R.s |
a vector of size 2; the 95% confidence interval for |
For all options other then "freedman", the following are also returned:
delta |
the estimate, |
delta.s |
the estimate, |
delta.var |
the variance estimate of |
delta.s.var |
the variance estimate of |
conf.int.normal.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta.s |
a vector of size 2; the 95% confidence interval for |
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the proportion of treatment effect explained in this setting". In the single marker case with the robust estimation approach, if the observed support of the surrogate marker for the control group is outside the observed support of the surrogate marker for the treatment group, the user will receive the following message: "Warning: observed supports do not appear equal, may need to consider a transformation or extrapolation"
Author(s)
Layla Parast
References
Freedman, L. S., Graubard, B. I., & Schatzkin, A. (1992). Statistical validation of intermediate endpoints for chronic diseases. Statistics in medicine, 11(2), 167-178.
Parast, L., McDermott, M., Tian, L. (2016). Robust estimation of the proportion of treatment effect explained by surrogate marker information. Statistics in Medicine, 35(10):1637-1653.
Wang, Y., & Taylor, J. M. (2002). A measure of the proportion of treatment effect explained by a surrogate marker. Biometrics, 58(4), 803-812.
Fieller, Edgar C. (1954). Some problems in interval estimation. Journal of the Royal Statistical Society. Series B (Methodological), 175-185.
Fieller, E. C. (1940). The biological standardization of insulin. Supplement to the Journal of the Royal Statistical Society, 1-64.
Examples
data(d_example)
names(d_example)
R.s.estimate(yone=d_example$y1, yzero=d_example$y0, sone=d_example$s1.a, szero=d_example$s0.a,
number = "single", type = "robust")
R.s.estimate(yone=d_example$y1, yzero=d_example$y0, sone=cbind(d_example$s1.a,d_example$s1.b,
d_example$s1.c), szero=cbind(d_example$s0.a, d_example$s0.b, d_example$s0.c),
number = "multiple", type = "model")
Calculates the proportion of treatment effect explained correcting for measurement error in the surrogate marker
Description
This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on a surrogate marker, correcting for measurement error in the surrogate marker. This function is intended to be used for a fully observed continuous outcome. The user must specify what type of estimation they would like (parametric or nonparametric estimation of the proportion explained, denoted by R) and what estimator they would like (see below for details).
Usage
R.s.estimate.me(sone, szero, yone, yzero, parametric = FALSE, estimator = "n",
me.variance, extrapolate = TRUE, transform = FALSE, naive = FALSE, Ronly = TRUE)
Arguments
sone |
numeric vector or matrix; surrogate marker for treated observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
szero |
numeric vector; surrogate marker for control observations, assumed to be continuous.If there are multiple surrogates then this should be a matrix with |
yone |
numeric vector; primary outcome for treated observations, assumed to be continuous. |
yzero |
numeric vector; primary outcome for control observations, assumed to be continuous. |
parametric |
TRUE or FALSE; indicates whether the user wants the parametric approach to be used (TRUE) or nonparametric (FALSE). |
estimator |
options are "d","q","n" for parametric and "q","n" for nonparametric; "d" stands for the disattenuated estimator, "q" stands for the SIMEX estimator with quadratic extrapolation, "n" stands for the SIMEX estimator with a nonlinear extrapolation. Note that the nonlinear extrapolation may have convergence issues with a small sample size; if this occurs, please consider using quadratic extrapolation instead. |
me.variance |
the variance of the measurement error; must be provided. |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
naive |
TRUE or FALSE; indicates whether the user wants the naive estimate (not correcting for measurement error) to also be calculated |
Ronly |
TRUE or FALSE; indicates whether the user wants only R (and corresponding variance and confidence intervals) to be returned. |
Details
While there are many methods available to quantify the value of a surrogate marker, most assume that the marker is measured without error. This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on a surrogate marker, correcting for measurement error in the surrogate marker. The user can choose either the parametric framework or nonparametric framework for estmation. Within the parametric framework there are three options for measurement error correction: the disattenuated estimator, the SIMEX estimator with quadratic extrapolation, and the SIMEX estimator with nonlinear extrapolation. Within the nonparametric framework there are two options for measurement error correction: the SIMEX estimator with quadratic extrapolation and the SIMEX estimator with nonlinear extrapolation. We describe each below.
Let G
be the binary treatment indicator with G=1
indicating treatment and G=0
indicating control (or placebo). We assume throughout that subjects are randomly assigned to treatment or control at baseline. Let Y
and S
denote the continuous primary outcome and continuous surrogate marker, respectively, where S
is measured post-baseline and is assumed to be a biomarker, clinical measurement, psychological test score, or other physiological measurement. In the absence of measurement error, the observed data consists of \{Y_i, S_i, G_i\}
for i \in \{1,...,n\}
. With measurement error, instead of observing S
we observe W = S + U
, where E(U|S) = 0
and the variance of U
is \sigma_u^2
. Such measurement error may be attributable to, for example, laboratory error. Thus, our observed data will consist of \{Y_i, W_i, G_i\}
for i \in \{1,...,n\}
. Throughout, we assume that \sigma_u^2
is known. Here, we are interested in estimating the proprtion of the treatment effect on the primary outcome that is explained by the treatment effect on the surrogate marker, denoted as R_S
.
To estimate R_S
parametrically, we assume the following models E(Y|G) = \beta_0 + \beta_1 G
and E(Y|G,S) = \beta_0^* + \beta_1^*G + \beta_2^* S
. It can be shown that if these models hold, R_S=1-\beta_1^*/\beta_1
. When W = S+U
is available instead of S
, this measurement error does not affect estimation of \beta_1
, but it does affect estimation of \beta_1^*
, and \beta_2^*
. Since estimation of R_S
relies on estimation of \beta_1
and \beta_1^*
, we focus on the effect of measurement error on \beta_1^*
estimation. The attenuation bias for \hat \beta_1^*
and \hat R
can be written out in closed form when the proportion of treatment effect is parametrically estimated as described above, when these specified models hold, and when the surrogate marker S
is measured with error. There exist two methods to eliminate this bias when estimating R_S
. Taking advantage of the fact that we can express the attenuation bias in closed form, the first is a straightforward disattenuated estimator: \hat \beta _{1A} = \hat{\beta}_1^* - \frac{ \hat{\beta}_2^* \{\Omega^2_{W} \Omega_{GW}-\Omega_{GW}(\Omega^2_{W} - \sigma_u^2)\}}{\Omega^2_{G}(\Omega^2_{W} - \sigma_u^2)-\Omega_{GW}\Omega_{GW}}
and \hat{R}_{A} = 1- \left [ \hat{\beta}_1^* - \frac{ \hat{\beta}_2^* \{\Omega^2_{W} \Omega_{GW}-\Omega_{GW}(\Omega^2_{W} - \sigma_u^2)\}}{\Omega^2_{G}(\Omega^2_{W} - \sigma_u^2)-\Omega_{GW}\Omega_{GW}} \right] / \hat{\beta}_1
where \Omega^2
denotes the sample variance or covariance.
The second method to eliminate this bias uses Simulation Extrapolation (SIMEX) estimation, which is a simulation-based method that involves first generating additional measurement error and observing how it affects the bias of the parameter estimate of interest, and then extrapolating this information to a setting with no measurement error. To incorporate SIMEX estimation within our surrogate marker framework, we define W_{b,i}(\lambda) = W_i + \lambda^{1/2} \sigma_u \epsilon_{i,b}
for b=1,...,B
where B=50
, \epsilon_{i,b} \sim N(0,1)
, \sigma_u
is assumed known, and \lambda \in (0,0.25,0.5,0.75,1.0,
1.25,1.5,1.75,2.0)
and for each iteration b
and \lambda
value, obtaining \hat \beta_{1b}^*(\lambda)
by fitting the regression model: E(Y \mid W_b(\lambda),S) = \beta_{0b}^* + \beta_{1b}^* W_{b}(\lambda) + \beta_{2b}^* S.
We then calculate the average estimate for each quantity over the iterations b=1,...,B
for each \lambda
value, denoted as \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)} = \sum_{b=1}^B \hat \beta_{1b}^*(\lambda)
. The second step, extrapolation, takes these average estimates for each \lambda
value and extrapolates using a function G(\Gamma, \lambda)
to obtain the estimated quantity if \lambda=-1
. For the extrapolation step, we use both a quadratic extrapolation and nonlinear extrapolation i.e., we solve for \Gamma = (\alpha_0, \alpha_1, \alpha_2)^T
in \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)} = \alpha_0 + \alpha_1 \lambda + \alpha_2 \lambda^2
and \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)}= \alpha_0 + \alpha_1 /( \alpha_2 + \lambda)
, respectively. Using the estimates of \alpha_0, \alpha_1, \alpha_2
, we calculate the predicted \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)}
when \lambda = -1
. In essence, the simulations add successively larger measurement errors of size (1+\lambda)\sigma^2_u
and then extrapolate to the case when \lambda = -1
such that the measurement error is 0. We denote the resulting estimator of \beta_1^*
as \hat{\beta}^*_{1,SIMEX} = G(\hat \Gamma, -1)
and define \hat{R}_{SIMEX} = 1- \hat{\beta}^*_{1,SIMEX}/ \hat \beta_1.
While the parametric approach to estimate the proportion of treatment effect explained by S
is most commonly used in clinical practice, previous work has demonstrated biased results when the assumed models are not correctly specified. An alternative approach involves estimating the treatment effect, \Delta
, and residual treatment effect, \Delta_S
, as R_S
is defined as 1-\Delta/\Delta_S
. The quantity \Delta
can be estimated simply by \hat{\Delta} = n_1^{-1}\sum_{i=1}^{n} Y_i I(G_i = 1) - n_0^{-1}\sum_{i=1}^{n} Y_i I(G_i = 0)
, where n_1
and n_0
denote the number of individuals in the treatment and control groups, respectively. The quantity \Delta_S
can be estimated nonparametrically using kernel smoothing as \hat{\Delta}_S = n_0^{-1} \sum_{i: G_i = 0}\hat{\mu}_1(S_i) - n_0^{-1}\sum_{i=1}^{n} Y_i I(G_i = 0)
where \hat{\mu}_1(s) = \{ \sum_{j: G_j = 1} K_h(S_j - s)Y_j \}/ \{\sum_{j:G_j = 1} K_h(S_j - s)\}
, K(\cdot)
is a smooth symmetric density function with finite support, K_h(\cdot)=K(\cdot/h)/h
and h
is a specified bandwidth such that h=O(n_1^{-\nu})
with \nu \in (1/4,1/2).
When W = S + U
is available instead of S
, estimation of \Delta
is not affected whereas estimation of \Delta_S
is affected and thus, the nonparametric estimation procedure described above results in a biased estimate of R_S
. Unlike the parametric approach, the attenuation bias cannot be expressed in closed form. Within this nonparametric framework, SIMEX estimation can be used to correct for measurement error. We implement the estimation procedure as described above where we first generate additional measurement error to obtain W_{b,i}(\lambda)
and for each iteration b
and \lambda
values obtain \hat{\Delta}_{S,b}(\lambda) = n_0^{-1} \sum_{i: G_i = 0} \left \{ \frac{\sum_{j: G_j = 1} K_h(W_{b,j}(\lambda) - W_{b,i}(\lambda))Y_j}{\sum_{j:G_j = 1} K_h(W_{b,j}(\lambda)- W_{b,i}(\lambda))} \right \} - n_0^{-1}\sum_{i=1}^{n} Y_i I(G_i = 0).
We then calculate the average estimate for each quantity over the iterations b=1,...,B
for each \lambda
value, denoted as \hat{\Delta}_{S,\sigma_u^2(1+\lambda)} = \sum_{b=1}^B \hat{\Delta}_{S,b}(\lambda)
and extrapolate using a function G(\Gamma, \lambda)
; we specifically use the quadratic and nonlinear functions as in the parametric setting. We denote the resulting estimator of \Delta_S
as \hat{\Delta}_{S,SIMEX} = G(\hat \Gamma, -1)
and define \hat{R}_{S,SIMEX} = 1- \hat{\Delta}_{S,SIMEX} / \hat \Delta.
In this function, parametric estimation is equivalent to Freedman's approach in the R.s.estimate documentation; nonparametric estimation is equivalent to the robust approach in the R.s.estimate documentation. Variance estimates for all estimators are calculated in this function based on derived closed form variance expressions. For all approaches, confidence intervals for \Delta_S
can be constructed using a normal approximation; confidence intervals for R_S
can be constructed using either a normal approximation or using Fieller's method, all of which are provided in this function. Details regarding the asymptotic properties of these estimators and closed form variance calculation can be found in: Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
Value
A list is returned:
R.naive |
the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.naive.var |
the estimated variance of the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.naive.CI.normal |
the 95% confidence interval using the normal approximation for the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.naive.CI.fieller |
the 95% confidence interval using Fieller's approach for the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
B1star.naive |
the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE |
B1star.naive.var |
the estimated variance of the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE |
B1star.naive.CI.normal |
the 95% confidence interval using the normal approximation for the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE |
deltas.naive |
the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE |
deltas.naive.var |
the estimated variance of the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE |
deltas.naive.CI.normal |
the 95% confidence interval using the normal approximation for the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE |
R.corrected.dis |
the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d" |
R.corrected.var.dis |
the estimated variance of the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.corrected.CI.normal.dis |
the 95% confidence interval using the normal approximation for the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d" |
R.corrected.CI.fieller.dis |
the 95% confidence interval using Fieller's approach for the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d" |
B1star.corrected.dis |
the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE |
B1star.corrected.var.dis |
the estimated variance of the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE |
B1star.corrected.CI.normal.dis |
the 95% confidence interval using the normal approximation for the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE |
R.corrected.q |
the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.var.q |
the estimated variance of the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.CI.normal.q |
the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.CI.fieller.q |
the 95% confidence interval using Fieller's approach for the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
B1star.corrected.q |
the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.var.q |
the estimated variance of the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.CI.normal.q |
the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
deltas.corrected.q |
the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.var.q |
the estimated variance of the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.CI.normal.q |
the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
R.corrected.nl |
the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.var.nl |
the estimated variance of the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.CI.normal.nl |
the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.CI.fieller.nl |
the 95% confidence interval using Fieller's approach for the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
B1star.corrected.nl |
the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.var.nl |
the estimated variance of the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.CI.normal.nl |
the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
deltas.corrected.nl |
the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.var.nl |
the estimated variance of the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.CI.normal.nl |
the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
Author(s)
Layla Parast
References
Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
Examples
data(d_example_me)
names(d_example_me)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1,
szero=d_example_me$s0, parametric = TRUE, estimator = "d", me.variance = 0.5,
naive= TRUE, Ronly = FALSE)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1,
szero=d_example_me$s0, parametric = TRUE, estimator = "q", me.variance = 0.5,
naive= FALSE, Ronly = TRUE)
#estimating measurement error variance with replicates
replicates = rbind(cbind(d_example_me$s1_rep1, d_example_me$s1_rep2,
d_example_me$s1_rep3), cbind(d_example_me$s0_rep1, d_example_me$s0_rep2,
d_example_me$s0_rep3))
mean.i = apply(replicates,1,mean, na.rm = TRUE)
num.i = apply(replicates,1,function(x) sum(!is.na(x)))
var.u = sum((replicates-mean.i)^2, na.rm = TRUE)/sum(num.i)
var.u
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1,
szero=d_example_me$s0, parametric = TRUE, estimator = "d", me.variance = var.u,
naive= TRUE, Ronly = FALSE)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0,
sone=d_example_me$s1, szero=d_example_me$s0, parametric = FALSE, estimator = "q",
me.variance = 0.5, naive= FALSE, Ronly = TRUE)
Calculates the proportion of treatment effect explained by the surrogate marker information measured at a specified time and primary outcome information up to that specified time
Description
This function calculates the proportion of treatment effect on the primary outcome explained by the surrogate marker information measured at t_0
and primary outcome information up to t_0
. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. The user can also request an estimate of the incremental value of surrogate marker information.
Usage
R.s.surv.estimate(xone, xzero, deltaone, deltazero, sone, szero, t,
weight.perturb = NULL, landmark, extrapolate = FALSE, transform = FALSE,
conf.int = FALSE, var = FALSE, incremental.value = FALSE, approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
sone |
numeric vector; surrogate marker measurement at |
szero |
numeric vector; surrogate marker measurement at |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE. |
var |
TRUE or FALSE; indicates whether a variance estimate is requested, default is FALSE. |
incremental.value |
TRUE or FALSE; indicates whether the user would like to see the incremental value of the surrogate marker information, default is FALSE. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Let G
be the binary treatment indicator with G=1
for treatment and G=0
for control and we assume throughout that subjects are randomly assigned to a treatment group at baseline. Let T^{(1)}
and T^{(0)}
denote the time of the primary outcome of interest, death for example, under the treatment and under the control, respectively. Let S^{(1)}
and S^{(0)}
denote the surrogate marker measured at time t_0
under the treatment and the control, respectively.
The residual treatment effect is defined as
\Delta_S(t,t_0) = P(T^{(0)} > t_0) \left \{\int \psi_1(t | s, t_0) dF_0(s | t_0)-P(T^{(0)}> t| T^{(0)}> t_0) \right \}
where F_0(\cdot | t_0)
is the cumulative distribution function of S^{(0)}
conditional on T^{(0)}> t_0
and \psi_1(t | s,t_0) = P(T^{(1)}> t | S^{(1)}=s, T ^{(1)} > t_0)
. The proportion of treatment effect explained by the surrogate marker information measured at t_0
and primary outcome information up to t_0
, which we denote by R_S(t,t_0)
, can be expressed using a contrast between \Delta_S(t,t_0)
and \Delta(t)
:
R_S(t,t_0)=\{\Delta(t)-\Delta_S(t,t_0)\}/\Delta(t)=1-\Delta_S(t,t_0)/\Delta(t).
The definition and estimation of \Delta(t)
is described in the delta.surv.estimate documentation.
Due to censoring, our data consist of n_1
observations \{(X_{1i}, \delta_{1i}, S_{1i}), i=1,...,n_1\}
from the treatment group G=1
and n_0
observations \{(X_{0i}, \delta_{0i}, S_{0i}), i=1,...,n_0\}
from the control group G=0
where X_{gi} = \min(T_{gi}, C_{ gi})
, \delta_{gi} = I(T_{gi} < C_{gi})
, C_{gi}
denotes the censoring time, and S_{gi}
denotes the surrogate marker information measured at time t_0
, for g= 1,0
, for individual i
. Note that if X_{gi} < t_0
, then S_{gi}
should be NA (not available).
To estimate \Delta_S(t,t_0)
, we use a nonparametric kernel Nelson-Aalen estimator to estimate \psi_1(t | s,t_0)
as \hat{\psi}_1(t| s,t_0) = \exp\{-\hat{\Lambda}_1(t| s,t_0) \}
, where
\hat{\Lambda}_1(t| s,t_0) = \int_{t_0}^t \frac{\sum_{i=1}^{n_1} I(X_{1i}>t_0) K_h\{\gamma(S_{1i}) - \gamma(s)\}dN_{1i}(z)}{\sum_{i=1}^{n_1} I(X_{1i}>t_0) K_h\{\gamma(S_{1i}) - \gamma(s)\} Y_{1i}(z)},
is a consistent estimate of \Lambda_1(t| s,t_0 ) = -\log [\psi_1(t| s,t_0)], Y_{1i}(t) = I(X_{1i} \geq t), N_{1i}(t) = I(X_{1i} \leq t) \delta_i, K(\cdot)
is a smooth symmetric density function, K_h(x) = K(x/h)/h, \gamma(\cdot)
is a given monotone transformation function, and h
is a specified bandwidth. To obtain an appropriate h
we first use bw.nrd to obtain h_{opt}
; and then we let h = h_{opt}n_1^{-c_0}
with c_0 = 0.11
.
Since F_0(s | t_0) = P(S_{0i} \le s | X_{0i} > t_0)
, we empirically estimate F_0(s | t_0)
using all subjects with X_{0i} > t_0
as
\hat{F}_0(s | t_0) = \frac{\sum_{i=1}^{n_0} I(S_{0i} \le s, X_{0i} > t_0)}{\sum_{i=1}^{n_0} I(X_{0i} > t_0)}.
Subsequently, we construct an estimator for \Delta_{S}(t,t_0)
as
\hat{\Delta}_S(t,t_0) = n_0^{-1} \sum_{i=1}^{n_0} \left[\hat{\psi}_1(t| S_{0i},t_0) \frac{I(X_{0i} > t_0)}{\hat{W}^C_0(t_0)} - \frac{I(X_{0i} > t)}{\hat{W}^C_0(t)}\right]
where \hat{W}^C_g(\cdot)
is the Kaplan-Meier estimator of survival for censoring for g=1,0.
Finally, we estimate R_S(t,t_0)
as \hat{R}_S(t,t_0) =1- \hat{\Delta}_S(t,t_0)/\hat{\Delta}(t).
Variance estimation and confidence interval construction are performed using perturbation-resampling. Specifically, let \left \{ V^{(b)} = (V_{11}^{(b)}, ...V_{1n_1}^{(b)}, V_{01}^{(b)}, ...V_{0n_0}^{(b)})^T, b=1,....,D \right \}
be n \times D
independent copies of a positive random variables V
from a known distribution with unit mean and unit variance. Let
\hat{\Delta}^{(b)}(t) = \frac{ \sum_{i=1}^{n_1} V_{1i}^{(b)} I(X_{1i}>t)}{ \sum_{i=1}^{n_1} V_{1i}^{(b)} \hat{W}_1^{C(b)}(t)} -\frac{ \sum_{i=1}^{n_0} V_{0i}^{(b)} I(X_{0i}>t)}{ \sum_{i=1}^{n_0} V_{0i}^{(b)} \hat{W}_0^{C(b)}(t)}.
In this package, we use weights generated from an Exponential(1) distribution and use D=500
. The variance of \hat{\Delta}(t)
is obtained as the empirical variance of \{\hat{\Delta}(t)^{(b)}, b = 1,...,D\}
. Variance estimates for \hat{\Delta}_S(t,t_0)
and \hat{R}_S(t,t_0)
are calculated similarly. We construct two versions of the 95\%
confidence interval for each estimate: one based on a normal approximation confidence interval using the estimated variance and another taking the 2.5th and 97.5th empirical percentile of the perturbed quantities. In addition, we use Fieller's method to obtain a third confidence interval for R_S(t,t_0)
as
\left\{1-r: \frac{(\hat{\Delta}_S(t,t_0)-r\hat{\Delta}(t))^2}{\hat{\sigma}_{11}-2r\hat\sigma_{12}+r^2\hat\sigma_{22}} \le c_{\alpha}\right\},
where \hat{\Sigma}=(\hat\sigma_{ij})_{1\le i,j\le 2}
and c_\alpha
is the (1-\alpha)
th percentile of
\left\{\frac{\{\hat{\Delta}^{(b)}_S(t)-(1-\hat R_S(t,t_0))\hat{\Delta}(t)^{(b)}\}^2}{\hat{\sigma}_{11}-2(1-\hat R_S(t,t_0))\hat\sigma_{12}+(1-\hat R_S(t,t_0))^2\hat\sigma_{22}}, b=1, \cdots, C\right\}
where \alpha=0.05
.
Since the definition of R_S(t,t_0)
considers the surrogate information as a combination of both S
information and T
information up to t_0
, a logical inquiry would be how to assess the incremental value of the S
information in terms of the proportion of treatment effect explained, when added to T
information up to t_0
. The proportion of treatment effect explained by T
information up to t_0
only is denoted as R_T(t,t_0)
and is described in the documentation for R.t.surv.estimate. The incremental value of S
information is defined as:
IV_S(t,t_0) = R_S(t,t_0) - R_T(t,t_0) = \frac{\Delta_T(t,t_0) - \Delta_S(t,t_0)}{\Delta (t)}.
For estimation of R_T(t_0)
, see documentation for R.t.surv.estimate. The quantity IV_S(t,t_0)
is then estimated by \hat{IV}_S(t,t_0) = \hat{R}_S(t,t_0) - \hat{R}_T(t,t_0)
. Perturbation-resampling is used for variance estimation and confidence interval construction for this quantity, similar to the other quantities in this package.
Note that if the observed supports for S are not the same, then \hat{\Lambda}_1(t| s,t_0)
for S_{0i} = s
outside the support of S_{1i}
may return NA (depending on the bandwidth). If extrapolation = TRUE, then the \hat{\Lambda}_1(t| s,t_0)
values for these surrogate values are set to the closest non-NA value. If transform = TRUE, then S_{1i}
and S_{0i}
are transformed such that the new transformed values, S^{tr}_{1i}
and S^{tr}_{0i}
are defined as: S^{tr}_{gi} = F([S_{gi} - \mu]/\sigma)
for g=0,1
where F(\cdot)
is the cumulative distribution function for a standard normal random variable, and \mu
and \sigma
are the sample mean and standard deviation, respectively, of \{(S_{1i}, S_{0i})^T, i \quad s.t. X_{gi} > t_0\}
.
Value
A list is returned:
delta |
the estimate, |
delta.s |
the estimate, |
R.s |
the estimate, |
delta.var |
the variance estimate of |
delta.s.var |
the variance estimate of |
R.s.var |
the variance estimate of |
conf.int.normal.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta.s |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.R.s |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.R.s |
a vector of size 2; the 95% confidence interval for |
delta.t |
the estimate, |
R.t |
the estimate, |
incremental.value |
the estimate, |
delta.t.var |
the variance estimate of |
R.t.var |
the variance estimate of |
incremental.value.var |
the variance estimate of |
conf.int.normal.delta.t |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta.t |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.iv |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.iv |
a vector of size 2; the 95% confidence interval for |
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting". If the observed support of the surrogate marker for the control group is outside the observed support of the surrogate marker for the treatment group, the user will receive the following message: "Warning: observed supports do not appear equal, may need to consider a transformation or extrapolation".
Author(s)
Layla Parast
References
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Examples
data(d_example_surv)
names(d_example_surv)
Calculates the proportion of treatment effect explained by the primary outcome information up to a specified time
Description
This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on the primary outcome up to t_0
. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling.
Usage
R.t.surv.estimate(xone, xzero, deltaone, deltazero, t, weight.perturb = NULL,
landmark, var = FALSE, conf.int = FALSE, approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
var |
TRUE or FALSE; indicates whether a variance estimate for delta is requested, default is FALSE. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Let G
be the binary treatment indicator with G=1
for treatment and G=0
for control and we assume throughout that subjects are randomly assigned to a treatment group at baseline. Let T
denote the time of the primary outcome of interest, death for example. We use potential outcomes notation such that T^{(g)}
denotes the time of the primary outcome under treatment G = g
. The proportion of treatment effect explained by T observed up to t_0
only is R_T(t,t_0) = 1-\Delta_T(t,t_0)/\Delta(t)
where
\Delta_T(t, t_0) = P(T^{(0)}>t_0)P(T^{(1)}>t\mid T^{(1)}>t_0)-P(T^{(0)}>t).
To estimate R_T(t,t_0)
, we use the estimator \hat{R}_T(t,t_0) = 1-\hat{\Delta}_T(t,t_0)/\hat{\Delta}(t)
where \hat{\Delta}_T(t,t_0) = \hat{\phi}_0(t_0)\hat{\phi}_1(t)/\hat{\phi}_1(t_0) - \hat{\phi}_0(t)
and \hat{\phi}_g(u) = n_g^{-1} \sum_{i=1}^{n_g} \frac{I(X_{gi}>u)}{\hat{W}^C_g(u)}
for g=1,0
where \widehat{W}^C_g(\cdot)
is the Kaplan-Meier estimator of survival for censoring for g=1,0.
Value
A list is returned:
delta |
the estimate, |
delta.t |
the estimate, |
R.t |
the estimate, |
delta.var |
the variance estimate of |
delta.t.var |
the variance estimate of |
R.t.var |
the variance estimate of |
conf.int.normal.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.delta.t |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.delta.t |
a vector of size 2; the 95% confidence interval for |
conf.int.normal.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile.R.t |
a vector of size 2; the 95% confidence interval for |
conf.int.fieller.R.t |
a vector of size 2; the 95% confidence interval for |
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting".
Author(s)
Layla Parast
References
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Examples
data(d_example_surv)
names(d_example_surv)
Repeats a row.
Description
Helper function; this function creates a matrix that repeats vc, dm times where each row is equal to the vc vector.
Usage
VTM(vc, dm)
Arguments
vc |
the vector to repeat. |
dm |
number of rows. |
Value
a matrix that repeats vc, dm times where each row is equal to the vc vector
Augmentation function
Description
Augmentation function
Usage
augment.est.vector(point.delta, perturb.delta, treat.ind, basis, weights)
Arguments
point.delta |
|
perturb.delta |
|
treat.ind |
|
basis |
|
weights |
Author(s)
Layla Parast
calculates closed form variance estimate for R
Description
calculates closed form variance estimate for R; used in R function that corrects for measurement error
Usage
calculate.var.np(s1, s0, y1, y0, extrapolate = TRUE)
Arguments
s1 |
numeric vector or matrix; surrogate marker for treated observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
s0 |
numeric vector; surrogate marker for control observations, assumed to be continuous.If there are multiple surrogates then this should be a matrix with |
y1 |
numeric vector; primary outcome for treated observations, assumed to be continuous. |
y0 |
numeric vector; primary outcome for control observations, assumed to be continuous. |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
Value
total |
matrix needed for variance calculation |
psionly |
matrix needed for variance calculation |
Author(s)
Layla Parast
References
Parast, L., McDermott, M., Tian, L. (2016). Robust estimation of the proportion of treatment effect explained by surrogate marker information. Statistics in Medicine, 35(10):1637-1653.
Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
Calculates censoring probability for weighting
Description
Helper function; calculates censoring probability needed for inverse probability of censoring weighting
Usage
censor.weight(data.x, data.delta, t, weight = NULL, approx = T)
Arguments
data.x |
numeric vector, the observed event time: X = min(T, C) where T is the time of the primary outcome, C is the censoring time |
data.delta |
numeric vector of 0/1, the censoring indicator: D = I(T<C) where T is the time of the primary outcome, C is the censoring time |
t |
number, the time of interest |
weight |
a numeric vector or matrix of weights used for perturbation-resampling, default is null. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Computes the Kaplan Meier estimate of survival for the censoring random variable at the specified time
Value
Kaplan Meier estimate of survival for censoring at time t
Author(s)
Layla Parast
Helper function
Description
Helper function; should not be called directly by user.
Usage
cumsum2(mydat)
Arguments
mydat |
mydat |
Value
out |
matrix |
Author(s)
Layla Parast
Hypothetical data
Description
Hypothetical data to be used in examples.
Usage
data(d_example)
Format
A list with 8 elements representing 500 observations from a control group and 500 observations from a treatment group:
s1.a
First surrogate marker measurement for treated observations.
s1.b
Second surrogate marker measurement for treated observations.
s1.c
Third surrogate marker measurement for treated observations.
y1
Primary outcome for treated observations.
s0.a
First surrogate marker measurement for control observations.
s0.b
Second surrogate marker measurement for control observations.
s0.c
Third surrogate marker measurement for control observations.
y0
Primary outcome for control observations.
Examples
data(d_example)
names(d_example)
Hypothetical data with replicate measurements
Description
Hypothetical data to be used in measurement error example.
Usage
data(d_example_me)
Format
A list with 10 elements representing 500 observations from a control group and 500 observations from a treatment group:
y1
Primary outcome for treated observations.
s1
Surrogate marker for treated observations.
s1_rep1
Replicate measurement of the surrogate marker for treated observations.
s1_rep2
Replicate measurement of the surrogate marker for treated observations.
s1_rep3
Replicate measurement of the surrogate marker for treated observations.
y0
Primary outcome for control observations.
s0
Surrogate marker for control observations.
s0_rep1
Replicate measurement of the surrogate marker for control observations.
s0_rep2
Replicate measurement of the surrogate marker for control observations.
s0_rep3
Replicate measurement of the surrogate marker for control observations.
Examples
data(d_example_me)
names(d_example_me)
Hypothetical survival data with multiple surrogate markers
Description
Hypothetical survival data with multiple surrogate markers to be used in examples.
Usage
data(d_example_multiple)
Format
A list with 6 elements representing 1000 observations from a control group and 1000 observations from a treatment group:
s1
Surrogate marker measurements for treated observations; these markers are measured at time = 0.5. For observations that experience the primary outcome or are censored before 0.5, the surrogate values are NA.
x1
The observed event or censoring time for treated observations; X = min(T, C) where T is the time of the primary outcome and C is the censoring time.
delta1
The indicator identifying whether the treated observation was observed to have the event or was censored; D =1*(T<C) where T is the time of the primary outcome and C is the censoring time.
s0
Surrogate marker measurements for control observations; these markers are measured at time = 0.5. For observations that experience the primary outcome or are censored before 0.5, the surrogate values are NA.
x0
The observed event or censoring time for control observations; X = min(T, C) where T is the time of the primary outcome and C is the censoring time.
delta0
The indicator identifying whether the control observation was observed to have the event or was censored; D =1*(T<C) where T is the time of the primary outcome and C is the censoring time.
Examples
data(d_example_multiple)
names(d_example_multiple)
Hypothetical survival data
Description
Hypothetical survival data to be used in examples.
Usage
data(d_example_surv)
Format
A list with 8 elements representing 500 observations from a control group and 500 observations from a treatment group:
s1
Surrogate marker measurement for treated observations; this marker is measured at time = 0.5. For observations that experience the primary outcome or are censored before 0.5, this value is NA.
x1
The observed event or censoring time for treated observations; X = min(T, C) where T is the time of the primary outcome and C is the censoring time.
delta1
The indicator identifying whether the treated observation was observed to have the event or was censored; D =1*(T<C) where T is the time of the primary outcome and C is the censoring time.
s0
Surrogate marker measurement for control observations; this marker is measured at time = 0.5. For observations that experience the primary outcome or are censored before 0.5, this value is NA.
x0
The observed event or censoring time for control observations; X = min(T, C) where T is the time of the primary outcome and C is the censoring time.
delta0
The indicator identifying whether the control observation was observed to have the event or was censored; D =1*(T<C) where T is the time of the primary outcome and C is the censoring time.
z1
A baseline covariate value for treated observations.
z0
A baseline covariate value for control observations.
Examples
data(d_example_surv)
names(d_example_surv)
Helper function
Description
Helper function; should not be called directly by user.
Usage
helper.si(yy,FUN,Yi,Vi=NULL)
Arguments
yy |
yy |
FUN |
FUN |
Yi |
Yi |
Vi |
Vi |
Value
out |
matrix |
Author(s)
Layla Parast
Calculates treatment effect
Description
This function calculates the treatment effect estimate, the difference in the average outcome in the treatment group minus the control group. This function is intended to be used for a fully observed continuous outcome. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested two versions are provided: a normal approximation based interval and a quantile based interval, both use perturbation-resampling.
Usage
delta.estimate(yone,yzero, var = FALSE, conf.int = FALSE, weight = NULL,
weight.perturb = NULL)
Arguments
yone |
numeric vector; primary outcome for treated observations. |
yzero |
numeric vector; primary outcome for control observations. |
var |
TRUE or FALSE; indicates whether a variance estimate for delta is requested, default is FALSE. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE. |
weight |
a n1+n0 by x matrix of weights where n1 = length of yone and n0 = length of yzero, default is null; generally not supplied by use but only used by other functions. |
weight.perturb |
a n1+n0 by x matrix of weights where n1 = length of yone and n0 = length of yzero, default is null; generally used for confidence interval construction and may be supplied by user. |
Details
Let Y^{(1)}
and Y^{(0)}
denote the primary outcome under the treatment and primary outcome under the control,respectively. The treatment effect, \Delta
, is the expected difference in Y^{(1)}
compared to Y^{(0)}
, \Delta=E(Y^{(1)}-Y^{(0)}).
We estimate \Delta
as
\hat{\Delta}
= n_1^{-1} \sum_{i=1}^{n_1} Y_{1i} - n_0^{-1} \sum_{i=1}^{n_0} Y_{0i}
where Y_{1i}
is the observed primary outcome for person i
in the treated group, Y_{0i}
is the observed primary outcome for person i
in the control group, and n_1
and n_0
are the number of individuals in the treatment and control group, respectively. Randomized treatment assignment is assumed throughout this package.
Variance estimation and confidence interval construction are performed using perturbation-resampling. Specifically, let \left \{ V^{(b)} = (V_{11}^{(b)}, ...V_{1n_1}^{(b)}, V_{01}^{(b)}, ...V_{0n_0}^{(b)})^T, b=1,....,D \right \}
be n \times D
independent copies of a positive random variables V
from a known distribution with unit mean and unit variance. Let
\hat{\Delta}^{(b)} = \frac{ \sum_{i=1}^{n_1} V_{1i}^{(b)} Y_{1i}}{ \sum_{i=1}^{n_1} V_{1i}^{(b)}} - \frac{ \sum_{i=1}^{n_0} V_{0i}^{(b)} Y_{0i}}{ \sum_{i=1}^{n_0} V_{0i}^{(b)}}.
The variance of \hat{\Delta}
is obtained as the empirical variance of \{\hat{\Delta}^{(b)}, b = 1,...,D\}.
In this package, we use weights generated from an Exponential(1) distribution and use D=500
. We construct two versions of the 95\%
confidence interval for \hat{\Delta}
: one based on a normal approximation confidence interval using the estimated variance and another taking the 2.5th and 97.5th empirical percentiles of \hat{\Delta}^{(b)}
.
Value
A list is returned:
delta |
the estimate, |
var |
the variance estimate of |
conf.int.normal |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile |
a vector of size 2; the 95% confidence interval for |
Author(s)
Layla Parast
Examples
data(d_example)
names(d_example)
delta.estimate(yone=d_example$y1, yzero=d_example$y0)
Calculates robust residual treatment effect accounting for multiple surrogate markers at a specified time and primary outcome information up to that specified time
Description
This function calculates the robust estimate of the residual treatment effect accounting for multiple surrogate markers measured at t_0
and primary outcome information up to t_0
i.e. the hypothetical treatment effect if both the surrogate marker distributions at t_0
and survival up to t_0
in the treatment group look like the surrogate marker distributions and survival up to t_0
in the control group. Ideally this function is only used as a helper function and is not directly called.
Usage
delta.multiple.surv(xone, xzero, deltaone, deltazero, sone, szero, type =1, t,
weight.perturb = NULL, landmark, extrapolate = FALSE, transform = FALSE,
approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
sone |
matrix of numeric values; surrogate marker measurements at |
szero |
matrix of numeric values; surrogate marker measurements at |
type |
type of estimate; options are 1 = two-stage robust estimator, 2 = weighted two-stage robust estimator, 3 = double-robust estimator, 4 = two-stage model-based estimator, 5 = weighted estimator, 6 = double-robust model-bsed estimator; default is 1. |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker psuedo-score. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Details are included in the documentation for R.multiple.surv.
Value
\hat{\Delta}_S(t,t_0)
, the residual treatment effect estimate accounting for multiple surrogarte markers measured at t_0
and primary outcome information up to t_0
.
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting".
Author(s)
Layla Parast
References
Parast, L., Cai, T., & Tian, L. (2021). Evaluating multiple surrogate markers with censored data. Biometrics, 77(4), 1315-1327.
Examples
data(d_example_multiple)
names(d_example_multiple)
## Not run:
delta.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =1, t = 1, landmark=0.5)
## End(Not run)
Calculates model-based or robust residual treatment effect
Description
This function calculates the model-based or robust estimate of the residual treatment effect i.e. the hypothetical treatment effect if the distribution of the surrogate in the treatment group looks like the distribution of the surrogate in the control group. Ideally, this function is only used as a helper function and is not directly called.
Usage
delta.s.estimate(sone, szero, yone, yzero, weight.perturb = NULL, number="single",
type="robust", warn.te = FALSE, warn.support = FALSE, extrapolate = FALSE,
transform = FALSE)
Arguments
sone |
numeric vector or matrix; surrogate marker for treated observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
szero |
numeric vector or matrix; surrogate marker for control observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
yone |
numeric vector; primary outcome for treated observations. |
yzero |
numeric vector; primary outcome for control observations. |
weight.perturb |
a |
number |
specifies the number of surrogate markers; choices are "multiple" or "single", default is "single". |
type |
specifies the type of estimation; choices are "robust" or "model", default is "robust". |
warn.te |
value passed from R.s.estimate function to control warnings; user does not need to specify. |
warn.support |
value passed from R.s.estimate function to control warnings; user does not need to specify. |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
Details
Details are included in the documentation for R.s.estimate.
Value
\hat{\Delta}_S
, the model-based or robust residual treatment effect estimate.
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting". In the single marker case with the robust estimation approach, if the observed support of the surrogate marker for the control group is outside the observed support of the surrogate marker for the treatment group, the user will receive the following message: "Warning: observed supports do not appear equal, may need to consider a transformation or extrapolation".
Author(s)
Layla Parast
References
Parast, L., McDermott, M., Tian, L. (2015). Robust estimation of the proportion of treatment effect explained by surrogate marker information. Statistics in Medicine, 35(10):1637-1653.
Wang, Y., & Taylor, J. M. (2002). A measure of the proportion of treatment effect explained by a surrogate marker. Biometrics, 58(4), 803-812.
Examples
data(d_example)
names(d_example)
delta.s.estimate(yone=d_example$y1, yzero=d_example$y0, sone=d_example$s1.a, szero=
d_example$s0.a, number = "single", type = "robust")
delta.s.estimate(yone=d_example$y1, yzero=d_example$y0, sone=d_example$s1.a, szero=
d_example$s0.a, number = "single", type = "model")
delta.s.estimate(yone=d_example$y1, yzero=d_example$y0, sone=cbind(d_example$s1.a,
d_example$s1.b, d_example$s1.c), szero=cbind(d_example$s0.a, d_example$s0.b, d_example$s0.c),
number = "multiple", type = "robust")
delta.s.estimate(yone=d_example$y1, yzero=d_example$y0, sone=cbind(d_example$s1.a,
d_example$s1.b, d_example$s1.c), szero=cbind(d_example$s0.a, d_example$s0.b, d_example$s0.c),
number = "multiple", type = "model")
Calculates robust residual treatment effect accounting for surrogate marker information measured at a specified time and primary outcome information up to that specified time
Description
This function calculates the robust estimate of the residual treatment effect accounting for surrogate marker information measured at t_0
and primary outcome information up to t_0
i.e. the hypothetical treatment effect if both the surrogate marker distribution at t_0
and survival up to t_0
in the treatment group look like the surrogate marker distribution and survival up to t_0
in the control group. Ideally this function is only used as a helper function and is not directly called.
Usage
delta.s.surv.estimate(xone, xzero, deltaone, deltazero, sone, szero, t,
weight.perturb = NULL, landmark, extrapolate = FALSE, transform = FALSE,
approx = T, warn.te = FALSE, warn.support = FALSE)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
sone |
numeric vector; surrogate marker measurement at |
szero |
numeric vector; surrogate marker measurement at |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
warn.te |
value passed from R.s.estimate function to control warnings; user does not need to specify. |
warn.support |
value passed from R.s.estimate function to control warnings; user does not need to specify. |
Details
Details are included in the documentation for R.s.surv.estimate.
Value
\hat{\Delta}_S(t,t_0)
, the robust residual treatment effect estimate accounting for surrogate marker information measured at t_0
and primary outcome information up to t_0
.
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting". If the observed support of the surrogate marker for the control group is outside the observed support of the surrogate marker for the treatment group, the user will receive the following message: "Warning: observed supports do not appear equal, may need to consider a transformation or extrapolation".
Author(s)
Layla Parast
References
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Examples
data(d_example_surv)
names(d_example_surv)
Calculates treatment effect in a survival setting
Description
This function calculates the treatment effect in the survival setting i.e. the difference in survival at time t between the treatment group and the control group. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested two versions are provided: a normal approximation based interval and a quantile based interval, both use perturbation-resampling.
Usage
delta.surv.estimate(xone, xzero, deltaone, deltazero, t, var = FALSE, conf.int
= FALSE, weight = NULL, weight.perturb = NULL, approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
t |
the time of interest. |
var |
TRUE or FALSE; indicates whether a variance estimate for delta is requested, default is FALSE. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE. |
weight |
a |
weight.perturb |
a |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Let G
be the binary treatment indicator with G=1
for treatment and G=0
for control and we assume throughout that subjects are randomly assigned to a treatment group at baseline. Let T
denote the time of the primary outcome of interest, death for example. We use potential outcomes notation such that T^{(g)}
denotes the time of the primary outcome under treatment G = g
. We define the treatment effect, \Delta(t)
, as the difference in survival rates by time t
under treatment versus control,
\Delta(t)=E\{ I(T^{(1)}>t)\} - E\{I(T^{(0)}>t)\} = P(T^{(1)}>t) - P(T^{(0)}>t)
where t>t_0
Due to censoring, our data consist of n_1
observations \{(X_{1i}, \delta_{1i}), i=1,...,n_1\}
from the treatment group G=1
and n_0
observations \{(X_{0i}, \delta_{0i}), i=1,...,n_0\}
from the control group G=0
where X_{gi} = \min(T_{gi}, C_{ gi})
, \delta_{gi} = I(T_{gi} < C_{gi})
, and C_{gi}
denotes the censoring time for g= 1,0
, for individual i
. Throughout, we estimate the treatment effect \Delta(t)
as
\hat{\Delta}(t) = n_1^{-1} \sum_{i=1}^{n_1} \frac{I(X_{1i}>t)}{\hat{W}^C_1(t)} - n_0^{-1} \sum_{i=1}^{n_0} \frac{I(X_{0i}>t)}{\hat{W}^C_0(t)}
where \hat{W}^C_g(\cdot)
is the Kaplan-Meier estimator of survival for censoring for g=1,0.
Variance estimation and confidence interval construction are performed using perturbation-resampling. Specifically, let \left \{ V^{(b)} = (V_{11}^{(b)}, ...V_{1n_1}^{(b)}, V_{01}^{(b)}, ...V_{0n_0}^{(b)})^T, b=1,....,D \right \}
be n \times D
independent copies of a positive random variables V
from a known distribution with unit mean and unit variance. Let
\hat{\Delta}^{(b)} (t) = \frac{ \sum_{i=1}^{n_1} V_{1i}^{(b)} I(X_{1i}>t)}{ \sum_{i=1}^{n_1} V_{1i}^{(b)} \hat{W}_1^{C(b)}(t)} -\frac{ \sum_{i=1}^{n_0} V_{0i}^{(b)} I(X_{0i}>t)}{ \sum_{i=1}^{n_0} V_{0i}^{(b)} \hat{W}_0^{C(b)}(t)}.
In this package, we use weights generated from an Exponential(1) distribution and use D=500
. The variance of \hat{\Delta}(t)
is obtained as the empirical variance of \{\hat{\Delta}(t)^{(b)}, b = 1,...,D\}.
We construct two versions of the 95\%
confidence interval for \hat{\Delta}(t)
: one based on a normal approximation confidence interval using the estimated variance and another taking the 2.5th and 97.5th empirical percentiles of \hat{\Delta}(t)^{(b)}
.
Value
A list is returned:
delta |
the estimate, |
var |
the variance estimate of |
conf.int.normal |
a vector of size 2; the 95% confidence interval for |
conf.int.quantile |
a vector of size 2; the 95% confidence interval for |
Author(s)
Layla Parast
Examples
data(d_example_surv)
names(d_example_surv)
delta.surv.estimate(xone = d_example_surv$x1, xzero = d_example_surv$x0,
deltaone = d_example_surv$delta1, deltazero = d_example_surv$delta0, t = 3)
Calculates robust residual treatment effect accounting only for primary outcome information up to a specified time
Description
This function calculates the robust estimate of the residual treatment effect accounting only for primary outcome information up to t_0
i.e. the hypothetical treatment effect if survival up to t_0
in the treatment group looks like survival up to t_0
in the control group. Ideally this function is only used as a helper function and is not directly called.
Usage
delta.t.surv.estimate(xone, xzero, deltaone, deltazero, t, weight.perturb = NULL,
landmark, approx = T)
Arguments
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Details
Details are included in the documentation for R.t.surv.estimate.
Value
\hat{\Delta}_T(t,t_0)
, the robust residual treatment effect estimate accounting only for survival up to t_0
.
Note
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting".
Author(s)
Layla Parast
References
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Examples
data(d_example_surv)
names(d_example_surv)
Constructs Fieller's confidence interval.
Description
Constructs Fieller's confidence interval - closed form, not using perturbation.
Usage
fieller.calculate.me(a, B, var.mat)
Arguments
a |
value of numerator |
B |
value of denominator |
var.mat |
variance-covariance matrix for a and B |
Details
Produces a confidence interval for 1-a/B when closed form variance and covariance estimates are available. See documention for R.s.estimate.me for more detail.
Value
Returns a vector of length 2, lower bound of the 95% confidence interval and upper bound of the 95% confidence interval.
Author(s)
Layla Parast
References
Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
Fieller, Edgar C. (1954). Some problems in interval estimation. Journal of the Royal Statistical Society. Series B (Methodological), 175-185.
Fieller, E. C. (1940). The biological standardization of insulin. Supplement to the Journal of the Royal Statistical Society, 1-64.
Constructs Fieller's confidence interval.
Description
Constructs Fieller's confidence interval.
Usage
fieller.ci(perturb.delta.s, perturb.delta, delta.s, delta)
Arguments
perturb.delta.s |
numeric vector; the perturbed values for |
perturb.delta |
numeric vector; the perturbed values for |
delta.s |
the residual treatment effect, |
delta |
the treatment effect, |
Details
See documention for R.s.estimate for more detail.
Value
Returns a vector of length 2, lower bound of the 95% confidence interval and upper bound of the 95% confidence interval.
Author(s)
Layla Parast
References
Fieller, Edgar C. (1954). Some problems in interval estimation. Journal of the Royal Statistical Society. Series B (Methodological), 175-185.
Fieller, E. C. (1940). The biological standardization of insulin. Supplement to the Journal of the Royal Statistical Society, 1-64.
Parast, L., McDermott, M., Tian, L. (2016). Robust estimation of the proportion of treatment effect explained by surrogate marker information. Statistics in Medicine, 35(10):1637-1653.
Estimates measurement error variance given replicate data.
Description
Estimates measurement error variance given replicate data using a simple components of variance analysis.
Usage
me.variance.estimate(replicates)
Arguments
replicates |
matrix of data where each row indicates a subject and each column is a replicated measurement; columns can have NAs when subjects have different numbers of measurements. |
Details
Estimates measurement error variance given replicate data using a simple components of variance analysis.
Value
estimate of measurement error variance
Author(s)
Layla Parast
References
Carroll, R. J., Ruppert, D., Crainiceanu, C. M., and Stefanski, L. A. (2006). Measurement error in nonlinear models: a modern perspective. Chapman and Hall/CRC.
Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
Helper for augmentation function
Description
Helper for augmentation function
Usage
perturb.nu.vector(mat, weights)
Arguments
mat |
|
weights |
Calculates expected outcome for control group surrogate values
Description
Helper function; calculates the expected outcome for control group surrogate values using treatment group surrogate and outcome information.
Usage
pred.smooth(zz, zi.one, bw = NULL, y1, weight = NULL)
Arguments
zz |
surrogate values in the treatment group; used to estimate conditional mean function |
zi.one |
surrogate values in the control group |
bw |
bandwidth |
y1 |
outcome in the treatment group |
weight |
weight |
Details
Details are included in the documentation for R.s.estimate
Value
expected outcome for each surrogate value in the control group
Author(s)
Layla Parast
References
Parast, L., McDermott, M., Tian, L. (2016). Robust estimation of the proportion of treatment effect explained by surrogate marker information. Statistics in medicine, 35(10):1637-1653.
Calculates the conditional probability of survival for control group values
Description
Helper function; calculates the estimated probability of survival for control group for control group surrogate values using treatment group surrogate and outcome information.
Usage
pred.smooth.surv(xone.f, deltaone.f, sone.f, szero.one, myt, weight.pred,
extrapolate, transform, ps.weight = NULL, warn.support = FALSE)
Arguments
xone.f |
observed event times in the treatment group |
deltaone.f |
censoring indicators in the treatment group |
sone.f |
surrogate marker values in the treatment group |
szero.one |
surrogate marker values in the control group |
myt |
time of interest |
weight.pred |
weight used for perturbation resampling |
extrapolate |
TRUE or FALSE; indicates whether local constant extrapolation should be used, default is FALSE |
transform |
TRUE or FALSE; indicates whether a transformation should be used, default is FALSE |
ps.weight |
vector of propensity score weights to be used in the multiple marker setting, default is NULL |
warn.support |
value passed from R.s.estimate function to control warnings; user does not need to specify. |
Details
Details are included in the documentation for R.s.surv.estimate and R.multiple.surv
Value
conditional probability of survival past t for control group
Author(s)
Layla Parast