| Title: | Partial Transfer Learning for Causal Estimation |
| Version: | 0.1.0 |
| Description: | Implements partial transfer learning (PTL) for causal effect estimation using source and target data, with bootstrap-based source detection. Provides data generating processes and nuisance functions for simulation. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| URL: | https://github.com/JackQuu/PartialTL |
| BugReports: | https://github.com/JackQuu/PartialTL/issues |
| Encoding: | UTF-8 |
| Depends: | R (≥ 4.0.0) |
| Imports: | MASS, DoubleML, ncvreg |
| Suggests: | lgr, mlr3, mlr3learners, glmnet |
| NeedsCompilation: | no |
| Packaged: | 2026-02-15 07:13:22 UTC; QUXINHAO |
| Author: | Xinhao Qu [aut, cre] |
| Maintainer: | Xinhao Qu <xqu018@ucr.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-18 19:10:13 UTC |
Partial Transfer Learning for Causal Estimation and Source Detection
Description
Implements partial transfer learning (PTL) and heterogeneous partial transfer learning (HPTL) for causal effect estimation using source and target data. Uses double machine learning for nuisance estimation and cross-fitting. Provides bootstrap-based source detection to identify which sources are transferable to the target. Includes data generating processes and nuisance functions for simulation.
Details
Main functions:
- fit_PTL: PTL causal estimate (single source).
- fit_HPTL: Heterogeneous PTL with multiple sources and covariate modules.
- boot_detection: Bootstrap-based source detection.
- run_bootstrap_E_hat: Bootstrap for nuisance function estimation.
- DGP: Data generating process for simulations.
- RMSE: Root mean squared error for evaluation.
- f_0, g_0, f_k, g_k: Nuisance functions for target and source models.
Run demo(package = "PartialTL") for demo scripts.
Author(s)
Xinhao Qu <xqu018@ucr.edu>
Data Generating Process
Description
Generates (X, D, Y) for simulations: multivariate normal X with
AR(1)-type covariance, treatment D from a linear confounding model plus
noise, and outcome Y = rho*D + f(X) + error.
Usage
DGP(n, q, p, p.nonzero, rho, Beta, Gamma, mu = rep(10, p), sigma = 0.5,
f_func = f_0, g_func = g_0, seed = NULL)
Arguments
| n | Sample size. |
| q | Causal dimension (kept for interface; unused in current body). |
| p | Nuisance dimension (number of covariates). |
| p.nonzero | Number of non-zero coefficients for Beta and Gamma. |
| rho | Causal effect (scalar). |
| Beta | Nuisance coefficient vector for outcome; |
| Gamma | Nuisance coefficient vector for treatment; |
| mu | Mean vector of X; length |
| sigma | Base for AR(1)-type covariance: |
| f_func | Function |
| g_func | Function |
| seed | Optional random seed. |
Value
A list with components D (treatment), X (design matrix),
Y (outcome), and data (cbind(Y, D, X)).
See Also
f_0, g_0, f_k, g_k,
fit_PTL, RMSE
Examples
set.seed(1)
n <- 50
p <- 10
p.nz <- 3
Beta <- matrix(c(rep(0.5, p.nz), rep(0, p - p.nz)))
Gamma <- matrix(c(rep(0.3, p.nz), rep(0, p - p.nz)))
dat <- DGP(n, q = 1, p, p.nz, rho = 0.5, Beta, Gamma, seed = 1)
str(dat)
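To make the structure of the simulated data concrete, the following base-R sketch mirrors the description above. It is not the package's DGP(): the helper name `sketch_dgp` is hypothetical, the covariance form Sigma[i, j] = sigma^|i - j| is an assumed reading of "AR(1)-type", and the noise terms are assumed to be standard normal.

```r
# Hypothetical helper, not the package's DGP(): simulate (X, D, Y) in base R.
sketch_dgp <- function(n, p, rho, Beta, Gamma, mu = rep(10, p), sigma = 0.5) {
  # Assumed AR(1)-type covariance: Sigma[i, j] = sigma^|i - j|.
  Sigma <- sigma^abs(outer(seq_len(p), seq_len(p), "-"))
  # Draw X ~ N(mu, Sigma) using the Cholesky factor R, where t(R) %*% R = Sigma.
  Z <- matrix(rnorm(n * p), nrow = n)
  X <- sweep(Z %*% chol(Sigma), 2, mu, "+")
  # Treatment: linear confounding in X plus noise (unit variance is an assumption).
  D <- drop(X %*% Gamma) + rnorm(n)
  # Outcome: Y = rho * D + f(X) + error, with a linear f(X) = X %*% Beta.
  Y <- rho * D + drop(X %*% Beta) + rnorm(n)
  list(D = D, X = X, Y = Y, data = cbind(Y, D, X))
}

set.seed(1)
p <- 10
Beta <- c(rep(0.5, 3), rep(0, p - 3))
Gamma <- c(rep(0.3, 3), rep(0, p - 3))
sim <- sketch_dgp(n = 50, p = p, rho = 0.5, Beta = Beta, Gamma = Gamma)
dim(sim$data)
```

The returned list uses the same component names as the Value section (D, X, Y, data), so the sketch can stand in for DGP() when experimenting without the package installed.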
Root Mean Squared Error
Description
Computes the root mean squared error of parameter estimates across
simulations: for each component, \sqrt{\mathrm{mean}((\hat{\theta} - \theta)^2)}.
Usage
RMSE(Theta, Theta.hat)
Arguments
| Theta | True parameter vector; |
| Theta.hat | Matrix of estimates; |
Value
Numeric vector of length p: RMSE per parameter component.
Examples
th <- c(1, 2)
th_hat <- matrix(c(1.1, 2.2, 0.9, 1.8), nrow = 2)
RMSE(th, th_hat)
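The formula in the description can be written directly in base R. The sketch below is not the package's RMSE(); `sketch_rmse` is a hypothetical name, and it assumes Theta.hat stores one parameter component per row and one simulation per column, which is what the example above suggests.

```r
# Hypothetical re-implementation of the per-component RMSE formula.
sketch_rmse <- function(Theta, Theta.hat) {
  # Theta is recycled down each column, so column j holds the errors of simulation j.
  sqrt(rowMeans((Theta.hat - Theta)^2))
}

th <- c(1, 2)
th_hat <- matrix(c(1.1, 2.2, 0.9, 1.8), nrow = 2)
rmse <- sketch_rmse(th, th_hat)
rmse
```

Under that layout the two components have errors of ±0.1 and ±0.2, so the result is approximately 0.1 and 0.2.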
Bootstrap-Based Source Detection
Description
Identifies which sources are transferable to the target by comparing bootstrap estimates of the nuisance function on the target and each source. A source is detected as transferable if the difference is below a threshold proportional to the combined standard error.
Usage
boot_detection(D_t, X_t, Y_t, D_s_all, X_s_all, Y_s_all, source_sizes, B,
ml_f, ml_g)
Arguments
| D_t | Target treatment; |
| X_t | Target design matrix; |
| Y_t | Target outcome; |
| D_s_all | Source treatments concatenated by row; rows split by |
| X_s_all | Source design matrices concatenated by row. |
| Y_s_all | Source outcomes concatenated by row. |
| source_sizes | Integer vector of length K: sample size of each source. |
| B | Number of bootstrap replications. |
| ml_f | Outcome learner for DoubleML. |
| ml_g | Treatment learner for DoubleML. |
Value
A data frame with columns Source (label) and detected.source
(list of logical vectors of length B per source).
See Also
run_bootstrap_E_hat, fit_PTL, fit_HPTL
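The detection rule described above can be illustrated with a simplified comparison of summary statistics. This is only a sketch under stated assumptions: the helper `detect_source`, the constant `c_thresh = 1.96`, and the use of the square root of the summed variances as the "combined standard error" are all hypothetical choices, and the package's actual rule (which returns one logical value per bootstrap replication) may differ.

```r
# Hypothetical rule: flag a source as transferable when its estimate is close
# to the target estimate relative to the combined standard error.
detect_source <- function(E_t, var_t, E_s, var_s, c_thresh = 1.96) {
  abs(E_t - E_s) < c_thresh * sqrt(var_t + var_s)
}

# One target and two candidate sources (made-up numbers for illustration).
detected <- detect_source(E_t = 1.00, var_t = 0.01,
                          E_s = c(1.05, 2.00), var_s = c(0.01, 0.01))
detected
```

With these made-up inputs the threshold is about 0.28, so the first source (difference 0.05) is flagged and the second (difference 1.00) is not.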
Outcome Nuisance Function (Target Model)
Description
Linear outcome nuisance function for the target model: f_0(X, \beta) = X \beta.
Usage
f_0(X, Beta)
Arguments
| X | Design matrix; |
| Beta | Coefficient vector (column matrix); |
Value
Numeric vector of outcome nuisance values; length n.
Outcome Nuisance Function (Source Model)
Description
Linear outcome nuisance function for source models: f_k(X, \beta) = X \beta.
Same form as f_0; used in DGP for source data.
Usage
f_k(X, Beta)
Arguments
| X | Design matrix; |
| Beta | Coefficient vector (column matrix); |
Value
Numeric vector of outcome nuisance values; length n.
HPTL (Heterogeneous Partial Transfer Learning) Fit
Description
Fits heterogeneous partial transfer learning with multiple sources and covariate modules. Each source has its own module of covariates; PTL is applied per source and results are combined for the target causal estimate.
Usage
fit_HPTL(D_t, X_t, Y_t, D_s_all, X_s_all, Y_s_all, source_sizes, module_sizes,
ml_f, ml_g, fold = 5)
Arguments
| D_t | Target treatment; |
| X_t | Target design matrix; |
| Y_t | Target outcome; |
| D_s_all | Source treatments concatenated by row; dimension is (sum of |
| X_s_all | Source design matrices concatenated by row; rows split by |
| Y_s_all | Source outcomes concatenated by row; rows split by |
| source_sizes | Integer vector of length K: sample size of each source. Must sum to |
| module_sizes | Integer vector of length K: covariate module sizes. The k-th source uses columns |
| ml_f | Outcome learner for DoubleML. |
| ml_g | Treatment learner for DoubleML. |
| fold | Number of folds for cross-fitting (default 5). |
Value
A list with component hat_rho_HPTL: HPTL causal estimate on target.
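Because the source data are passed in row-concatenated form, it can help to see how source_sizes maps rows back to individual sources. The base-R sketch below shows one way to do that split; it is an illustration rather than the package's internal code, the toy matrix is made up, and the column split implied by module_sizes is deliberately not shown.

```r
# Toy stacked source design matrix: 3 + 2 + 4 rows from K = 3 sources.
source_sizes <- c(3L, 2L, 4L)
X_s_all <- matrix(seq_len(sum(source_sizes) * 2), ncol = 2)

# Label each row with the index of the source it belongs to.
source_id <- rep(seq_along(source_sizes), times = source_sizes)

# Recover one design matrix per source, keeping the matrix shape.
X_by_source <- lapply(split(seq_len(nrow(X_s_all)), source_id),
                      function(idx) X_s_all[idx, , drop = FALSE])
rows_per_source <- vapply(X_by_source, nrow, integer(1))
rows_per_source
```

The same `source_id` vector can be reused to split D_s_all and Y_s_all, which keeps the three objects aligned.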
PTL (Partial Transfer Learning) Fit
Description
Fits partial transfer learning with cross-fitting and outcome reconstruction. Uses double machine learning on the source and SCAD on the outcome nuisance, then transfers to the target via the partial transfer formula.
Usage
fit_PTL(D_t, X_t, Y_t, D_s, X_s, Y_s, ml_f, ml_g, fold = 5L)
Arguments
| D_t | Target treatment; |
| X_t | Target design matrix; |
| Y_t | Target outcome; |
| D_s | Source treatment; |
| X_s | Source design matrix; |
| Y_s | Source outcome; |
| ml_f | Outcome learner for DoubleML (e.g. |
| ml_g | Treatment learner for DoubleML (same type). |
| fold | Number of folds for cross-fitting (default 5). |
Value
A list with components:
| hat_rho_s | Source causal estimate. |
| beta_hat_s | Source nuisance coefficient estimate. |
| E_s | Estimated |
| hat_rho_PTL | PTL causal estimate on target. |
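The cross-fitting idea behind the fold argument can be sketched in base R with a partialling-out estimator for a partially linear model. This is not fit_PTL and does not call DoubleML: ordinary lm() fits stand in for the ml_f and ml_g learners, the simulated data are made up, and the SCAD and transfer steps are omitted.

```r
set.seed(1)
n <- 500; p <- 5; rho <- 0.5
X <- matrix(rnorm(n * p), nrow = n)
D <- drop(X %*% rep(0.3, p)) + rnorm(n)              # treatment equation
Y <- rho * D + drop(X %*% rep(0.5, p)) + rnorm(n)    # outcome equation
dat <- data.frame(Y = Y, D = D, X)

fold <- 5L
fold_id <- sample(rep(seq_len(fold), length.out = n))  # random fold labels
res_Y <- res_D <- numeric(n)

for (k in seq_len(fold)) {
  test <- fold_id == k
  # Fit the nuisance regressions E[Y | X] and E[D | X] on the other folds only.
  fit_Y <- lm(Y ~ . - D, data = dat[!test, ])
  fit_D <- lm(D ~ . - Y, data = dat[!test, ])
  # Out-of-fold residuals for the held-out observations.
  res_Y[test] <- dat$Y[test] - predict(fit_Y, newdata = dat[test, ])
  res_D[test] <- dat$D[test] - predict(fit_D, newdata = dat[test, ])
}

# Residual-on-residual regression gives the causal effect estimate.
rho_hat <- sum(res_D * res_Y) / sum(res_D^2)
rho_hat
```

Each observation's residuals come from models that never saw that observation, which is the point of cross-fitting; with the true effect set to 0.5, the estimate should land close to that value.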
Treatment Nuisance Function (Target Model)
Description
Linear treatment/confounding function for the target model: g_0(X, \gamma) = X \gamma.
Usage
g_0(X, Gamma)
Arguments
| X | Design matrix; |
| Gamma | Coefficient vector (column matrix); |
Value
Numeric vector of treatment equation values; length n.
Treatment Nuisance Function (Source Model)
Description
Linear treatment/confounding function for source models: g_k(X, \gamma) = X \gamma.
Same form as g_0; used in DGP for source data.
Usage
g_k(X, Gamma)
Arguments
| X | Design matrix; |
| Gamma | Coefficient vector (column matrix); |
Value
Numeric vector of treatment equation values; length n.
Bootstrap for Nuisance Function Estimation
Description
Runs bootstrap replications to estimate the nuisance quantity
E[Y - D \theta] using double machine learning, and returns the
bootstrap estimates together with their mean and variance.
Usage
run_bootstrap_E_hat(D, X, Y, B, ml_f, ml_g)
Arguments
| D | Treatment variable(s); matrix. |
| X | Design matrix. |
| Y | Outcome variable. |
| B | Number of bootstrap replications. |
| ml_f | Outcome learner for DoubleML. |
| ml_g | Treatment learner for DoubleML. |
Value
A list with components:
| E_hat | Numeric vector of length B: bootstrap estimates. |
| E_hat_mean | Mean of |
| E_hat_var | Variance of |
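The shape of the returned list can be illustrated with a base-R bootstrap. The sketch below is not run_bootstrap_E_hat: the helper name `sketch_bootstrap_E_hat` is hypothetical, lm() residuals replace the DoubleML learners, cross-fitting is skipped for brevity, and the simulated data are made up.

```r
# Hypothetical stand-in: bootstrap the sample analogue of E[Y - D * theta].
sketch_bootstrap_E_hat <- function(D, X, Y, B) {
  E_hat <- vapply(seq_len(B), function(b) {
    idx <- sample.int(length(Y), replace = TRUE)   # resample rows with replacement
    Db <- D[idx]; Xb <- X[idx, , drop = FALSE]; Yb <- Y[idx]
    # Partial X out of D and Y, then regress residual on residual for theta.
    rD <- resid(lm(Db ~ Xb))
    rY <- resid(lm(Yb ~ Xb))
    theta <- sum(rD * rY) / sum(rD^2)
    mean(Yb - Db * theta)
  }, numeric(1))
  list(E_hat = E_hat, E_hat_mean = mean(E_hat), E_hat_var = var(E_hat))
}

set.seed(2)
n <- 200
X <- matrix(rnorm(n * 3), nrow = n)
D <- drop(X %*% c(0.3, 0.3, 0)) + rnorm(n)
Y <- 0.5 * D + drop(X %*% c(0.5, 0.5, 0)) + rnorm(n)
out <- sketch_bootstrap_E_hat(D, X, Y, B = 50)
str(out)
```

The component names match the Value section, so the mean and variance can be passed on to a detection rule in the same way as the package's output.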