Title: | Efficient Tuning-Free Conformal Prediction |
Version: | 1.0 |
Description: | An implementation of efficiency first conformal prediction (EFCP) and validity first conformal prediction (VFCP) that demonstrates both validity (coverage guarantee) and efficiency (width guarantee). To learn how to use it, check the vignettes for a quick tutorial. The package is based on the work by Yang Y., Kuchibhotla A.,(2021) <doi:10.48550/arXiv.2104.13871>. |
URL: | https://github.com/Elsa-Yang98/ConformalSmallest |
Imports: | glmnet, mvtnorm, stats, MASS, quantregForest |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
RoxygenNote: | 7.1.1 |
LazyData: | true |
Suggests: | testthat (≥ 3.0.0), knitr, rmarkdown, ggplot2, repr |
Config/testthat/edition: | 3 |
Depends: | R (≥ 3.5.0) |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2021-08-08 15:56:27 UTC; elsayang |
Author: | Yachong Yang [aut, cre] |
Maintainer: | Yachong Yang <yachong@wharton.upenn.edu> |
Repository: | CRAN |
Date/Publication: | 2021-08-09 14:10:06 UTC |
Conditional width and coverage for CQR, internal function used inside conf_CQR_conditional
Description
Conditional width and coverage for CQR, internal function used inside conf_CQR_conditional
Usage
conf_CQR(X1, Y1, X2, Y2, beta, mtry, ntree, alpha = 0.1)
Arguments
X1 |
training matrix to fit the quantile regression forest |
Y1 |
training vector |
X2 |
training matrix to compute the conformal scores |
Y2 |
training vector to compute the conformal scores |
beta |
nominal quantile level |
mtry |
random forest parameter |
ntree |
random forest parameter |
alpha |
miscoverage level |
Value
a function for computing conditional width and coverage
Conditional width and coverage for CQR
Description
Conditional width and coverage for CQR
Usage
conf_CQR_conditional(x, y, beta, mtry, ntree, alpha = 0.1)
Arguments
x |
A N*d training matrix |
y |
A N*1 training vector |
beta |
nominal quantile level |
mtry |
random forest parameter |
ntree |
random forest parameter |
alpha |
miscoverage level |
Value
a function for computing conditional width and coverage
preliminary function for CQR
Description
preliminary function for CQR
Usage
conf_CQR_prelim(X1, Y1, X2, Y2, beta_grid, mtry, ntree, alpha = 0.1)
Arguments
X1 |
A n1*d matrix for training |
Y1 |
A n1*1 vector for training |
X2 |
A n2*d matrix for calibration |
Y2 |
A n2*1 vector for calibration |
beta_grid |
a grid of beta's |
mtry |
mtry parameter in random forest |
ntree |
number of trees parameter in random forest |
alpha |
miscoverage level |
Value
the smallest width and its corresponding beta
EFCP and VFCP for CQR, CQR-m, CQR-r
Description
EFCP and VFCP for CQR, CQR-m, CQR-r
Usage
conf_CQR_reg(
x,
y,
split,
beta_grid,
mtry_grid,
ntree_grid,
method = "efficient",
alpha = 0.1
)
Arguments
x |
A N*d training matrix |
y |
A N*1 training vector |
split |
a vector of length 1 for efcp, length 2 for vfcp |
beta_grid |
a grid of beta's |
mtry_grid |
a grid of mtry |
ntree_grid |
a grid of ntree |
method |
"efficient" for efcp; "valid" for vfcp |
alpha |
miscoverage level |
Value
the selected cqr method
Conditional width and coverage for EFCP, VFCP between CQR, CQR-m, CQR-r
Description
Conditional width and coverage for EFCP, VFCP between CQR, CQR-m, CQR-r
Usage
conf_CQR_reg_conditional(
x,
y,
split,
beta_grid,
mtry_grid,
ntree_grid,
method = "efficient",
alpha = 0.1
)
Arguments
x |
A N*d training matrix |
y |
A N*1 training vector |
split |
a vector of length 1 for efcp, length 2 for vfcp |
beta_grid |
a grid of beta's |
mtry_grid |
a grid of mtry |
ntree_grid |
a grid of ntree |
method |
"efficient" for efcp; "valid" for vfcp |
alpha |
miscoverage level |
Value
the selected cqr method
Cross validation conformal prediction for ridge regression
Description
Cross validation conformal prediction for ridge regression
Usage
cv.fun(X, Y, X0, lambda = seq(0, 100, length = 100), nfolds = 10, alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
lambda |
a sequence of penalty parameters for ridge regression |
nfolds |
number of folds |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0
Efficiency first conformal prediction for ridge regression
Description
Efficiency first conformal prediction for ridge regression
Usage
efcp.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
lambda |
a sequence of penalty parameters for ridge regression |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0.
Examples
df=3
d = 5
n=50 #number of training samples
n0=10 #number of prediction points
rho=0.5
Sigma=matrix(rho,d,d)
diag(Sigma)=rep(1,d)
beta=rep(1:5,d/5)
X0=mvtnorm::rmvt(n0,Sigma,df)
X=mvtnorm::rmvt(n,Sigma,df) #multivariate t distribution
eps=rt(n,df)*(1+sqrt(X[,1]^2+X[,2]^2))
Y=X%*%beta+eps
out.efcp=efcp.fun(X,Y,X0)
out.efcp$up
out.efcp$lo
Efficiency first conformal prediction for Conformal Quantile Regression
Description
Efficiency first conformal prediction for Conformal Quantile Regression
Usage
efcp_cqr(x, y, split, beta_grid, params_grid, alpha = 0.1)
Arguments
x |
A N*d training matrix |
y |
A N*1 training vector |
split |
a number between 0 and 1 |
beta_grid |
a grid of beta's |
params_grid |
a grid of mtry and ntree |
alpha |
miscoverage level |
Value
average prediction width and a function for coverage on some testing points
Efficiency first conformal prediction for ridge regression
Description
Efficiency first conformal prediction for ridge regression
Usage
efcp_ridge(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
lambda |
a sequence of penalty parameters for ridge regression |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0.
Examples
df=3
d = 5
n=50 #number of training samples
n0=10 #number of prediction points
rho=0.5
Sigma=matrix(rho,d,d)
diag(Sigma)=rep(1,d)
beta=rep(1:5,d/5)
X0=mvtnorm::rmvt(n0,Sigma,df)
X=mvtnorm::rmvt(n,Sigma,df) #multivariate t distribution
eps=rt(n,df)*(1+sqrt(X[,1]^2+X[,2]^2))
Y=X%*%beta+eps
out.efcp=efcp.fun(X,Y,X0)
out.efcp$up
out.efcp$lo
Conformal prediction for linear regression
Description
Conformal prediction for linear regression
Usage
ginverse.fun(x, y, x0, alpha = 0.1)
Arguments
x |
A N*d training matrix |
y |
A N*1 training vector |
x0 |
A N0*d testing vector |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0
Internal function used for ginverse.fun
Description
Internal function used for ginverse.fun
Usage
ginverselm.funs(intercept = TRUE, lambda = 0)
Arguments
intercept |
default is TRUE |
lambda |
a vector |
Internal function used for ginverse.fun
Description
Internal function used for ginverse.fun
Usage
my.ginverselm.funs
Format
An object of class list
of length 4.
Conformal prediction for linear regression
Description
Conformal prediction for linear regression
Usage
naive.fun(X, Y, X0, alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0
Outcomes of an example for tuning-free conformalized quantile regression(CQR).
Description
A dataset containing the experiment results used in the vignettes.
Usage
pois_n400_reps100
Format
A list with 10 elements: x_test, n,nrep,width_mat, cov_mat,beta_mat, ntree_mat, cqr_method_mat, evaluations, alpha
- x_test
test points of x
- n
number of training samples
- nrep
number of replications
- width_mat
a data frame with the first column being the width of the prediction regions
- cov_mat
a data frame with the first column being the coverage of the prediction regions
- beta_mat
a data frame with the first column being the beta for CQR used in the final prediction
- ntree_mat
a data frame with the first column being the number of trees for CQR used in the final prediction
- ntree_mat
a data frame with the first column being the CQR method (among CQR, CQR-m, CQR-r)used in the final prediction
- alpha
desired miscoverage level
Source
For details please see the "Example-tuning_free_CQR" vignette:vignette("Example-tuning_free_CQR", package = "ConformalSmallest")
Outcomes of an example for tuning-free conformal prediction with ridge regression.
Description
A dataset containing the experiment results used in the vignettes.
Usage
ridge_linear_cov100_t3
Format
A list with 7 elements: dim_linear_t3,cov.param_linear_fm_t3, cov.naive_linear_fm_t3, cov.vfcp_linear_fm_t3, cov.star_linear_fm_t3, cov.cv5_linear_fm_t3, cov.efcp_linear_fm_t3
- dim
dimensions used in the experiment
- len.param
a matrix with coverages for the prediction regions produced by the parametric method
- len.naive
a matrix with coverages for the prediction regions produced by naive linear regression method
- len.vfcp
na matrix with coverages for the prediction regions produced by VFCP
- len.star
a matrix with coverages for the prediction regions produced by cross validation with the errors
- len.cv5
a matrix with coverages for the prediction regions produced by cross-validation with 5 splits
- len.efcp
a matrix with coverages for the prediction regions produced by efcp
Source
For details please see the "Example-tuning_free_ridge_regression" vignette:vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
Outcomes of an example for tuning-free conformal prediction with ridge regression.
Description
A dataset containing the experiment results used in the vignettes.
Usage
ridge_linear_cov100_t5
Format
A list with 7 elements: dim_linear_t5,cov.param_linear_fm_t5, cov.naive_linear_fm_t5, cov.vfcp_linear_fm_t5, cov.star_linear_fm_t5, cov.cv5_linear_fm_t5, cov.efcp_linear_fm_t5
- dim
dimensions used in the experiment
- cov.param
a matrix with coverages for the prediction regions produced by the parametric method
- cov.naive
a matrix with coverages for the prediction regions produced by naive linear regression method
- cov.vfcp
na matrix with coverages for the prediction regions produced by VFCP
- cov.star
a matrix with coverages for the prediction regions produced by cross validation with the errors
- cov.cv5
a matrix with coverages for the prediction regions produced by cross-validation with 5 splits
- cov.efcp
a matrix with coverages for the prediction regions produced by efcp
Source
For details please see the "Example-tuning_free_ridge_regression" vignette:vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
Outcomes of an example for tuning-free conformal prediction with ridge regression.
Description
A dataset containing the experiment results used in the vignettes.
Usage
ridge_linear_len100_t3
Format
A list with 6 elements: len.param_linear_fm_t3, len.naive_linear_fm_t3, len.vfcp_linear_fm_t3, len.star_linear_fm_t3, len.cv5_linear_fm_t3, len.efcp_linear_fm_t3
- len.param
a matrix with widths for the prediction regions produced by the parametric method
- len.naive
a matrix with widths for the prediction regions produced by naive linear regression method
- len.vfcp
na matrix with widths for the prediction regions produced by VFCP
- len.star
a matrix with widths for the prediction regions produced by cross validation with the errors
- len.cv5
a matrix with widths for the prediction regions produced by cross-validation with 5 splits
- len.efcp
a matrix with widths for the prediction regions produced by efcp
Source
For details please see the "Example-tuning_free_ridge_regression" vignette:vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
Outcomes of an example for tuning-free conformal prediction with ridge regression.
Description
A dataset containing the experiment results used in the vignettes.
Usage
ridge_linear_len100_t5
Format
A list with 6 elements: len.param_linear_fm_t5, len.naive_linear_fm_t5, len.vfcp_linear_fm_t5, len.star_linear_fm_t5, len.cv5_linear_fm_t5, len.efcp_linear_fm_t5
- len.param
a matrix with widths for the prediction regions produced by the parametric method
- len.naive
a matrix with widths for the prediction regions produced by naive linear regression method
- len.vfcp
na matrix with widths for the prediction regions produced by VFCP
- len.star
a matrix with widths for the prediction regions produced by cross validation with the errors
- len.cv5
a matrix with widths for the prediction regions produced by cross-validation with 5 splits
- len.efcp
a matrix with widths for the prediction regions produced by efcp
Source
For details please see the "Example-tuning_free_ridge_regression" vignette:vignette("Example-tuning_free_ridge_regression", package = "ConformalSmallest")
Conformal prediction for ridge regression, tuning parameter by minimizing the mean of the residuals
Description
Conformal prediction for ridge regression, tuning parameter by minimizing the mean of the residuals
Usage
star.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
lambda |
a sequence of penalty parameters for ridge regression |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0
Validity first conformal prediction for ridge regression
Description
Validity first conformal prediction for ridge regression
Usage
vfcp.fun(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
lambda |
a sequence of penalty parameters for ridge regression |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0.
Examples
df=3
d = 5
n=50 #number of training samples
n0=10 #number of prediction points
rho=0.5
Sigma=matrix(rho,d,d)
diag(Sigma)=rep(1,d)
beta=rep(1:5,d/5)
X0=mvtnorm::rmvt(n0,Sigma,df)
X=mvtnorm::rmvt(n,Sigma,df) #multivariate t distribution
eps=rt(n,df)*(1+sqrt(X[,1]^2+X[,2]^2))
Y=X%*%beta+eps
out.vfcp=vfcp.fun(X,Y,X0)
out.vfcp$up
out.vfcp$lo
Validity first conformal prediction for Conformal Quantile Regression
Description
Validity first conformal prediction for Conformal Quantile Regression
Usage
vfcp_cqr(x, y, split, beta_grid, params_grid, alpha = 0.1)
Arguments
x |
A N*d training matrix |
y |
A N*1 training vector |
split |
a number between 0 and 1 |
beta_grid |
a grid of beta's |
params_grid |
a grid of mtry and ntree |
alpha |
miscoverage level |
Value
average prediction width and a function for coverage on some testing points
Validity first conformal prediction for ridge regression
Description
Validity first conformal prediction for ridge regression
Usage
vfcp_ridge(X, Y, X0, lambda = seq(0, 100, length = 100), alpha = 0.1)
Arguments
X |
A N*d training matrix |
Y |
A N*1 training vector |
X0 |
A N0*d testing vector |
lambda |
a sequence of penalty parameters for ridge regression |
alpha |
miscoverage level |
Value
upper and lower prediction intervals for X0.
Examples
df=3
d = 5
n=50 #number of training samples
n0=10 #number of prediction points
rho=0.5
Sigma=matrix(rho,d,d)
diag(Sigma)=rep(1,d)
beta=rep(1:5,d/5)
X0=mvtnorm::rmvt(n0,Sigma,df)
X=mvtnorm::rmvt(n,Sigma,df) #multivariate t distribution
eps=rt(n,df)*(1+sqrt(X[,1]^2+X[,2]^2))
Y=X%*%beta+eps
out.vfcp=vfcp.fun(X,Y,X0)
out.vfcp$up
out.vfcp$lo