Type: | Package |
Title: | Bayesian Sparse Estimation of a Covariance Matrix |
Version: | 1.0.2 |
Date: | 2025-07-02 |
Maintainer: | Kyeongwon Lee <kwlee1718@gmail.com> |
Author: | Kwangmin Lee [aut], Kyeongwon Lee [aut, cre], Kyoungjae Lee [aut], Seongil Jo [aut], Jaeyong Lee [ctb] |
Depends: | R (≥ 4.2) |
Description: | Bayesian estimations of a covariance matrix for multivariate normal data. Assumes that the covariance matrix is sparse or band matrix and positive-definite. Methods implemented include the beta-mixture shrinkage prior (Lee et al. (2022) <doi:10.1016/j.jmva.2022.105067>), screened beta-mixture prior (Lee et al. (2024) <doi:10.1214/24-BA1495>), and post-processed posteriors for banded and sparse covariances (Lee et al. (2023) <doi:10.1214/22-BA1333>; Lee and Lee (2023) <doi:10.1016/j.jeconom.2023.105475>). This software has been developed using funding supported by Basic Science Research Program through the National Research Foundation of Korea ('NRF') funded by the Ministry of Education ('RS-2023-00211979', 'NRF-2022R1A5A7033499', 'NRF-2020R1A4A1018207' and 'NRF-2020R1C1C1A01013338'). |
Imports: | GIGrvg, coda, progress, BayesFactor, MASS, mvnfast, matrixcalc, matrixStats, purrr, dplyr, RSpectra, Matrix, plyr, CholWishart, magrittr, future, furrr, ks, ggplot2, ggmcmc, caret, FinCovRegularization, mvtnorm, stats |
Suggests: | hdbinseg, POET, tidyquant, tidyr, timetk, quantmod |
License: | GPL-2 |
LazyLoad: | yes |
URL: | https://github.com/statjs/bspcov |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-07-02 19:51:32 UTC; kwlee |
LazyData: | true |
LazyDataCompression: | xz |
Repository: | CRAN |
Date/Publication: | 2025-07-02 22:00:07 UTC |
SP500 dataset
Description
The S&P 500 dataset from State Street Global Advisors with the collection period from Jan 2013 to Nov 2023.
Format
'list'
Source
State Street Global Advisors.
Examples
data("SP500")
names(SP500)
Bayesian Estimation of a Banded Covariance Matrix
Description
Provides a post-processed posterior for Bayesian inference of a banded covariance matrix.
Usage
bandPPP(X, k, eps, prior = list(), nsample = 2000)
Arguments
X |
a n |
k |
a scalar value (natural number) specifying the bandwidth of covariance matrix. |
eps |
a small positive number decreasing to |
prior |
a list giving the prior information.
The list includes the following parameters (with default values in parentheses):
|
nsample |
a scalar value giving the number of the post-processed posterior samples. |
Details
Lee, Lee, and Lee (2023+) proposed a two-step procedure generating samples from the post-processed posterior for Bayesian inference of a banded covariance matrix:
Initial posterior computing step: Generate random samples from the following initial posterior obtained by using the inverse-Wishart prior
IW_p(B_0, \nu_0)
\Sigma \mid X_1, \ldots, X_n \sim IW_p(B_0 + nS_n, \nu_0 + n),
where
S_n = n^{-1}\sum_{i=1}^{n}X_iX_i^\top
.Post-processing step: Post-process the samples generated from the initial samples
\Sigma_{(i)} := \left\{\begin{array}{ll}B_{k}(\Sigma^{(i)}) + \left[\epsilon_n - \lambda_{\min}\{B_{k}(\Sigma^{(i)})\}\right]I_p, & \mbox{ if } \lambda_{\min}\{B_{k}(\Sigma^{(i)})\} < \epsilon_n, \\ B_{k}(\Sigma^{(i)}), & \mbox{ otherwise }, \end{array}\right.
where \Sigma^{(1)}, \ldots, \Sigma^{(N)}
are the initial posterior samples,
\epsilon_n
is a small positive number decreasing to 0
as n \rightarrow \infty
,
and B_k(B)
denotes the k
-band operation given as
B_{k}(B) = \{b_{ij}I(|i - j| \le k)\} \mbox{ for any } B = (b_{ij}) \in R^{p \times p}.
For more details, see Lee, Lee and Lee (2023+).
Value
Sigma |
a nsample |
p |
dimension of covariance matrix. |
Author(s)
Kwangmin Lee
References
Lee, K., Lee, K., and Lee, J. (2023+), "Post-processes posteriors for banded covariances", Bayesian Analysis, DOI: 10.1214/22-BA1333.
See Also
cv.bandPPP estimate
Examples
n <- 25
p <- 50
Sigma0 <- diag(1, p)
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma0)
res <- bspcov::bandPPP(X,2,0.01,nsample=100)
Bayesian Sparse Covariance Estimation
Description
Provides a Bayesian sparse and positive definite estimate of a covariance matrix via the beta-mixture shrinkage prior.
Usage
bmspcov(X, Sigma, prior = list(), nsample = list())
Arguments
X |
a n |
Sigma |
an initial guess for Sigma. |
prior |
a list giving the prior information.
The list includes the following parameters (with default values in parentheses):
|
nsample |
a list giving the MCMC parameters.
The list includes the following integers (with default values in parentheses):
|
Details
Lee, Jo and Lee (2022) proposed the beta-mixture shrinkage prior for estimating a sparse and positive definite covariance matrix.
The beta-mixture shrinkage prior for \Sigma = (\sigma_{jk})
is defined as
\pi(\Sigma) = \frac{\pi^{u}(\Sigma)I(\Sigma \in C_p)}{\pi^{u}(\Sigma \in C_p)}, ~ C_p = \{\mbox{ all } p \times p \mbox{ positive definite matrices }\},
where \pi^{u}(\cdot)
is the unconstrained prior given by
\pi^{u}(\sigma_{jk} \mid \rho_{jk}) = N\left(\sigma_{jk} \mid 0, \frac{\rho_{jk}}{1 - \rho_{jk}}\tau_1^2\right)
\pi^{u}(\rho_{jk}) = Beta(\rho_{jk} \mid a, b), ~ \rho_{jk} = 1 - 1/(1 + \phi_{jk})
\pi^{u}(\sigma_{jj}) = Exp(\sigma_{jj} \mid \lambda).
For more details, see Lee, Jo and Lee (2022).
Value
Sigma |
a nmc |
Phi |
a nmc |
p |
dimension of covariance matrix. |
Author(s)
Kyoungjae Lee and Seongil Jo
References
Lee, K., Jo, S., and Lee, J. (2022), "The beta-mixture shrinkage prior for sparse covariances with near-minimax posterior convergence rate", Journal of Multivariate Analysis, 192, 105067.
See Also
sbmspcov
Examples
set.seed(1)
n <- 20
p <- 5
# generate a sparse covariance matrix:
True.Sigma <- matrix(0, nrow = p, ncol = p)
diag(True.Sigma) <- 1
Values <- -runif(n = p*(p-1)/2, min = 0.2, max = 0.8)
nonzeroIND <- which(rbinom(n=p*(p-1)/2,1,prob=1/p)==1)
zeroIND = (1:(p*(p-1)/2))[-nonzeroIND]
Values[zeroIND] <- 0
True.Sigma[lower.tri(True.Sigma)] <- Values
True.Sigma[upper.tri(True.Sigma)] <- t(True.Sigma)[upper.tri(True.Sigma)]
if(min(eigen(True.Sigma)$values) <= 0){
delta <- -min(eigen(True.Sigma)$values) + 1.0e-5
True.Sigma <- True.Sigma + delta*diag(p)
}
# generate a data
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = True.Sigma)
# compute sparse, positive covariance estimator:
fout <- bspcov::bmspcov(X = X, Sigma = diag(diag(cov(X))))
post.est.m <- bspcov::estimate(fout)
sqrt(mean((post.est.m - True.Sigma)^2))
sqrt(mean((cov(X) - True.Sigma)^2))
colon dataset
Description
The colon cancer dataset, which includes gene expression values from 22 colon tumor tissues and 40 non-tumor tissues.
Format
'data.frame'
Source
http://genomics-pubs.princeton.edu/oncology/affydata/.
Examples
data("colon")
head(colon)
CV for Bayesian Estimation of a Banded Covariance Matrix
Description
Performs leave-one-out cross-validation (LOOCV) to calculate the predictive log-likelihood for a post-processed posterior of a banded covariance matrix and selects the optimal parameters.
Usage
cv.bandPPP(X, kvec, epsvec, prior = list(), nsample = 2000, ncores = 2)
Arguments
X |
a n |
kvec |
a vector of natural numbers specifying the bandwidth of covariance matrix. |
epsvec |
a vector of small positive numbers decreasing to |
prior |
a list giving the prior information.
The list includes the following parameters (with default values in parentheses):
|
nsample |
a scalar value giving the number of the post-processed posterior samples. |
ncores |
a scalar value giving the number of CPU cores. |
Details
The predictive log-likelihood for each k
and \epsilon_n
is estimated as follows:
\sum_{i=1}^n \log S^{-1} \sum_{s=1}^S p(X_i \mid B_k^{(\epsilon_n)}(\Sigma_{i,s})),
where X_i
is the ith observation, \Sigma_{i,s}
is the sth posterior sample based on (X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n)
, and B_k^{(\epsilon_n)}
represents the banding post-processing function.
For more details, see (3) in Lee, Lee and Lee (2023+).
Value
elpd |
a M |
Author(s)
Kwangmin Lee
References
Lee, K., Lee, K., and Lee, J. (2023+), "Post-processes posteriors for banded covariances", Bayesian Analysis, DOI: 10.1214/22-BA1333.
Gelman, A., Hwang, J., and Vehtari, A. (2014). "Understanding predictive information criteria for Bayesian models." Statistics and computing, 24(6), 997-1016.
See Also
bandPPP
Examples
Sigma0 <- diag(1,50)
X <- mvtnorm::rmvnorm(25,sigma = Sigma0)
kvec <- 1:2
epsvec <- c(0.01,0.05)
res <- bspcov::cv.bandPPP(X,kvec,epsvec,nsample=10,ncores=4)
plot(res)
CV for Bayesian Estimation of a Sparse Covariance Matrix
Description
Performs cross-validation to estimate spectral norm error for a post-processed posterior of a sparse covariance matrix.
Usage
cv.thresPPP(
X,
thresvec,
epsvec,
prior = NULL,
thresfun = "hard",
nsample = 2000,
ncores = 2
)
Arguments
X |
a n |
thresvec |
a vector of real numbers specifying the parameter of the threshold function. |
epsvec |
a vector of small positive numbers decreasing to |
prior |
a list giving the prior information.
The list includes the following parameters (with default values in parentheses):
|
thresfun |
a string to specify the type of threshold function. |
nsample |
a scalar value giving the number of the post-processed posterior samples. |
ncores |
a scalar value giving the number of CPU cores. |
Details
Given a set of train data and validation data, the spectral norm error for each \gamma
and \epsilon_n
is estimated as follows:
||\hat{\Sigma}(\gamma,\epsilon_n)^{(train)} - S^{(val)}||_2
where \hat{\Sigma}(\gamma,\epsilon_n)^{(train)}
is the estimate for the covariance based on the train data and S^{(val)}
is the sample covariance matrix based on the validation data.
The spectral norm error is estimated by the 10
-fold cross-validation.
For more details, see the first paragraph on page 9 in Lee and Lee (2023).
Value
CVdf |
a M |
Author(s)
Kwangmin Lee
References
Lee, K. and Lee, J. (2023), "Post-processes posteriors for sparse covariances", Journal of Econometrics, 236(3), 105475.
See Also
thresPPP
Examples
Sigma0 <- diag(1,50)
X <- mvtnorm::rmvnorm(25,sigma = Sigma0)
thresvec <- c(0.01,0.1)
epsvec <- c(0.01,0.1)
res <- bspcov::cv.thresPPP(X,thresvec,epsvec,nsample=100)
plot(res)
Point-estimate of posterior distribution
Description
Compute the point estimate (mean) to describe posterior distribution.
Usage
estimate(object, ...)
## S3 method for class 'bspcov'
estimate(object, ...)
Arguments
object |
an object from bandPPP, bmspcov, sbmspcov, and thresPPP. |
... |
additional arguments for estimate. |
Value
Sigma |
the point estimate (mean) of covariance matrix. |
Author(s)
Seongil Jo
See Also
plot.postmean.bspcov
Examples
n <- 25
p <- 50
Sigma0 <- diag(1, p)
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma0)
res <- bspcov::bandPPP(X,2,0.01,nsample=100)
est <- bspcov::estimate(res)
Plot Diagnostics of Posterior Samples and Cross-Validation
Description
Provides a trace plot of posterior samples and a plot of a learning curve for cross-validation
Usage
## S3 method for class 'bspcov'
plot(x, ..., cols, rows)
Arguments
x |
an object from bmspcov, sbmspcov, cv.bandPPP, and cv.thresPPP. |
... |
additional arguments for ggplot2. |
cols |
a scalar or a vector including specific column indices for the trace plot. |
rows |
a scalar or a vector including specific row indices greater than or equal to columns indices for the trace plot. |
Value
plot |
a plot for diagnostics of posterior samples x. |
Author(s)
Seongil Jo
Examples
set.seed(1)
n <- 100
p <- 20
# generate a sparse covariance matrix:
True.Sigma <- matrix(0, nrow = p, ncol = p)
diag(True.Sigma) <- 1
Values <- -runif(n = p*(p-1)/2, min = 0.2, max = 0.8)
nonzeroIND <- which(rbinom(n=p*(p-1)/2,1,prob=1/p)==1)
zeroIND = (1:(p*(p-1)/2))[-nonzeroIND]
Values[zeroIND] <- 0
True.Sigma[lower.tri(True.Sigma)] <- Values
True.Sigma[upper.tri(True.Sigma)] <- t(True.Sigma)[upper.tri(True.Sigma)]
if(min(eigen(True.Sigma)$values) <= 0){
delta <- -min(eigen(True.Sigma)$values) + 1.0e-5
True.Sigma <- True.Sigma + delta*diag(p)
}
# generate a data
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = True.Sigma)
# compute sparse, positive covariance estimator:
fout <- bspcov::sbmspcov(X = X, Sigma = diag(diag(cov(X))))
plot(fout, cols = c(1, 3, 4), rows = c(1, 3, 4))
plot(fout, cols = 1, rows = 1:3)
# Cross-Validation for Banded Structure
Sigma0 <- diag(1,50)
X <- mvtnorm::rmvnorm(25,sigma = Sigma0)
kvec <- 1:2
epsvec <- c(0.01,0.05)
res <- bspcov::cv.bandPPP(X,kvec,epsvec,nsample=10,ncores=4)
plot(res)
Draw a Heat Map for Point Estimate of Covariance Matrix
Description
Provides a heat map for posterior mean estimate of sparse covariance matrix
Usage
## S3 method for class 'postmean.bspcov'
plot(x, ...)
Arguments
x |
an object from estimate. |
... |
additional arguments for ggplot2. |
Value
plot |
a heatmap for point estimate of covariance matrix x. |
Author(s)
Seongil Jo
See Also
estimate
Examples
n <- 25
p <- 50
Sigma0 <- diag(1, p)
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma0)
res <- bspcov::thresPPP(X, eps=0.01, thres=list(value=0.5,fun='hard'), nsample=100)
est <- bspcov::estimate(res)
plot(est)
Preprocess SP500 data
Description
The proc_SP500
function preprocesses the SP500 stock data by calculating monthly
returns for selected sectors and generating idiosyncratic errors.
Usage
proc_SP500(SP500data, sectors)
Arguments
SP500data |
A data frame containing SP500 stock data with columns including:
|
sectors |
A character vector specifying the sectors to include in the analysis. |
Details
Calculates monthly returns for each stock in the specified sectors
Estimates the number of factors via
hdbinseg::get.factor.model(ic="ah")
Uses
POET::POET()
to extract factor loadings/factors and form idiosyncratic errors
Value
A list with components:
Uhat |
A matrix of idiosyncratic errors. |
Khat |
Estimated number of factors. |
factorparthat |
Estimated factor returns. |
sectornames |
Sector for each column of |
Examples
data("SP500")
set.seed(1234)
sectors <- c("Energy", "Financials", "Materials",
"Real Estate", "Utilities", "Information Technology")
Uhat <- proc_SP500(SP500, sectors)$Uhat
PPPres <- thresPPP(Uhat, eps = 0, thres = list(value = 0.0020, fun = "hard"), nsample = 100)
postmean <- estimate(PPPres)
diag(postmean) <- 0 # hide color for diagonal
plot(postmean)
Preprocess Colon Gene Expression Data
Description
The proc_colon
function preprocesses colon gene expression data by:
Log transforming the raw counts.
Performing two-sample t-tests for each gene between normal and tumor samples.
Selecting the top 50 genes by absolute t-statistic.
Returning the filtered expression matrix and sample indices/groups.
Usage
proc_colon(colon, tissues)
Arguments
colon |
A numeric matrix of raw colon gene expression values (genes × samples). Rows are genes; columns are samples. |
tissues |
A numeric vector indicating tissue type per sample: positive for normal, negative for tumor. |
Value
A list with components:
- X
A numeric matrix (samples x 50 genes) of selected, log‐transformed expression values.
- normal_idx
Integer indices of normal‐tissue columns in the original data.
- tumor_idx
Integer indices of tumor‐tissue columns in the original data.
- group
Integer vector of length
ncol(colon)
, with 1 = normal, 2 = tumor.
Examples
data("colon")
data("tissues")
set.seed(1234)
colon_data <- proc_colon(colon, tissues)
X <- colon_data$X
foo <- bmspcov(X, Sigma = cov(X))
sigmah <- estimate(foo)
Bayesian Sparse Covariance Estimation using Sure Screening
Description
Provides a Bayesian sparse and positive definite estimate of a covariance matrix via screened beta-mixture prior.
Usage
sbmspcov(X, Sigma, cutoff = list(), prior = list(), nsample = list())
Arguments
X |
a n |
Sigma |
an initial guess for Sigma. |
cutoff |
a list giving the information for the threshold.
The list includes the following parameters (with default values in parentheses):
|
prior |
a list giving the prior information.
The list includes the following parameters (with default values in parentheses):
|
nsample |
a list giving the MCMC parameters.
The list includes the following integers (with default values in parentheses):
|
Details
Lee, Jo, Lee, and Lee (2023+) proposed the screened beta-mixture shrinkage prior for estimating a sparse and positive definite covariance matrix.
The screened beta-mixture shrinkage prior for \Sigma = (\sigma_{jk})
is defined as
\pi(\Sigma) = \frac{\pi^{u}(\Sigma)I(\Sigma \in C_p)}{\pi^{u}(\Sigma \in C_p)}, ~ C_p = \{\mbox{ all } p \times p \mbox{ positive definite matrices }\},
where \pi^{u}(\cdot)
is the unconstrained prior given by
\pi^{u}(\sigma_{jk} \mid \psi_{jk}) = N\left(\sigma_{jk} \mid 0, \frac{\psi_{jk}}{1 - \psi_{jk}}\tau_1^2\right), ~ \psi_{jk} = 1 - 1/(1 + \phi_{jk})
\pi^{u}(\psi_{jk}) = Beta(\psi_{jk} \mid a, b) ~ \mbox{for } (j, k) \in S_r(\hat{R})
\pi^{u}(\sigma_{jj}) = Exp(\sigma_{jj} \mid \lambda),
where S_r(\hat{R}) = \{(j,k) : 1 < j < k \le p, |\hat{\rho}_{jk}| > r\}
, \hat{\rho}_{jk}
are sample correlations, and r
is a threshold given by user.
For more details, see Lee, Jo, Lee and Lee (2022+).
Value
Sigma |
a nmc |
p |
dimension of covariance matrix. |
Phi |
a nmc |
INDzero |
a list including indices of off-diagonal elements screened by sure screening. |
cutoff |
the cutoff value specified by FNR-approach. |
Author(s)
Kyoungjae Lee and Seongil Jo
References
Lee, K., Jo, S., Lee, K., and Lee, J. (2023+), "Scalable and optimal Bayesian inference for sparse covariance matrices via screened beta-mixture prior", arXiv:2206.12773.
See Also
bmspcov
Examples
set.seed(1)
n <- 20
p <- 5
# generate a sparse covariance matrix:
True.Sigma <- matrix(0, nrow = p, ncol = p)
diag(True.Sigma) <- 1
Values <- -runif(n = p*(p-1)/2, min = 0.2, max = 0.8)
nonzeroIND <- which(rbinom(n=p*(p-1)/2,1,prob=1/p)==1)
zeroIND = (1:(p*(p-1)/2))[-nonzeroIND]
Values[zeroIND] <- 0
True.Sigma[lower.tri(True.Sigma)] <- Values
True.Sigma[upper.tri(True.Sigma)] <- t(True.Sigma)[upper.tri(True.Sigma)]
if(min(eigen(True.Sigma)$values) <= 0){
delta <- -min(eigen(True.Sigma)$values) + 1.0e-5
True.Sigma <- True.Sigma + delta*diag(p)
}
# generate a data
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = True.Sigma)
# compute sparse, positive covariance estimator:
fout <- bspcov::sbmspcov(X = X, Sigma = diag(diag(cov(X))))
post.est.m <- bspcov::estimate(fout)
sqrt(mean((post.est.m - True.Sigma)^2))
sqrt(mean((cov(X) - True.Sigma)^2))
Summary of Posterior Distribution
Description
Provides the summary statistics for posterior samples of covariance matrix.
Usage
## S3 method for class 'bspcov'
summary(object, cols, rows, ...)
Arguments
object |
an object from bandPPP, bmspcov, sbmspcov, and thresPPP. |
cols |
a scalar or a vector including specific column indices. |
rows |
a scalar or a vector including specific row indices greater than or equal to columns indices. |
... |
additional arguments for the summary function. |
Value
summary |
a table of summary statistics including empirical mean, standard deviation, and quantiles for posterior samples |
Note
If both cols
and rows
are vectors, they must have the same length.
Author(s)
Seongil Jo
Examples
set.seed(1)
n <- 20
p <- 5
# generate a sparse covariance matrix:
True.Sigma <- matrix(0, nrow = p, ncol = p)
diag(True.Sigma) <- 1
Values <- -runif(n = p*(p-1)/2, min = 0.2, max = 0.8)
nonzeroIND <- which(rbinom(n=p*(p-1)/2,1,prob=1/p)==1)
zeroIND = (1:(p*(p-1)/2))[-nonzeroIND]
Values[zeroIND] <- 0
True.Sigma[lower.tri(True.Sigma)] <- Values
True.Sigma[upper.tri(True.Sigma)] <- t(True.Sigma)[upper.tri(True.Sigma)]
if(min(eigen(True.Sigma)$values) <= 0){
delta <- -min(eigen(True.Sigma)$values) + 1.0e-5
True.Sigma <- True.Sigma + delta*diag(p)
}
# generate a data
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = True.Sigma)
# compute sparse, positive covariance estimator:
fout <- bspcov::sbmspcov(X = X, Sigma = diag(diag(cov(X))))
summary(fout, cols = c(1, 3, 4), rows = c(1, 3, 4))
summary(fout, cols = 1, rows = 1:p)
Bayesian Estimation of a Sparse Covariance Matrix
Description
Provides a post-processed posterior (PPP) for Bayesian inference of a sparse covariance matrix.
Usage
thresPPP(X, eps, thres = list(), prior = list(), nsample = 2000)
Arguments
X |
a n |
eps |
a small positive number decreasing to |
thres |
a list giving the information for thresholding PPP procedure.
The list includes the following parameters (with default values in parentheses):
|
prior |
a list giving the prior information.
The list includes the following parameters (with default values in parentheses):
|
nsample |
a scalar value giving the number of the post-processed posterior samples. |
Details
Lee and Lee (2023) proposed a two-step procedure generating samples from the post-processed posterior for Bayesian inference of a sparse covariance matrix:
Initial posterior computing step: Generate random samples from the following initial posterior obtained by using the inverse-Wishart prior
IW_p(B_0, \nu_0)
\Sigma \mid X_1, \ldots, X_n \sim IW_p(B_0 + nS_n, \nu_0 + n),
where
S_n = n^{-1}\sum_{i=1}^{n}X_iX_i^\top
.Post-processing step: Post-process the samples generated from the initial samples
\Sigma_{(i)} := \left\{\begin{array}{ll}H_{\gamma_n}(\Sigma^{(i)}) + \left[\epsilon_n - \lambda_{\min}\{H_{\gamma_n}(\Sigma^{(i)})\}\right]I_p, & \mbox{ if } \lambda_{\min}\{H_{\gamma_n}(\Sigma^{(i)})\} < \epsilon_n, \\ H_{\gamma_n}(\Sigma^{(i)}), & \mbox{ otherwise }, \end{array}\right.
where \Sigma^{(1)}, \ldots, \Sigma^{(N)}
are the initial posterior samples,
\epsilon_n
is a positive real number, and H_{\gamma_n}(\Sigma)
denotes the generalized threshodling operator given as
(H_{\gamma_n}(\Sigma))_{ij} = \left\{\begin{array}{ll}\sigma_{ij}, & \mbox{ if } i = j, \\
h_{\gamma_n}(\sigma_{ij}), & \mbox{ if } i \neq j, \end{array}\right.
where \sigma_{ij}
is the (i,j)
element of \Sigma
and h_{\gamma_n}(\cdot)
is a generalized thresholding function.
For more details, see Lee and Lee (2023).
Value
Sigma |
a nsample |
p |
dimension of covariance matrix. |
Author(s)
Kwangmin Lee
References
Lee, K. and Lee, J. (2023), "Post-processes posteriors for sparse covariances", Journal of Econometrics.
See Also
cv.thresPPP
Examples
n <- 25
p <- 50
Sigma0 <- diag(1, p)
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma0)
res <- bspcov::thresPPP(X, eps=0.01, thres=list(value=0.5,fun='hard'), nsample=100)
est <- bspcov::estimate(res)
tissues dataset
Description
The tissues data of colon cancer dataset, which includes gene expression values from 22 colon tumor tissues and 40 non-tumor tissues.
Format
'numeric'
Source
http://genomics-pubs.princeton.edu/oncology/affydata/.
Examples
data("tissues")
head(tissues)