Type: | Package |
Title: | Rank and Factor Loadings Estimation in Time Series Tensor Factor Models |
Version: | 1.1.0 |
Author: | Weilin Chen [aut, cre] |
Description: | A set of functions to estimate the rank and factor loadings of time series tensor factor models. A tensor is a multidimensional array. For the analysis of high-dimensional tensor time series, the factor model is a major dimension reduction tool. 'TensorPreAve' provides functions to estimate the rank of core tensors and the factor loading spaces of tensor time series. More specifically, a pre-averaging method that accumulates information from tensor fibres is used to estimate the factor loading spaces. The estimated directions corresponding to the strongest factors are then used to project the data for a potentially improved re-estimation of the factor loading spaces themselves. A new rank estimation method that utilizes correlation information from the projected data is also implemented. See Chen and Lam (2023) <doi:10.48550/arXiv.2208.04012> for more details. |
License: | GPL-3 |
Encoding: | UTF-8 |
LazyData: | true |
URL: | https://github.com/William-Chenwl/TensorPreAve |
RoxygenNote: | 7.2.1 |
Imports: | rTensor,MASS,stats,pracma |
Depends: | R (≥ 2.10) |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2023-04-14 13:12:02 UTC; William Chan |
Maintainer: | Weilin Chen <w.chen56@lse.ac.uk> |
Repository: | CRAN |
Date/Publication: | 2023-04-14 13:20:02 UTC |
Bootstrap Rank Estimation.
Description
Function to estimate the rank of the core tensor by Bootstrapped Correlation Thresholding.
Usage
bs_cor_rank(X, initial_direction, r_range = NULL, C_range = NULL, B = 50)
Arguments
X |
A 'Tensor' object defined in package rTensor with |
initial_direction |
Directions corresponding to the strongest factors, written in a list of |
r_range |
Approximate range of |
C_range |
The range of constant C for calculating threshold. Default is |
B |
Number of bootstrap samples. Default is 50. Can be set to 10 to save time when the dimension is large. |
Details
Takes a tensor time series and the estimated directions corresponding to the strongest factors, and returns the estimated rank of the core tensor.
Value
A vector of length K, indicating the estimated number of factors in each mode.
Examples
# Example of a real data set
set.seed(10)
Q_PRE = pre_est(value_weight_tensor)
Q_PROJ = iter_proj(value_weight_tensor, initial_direction = Q_PRE)
bs_rank = bs_cor_rank(value_weight_tensor, Q_PROJ)
bs_rank
# Example using generated data
K = 2
n = 100  # length of the time series (avoids masking T, the alias of TRUE)
d = c(40,40)
r = c(2,2)
re = c(2,2)
eta = list(c(0,0),c(0,0))
u = list(c(-2,2),c(-2,2))
set.seed(10)
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)
X = Data_test$X
Q_PRE = pre_est(X)
Q_PROJ = iter_proj(X, initial_direction = Q_PRE)
bs_rank = bs_cor_rank(X, Q_PROJ)
bs_rank
Equal weight Fama-French portfolio returns data.
Description
Equal-weighted Fama-French portfolio returns data, formed on size and operating profitability, as used in Chen and Lam (2023).
Format
A 576 × 10 × 10 'Tensor' object defined in package rTensor, where mode-1,2,3 correspond to time, OP levels and size levels, respectively.
Details
Stocks are categorized into 10 size levels (market equity, using NYSE market equity deciles) and 10 operating profitability (OP) levels (using NYSE OP deciles). OP is annual revenues minus cost of goods sold, interest expense, and selling, general, and administrative expenses, divided by book equity for the last fiscal year end. The stocks in each of the 10 × 10 categories form an equal-weighted portfolio. We use monthly data from July 1973 to June 2021, so that T = 576 and the data tensor has size 576 × 10 × 10, with time as the first mode. Since the market factor is certainly pervasive in financial returns, we use the CAPM to remove its effects and facilitate the detection of potentially weaker factors.
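To make the layout concrete, the following base R sketch builds a toy array with the same shape as the data set and extracts mode-2 fibres by unfolding. The values are simulated and the names ret, slice_1 and op_unfolded are illustrative only; the actual data set is a 'Tensor' object from rTensor rather than a plain array.

```r
# Toy array with the same layout as the data set: time x OP level x size level.
# The values are simulated, not the actual portfolio returns.
set.seed(1)
n <- 576
ret <- array(rnorm(n * 10 * 10), dim = c(n, 10, 10))

# Each time point is a 10 x 10 matrix (OP levels by size levels).
slice_1 <- ret[1, , ]

# Mode-2 unfolding: bring the OP mode to the front, then flatten the rest.
# Every column of the result is one mode-2 fibre (a vector of length 10).
op_unfolded <- matrix(aperm(ret, c(2, 1, 3)), nrow = 10)

dim(slice_1)
dim(op_unfolded)
```

The first column of op_unfolded is the OP profile at the first month for the first size level, that is, ret[1, , 1].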
References
Chen, W. and Lam, C. (2023). Rank and Factor Loadings Estimation in Time Series Tensor Factor Model by Pre-averaging. Manuscript.
Iterative Projection Estimator.
Description
Function for Iterative Projection Direction Refinement to re-estimate the factor loading matrices.
Usage
iter_proj(X, initial_direction, proj_N = 30, z = rep(1, X@num_modes - 1))
Arguments
X |
A 'Tensor' object defined in package rTensor with |
initial_direction |
Initial direction for projection, written in a list of |
proj_N |
Number of iterations, should be a positive integer. Default is 30. |
z |
(Estimated) Rank of the core tensor, written as a vector of length |
Details
Takes a tensor time series and initial estimated directions corresponding to the strongest factors, and returns the estimated factor loading matrices (or directions) using the Algorithm for Iterative Projection Direction Refinement.
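The idea of projecting on the strongest direction of the other mode and then re-estimating can be sketched in base R for a matrix-valued series (K = 2) with one factor per mode. This is a simplified illustration under stated assumptions, not the package's implementation: the data are simulated, the initial direction comes from a plain PCA instead of pre_est, and the helper top_direction is hypothetical.

```r
set.seed(1)
n <- 200; d1 <- 20; d2 <- 15

# Simulated rank-(1, 1) matrix factor series: X_i = f_i * a1 a2' + noise.
a1 <- rep(c(1, -1), length.out = d1)
a2 <- seq(-1, 1, length.out = d2)
f <- rnorm(n)
X <- array(0, dim = c(n, d1, d2))
for (i in seq_len(n)) {
  X[i, , ] <- f[i] * tcrossprod(a1, a2) + matrix(rnorm(d1 * d2, sd = 0.5), d1, d2)
}

# Leading eigenvector of the sample second-moment matrix of the rows of Y.
top_direction <- function(Y) {
  eigen(crossprod(Y) / nrow(Y), symmetric = TRUE)$vectors[, 1]
}

# Crude initial direction for mode 2: PCA on all mode-2 fibres
# (an assumed stand-in for an initial estimate such as the output of pre_est).
q2 <- top_direction(t(matrix(aperm(X, c(3, 1, 2)), nrow = d2)))

for (iter in seq_len(30)) {
  # Project every X_i on q2 to get an n x d1 series, then update q1.
  Y1 <- apply(X, c(1, 2), function(v) sum(v * q2))
  q1 <- top_direction(Y1)
  # Project every X_i on q1 to get an n x d2 series, then update q2.
  Y2 <- apply(X, c(1, 3), function(v) sum(v * q1))
  q2 <- top_direction(Y2)
}

# Absolute cosine between estimated and true directions (1 means aligned).
align1 <- abs(sum(q1 * a1)) / sqrt(sum(a1^2))
align2 <- abs(sum(q2 * a2)) / sqrt(sum(a2^2))
c(align1, align2)
```

Projection reduces each matrix observation to a vector, so the noise is averaged along the projected mode while the factor signal is kept; this is the intuition behind the potential improvement over the initial estimate.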
Value
A list of K estimated factor loading matrices.
Examples
# Example of a real data set
set.seed(10)
Q_PRE = pre_est(value_weight_tensor)
Q_PROJ = iter_proj(value_weight_tensor, initial_direction = Q_PRE)
Q_PROJ
set.seed(10)
Q_PRE = pre_est(value_weight_tensor)
Q_PROJ_2 = iter_proj(value_weight_tensor, initial_direction = Q_PRE, z = c(2,2))
Q_PROJ_2
# Example using generated data
K = 2
n = 100  # length of the time series (avoids masking T, the alias of TRUE)
d = c(40,40)
r = c(2,2)
re = c(2,2)
eta = list(c(0,0),c(0,0))
u = list(c(-2,2),c(-2,2))
set.seed(10)
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)
X = Data_test$X
Q_PRE = pre_est(X)
Q_PROJ = iter_proj(X, initial_direction = Q_PRE, z = r)
Q_PROJ
Eigenvalue Plot of a Random Sample
Description
Function to plot the eigenvalues of the sample covariance matrix of a randomly chosen sample.
Usage
pre_eigenplot(X, k)
Arguments
X |
A 'Tensor' object defined in package rTensor with |
k |
The mode to plot the eigenvalues for. |
Details
Takes a tensor time series and a mode index, and plots the eigenvalues of the sample covariance matrix for the given mode, computed from a randomly chosen sample of the mode-k fibres. This helps users choose the parameter eigen_j in function pre_est.
A large dip should be observed at the (r_k+1)-th position of the plot, and users can choose eigen_j to be slightly larger than the position of the dip to avoid missing potential weak factors. If no such dip is observed, rerun the function a few times until one appears.
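The dip at position r_k + 1 can be reproduced numerically with a small base R sketch. It assumes a simulated vector series with r_k = 2 strong factors and white noise; the loadings, the names ev, ratio and dip_position, and the use of consecutive eigenvalue ratios to locate the dip are illustrative choices, not the package's internals.

```r
set.seed(1)
n <- 300; d <- 30; r_k <- 2

# Simulated vector series with r_k = 2 strong factors plus white noise.
A <- cbind(rep(1, d), rep(c(1, -1), length.out = d))  # orthogonal loadings
Fac <- matrix(rnorm(n * r_k), n, r_k)
Y <- Fac %*% t(A) + matrix(rnorm(n * d), n, d)

# Eigenvalues of the sample covariance matrix, largest first.
ev <- eigen(cov(Y), symmetric = TRUE, only.values = TRUE)$values

# Ratio of consecutive eigenvalues; the smallest ratio marks the dip.
ratio <- ev[-1] / ev[-length(ev)]
dip_position <- which.min(ratio[1:10]) + 1

round(ev[1:5], 2)
dip_position
```

Calling plot(ev, type = "b") on this output shows the same dip visually. Here the dip falls at position r_k + 1 = 3, so a value of eigen_j slightly above 3 would follow the guidance above.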
Examples
# Example of a real data set
set.seed(800)
pre_eigenplot(value_weight_tensor, k = 2)
# Example using generated data
K = 2
n = 100  # length of the time series (avoids masking T, the alias of TRUE)
d = c(40,40)
r = c(2,2)
re = c(2,2)
eta = list(c(0,0),c(0,0))
u = list(c(-2,2),c(-2,2))
set.seed(10)
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)
X = Data_test$X
pre_eigenplot(X, k = 1)
Pre-Averaging Estimator
Description
Function for the initial Pre-Averaging Procedure.
Usage
pre_est(X, z = rep(1, X@num_modes - 1), M0 = 200, M = 5, eigen_j = NULL)
Arguments
X |
A 'Tensor' object defined in package rTensor with |
z |
(Estimated) Rank of the core tensor, written as a vector of length |
M0 |
Number of random samples to generate, should be a positive integer. Default is 200. |
M |
Number of chosen samples for pre-averaging, should be a positive integer. Can usually be set to a constant (5 or 10) or to 2.5 percent of |
eigen_j |
The j-th eigenvalue to calculate eigenvalue-ratio for a randomly chosen sample, written as a vector of length |
Details
Takes a tensor time series and returns the estimated factor loading matrices (or directions) using the pre-averaging method.
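The roles of M0, M and eigen_j can be illustrated with a rough base R sketch for the first mode of a matrix-valued series. It is one plausible reading of the argument descriptions, not the package's algorithm: the data are simulated, and the subset size of 5 fibres, the score (first eigenvalue divided by the eigen_j-th) and the final summation of covariance matrices are all assumptions made for illustration.

```r
set.seed(1)
n <- 200; d1 <- 20; d2 <- 15

# Simulated rank-(1, 1) matrix factor series: X_i = f_i * a1 a2' + noise.
a1 <- rep(c(1, -1), length.out = d1)
a2 <- seq(-1, 1, length.out = d2)
f <- rnorm(n)
X <- array(0, dim = c(n, d1, d2))
for (i in seq_len(n)) {
  X[i, , ] <- f[i] * tcrossprod(a1, a2) + matrix(rnorm(d1 * d2), d1, d2)
}

M0 <- 200; M <- 5
eigen_j <- 3  # assumed value, chosen slightly above the true rank of 1

# For each random sample: average a random subset of mode-1 fibres,
# then score the resulting covariance matrix by an eigenvalue ratio.
samples <- lapply(seq_len(M0), function(m) {
  cols <- sample(d2, size = 5)          # assumed subset size
  Y <- rowMeans(X[, , cols], dims = 2)  # n x d1 pre-averaged series
  S <- cov(Y)
  ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
  list(S = S, score = ev[1] / ev[eigen_j])
})

# Keep the M best-scoring samples and accumulate their covariance matrices.
scores <- vapply(samples, function(s) s$score, numeric(1))
best <- order(scores, decreasing = TRUE)[seq_len(M)]
S_sum <- Reduce(`+`, lapply(samples[best], function(s) s$S))

# The leading eigenvector estimates the strongest mode-1 direction.
q1_hat <- eigen(S_sum, symmetric = TRUE)$vectors[, 1]
align1 <- abs(sum(q1_hat * a1)) / sqrt(sum(a1^2))
align1
```

Averaging fibres shrinks the noise variance, and keeping only the samples with a pronounced eigenvalue ratio discards subsets in which the factor signal happens to cancel.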
Value
A list of K estimated factor loading matrices.
Examples
# Example of a real data set
set.seed(10)
Q_PRE = pre_est(value_weight_tensor)
Q_PRE
set.seed(10)
Q_PRE_2 = pre_est(value_weight_tensor, z = c(2,2))
Q_PRE_2
# Example using generated data
K = 2
n = 100  # length of the time series (avoids masking T, the alias of TRUE)
d = c(40,40)
r = c(2,2)
re = c(2,2)
eta = list(c(0,0),c(0,0))
u = list(c(-2,2),c(-2,2))
set.seed(10)
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)
X = Data_test$X
Q_PRE = pre_est(X, z = r)
Q_PRE
Rank and Factor Loadings Estimation
Description
The complete procedure to estimate both rank and factor loading matrices simultaneously for a tensor time series.
Usage
rank_factors_est(
X,
proj_N = 30,
r_range = NULL,
C_range = NULL,
M0 = 200,
M = 5,
B = 50,
eigen_j = NULL,
input_r = NULL
)
Arguments
X |
A 'Tensor' object defined in package rTensor with |
proj_N |
Number of iterations for iterative projection. Default is 30. |
r_range |
Approximate range of |
C_range |
The range of constant C for calculating threshold. Default is |
M0 |
Number of random samples to generate in pre-averaging procedure. Default is 200. |
M |
Number of chosen samples for pre-averaging. Can usually be set to a constant (5 or 10) or to 2.5 percent of |
B |
Number of bootstrap samples for estimating the rank of the core tensor by bootstrapped correlation thresholding. Default is 50. Can be set to 10 when the dimension is large. |
eigen_j |
The j-th eigenvalue to calculate eigenvalue-ratio for a randomly chosen sample, written as a vector of length |
input_r |
The rank of the core tensor if it is already known, written as a vector of length |
Details
Takes a tensor time series and returns the estimated factor loading matrices and the estimated rank of the core tensor.
Value
A list containing the following:
rank: A vector of K elements, indicating the estimated number of factors in each mode.
loadings: A list of K estimated factor loading matrices.
Examples
# Example of a real data set
set.seed(10)
results = rank_factors_est(value_weight_tensor)
results
# Example using generated data
K = 2
n = 100  # length of the time series (avoids masking T, the alias of TRUE)
d = c(40,40)
r = c(2,2)
re = c(2,2)
eta = list(c(0,0),c(0,0))
u = list(c(-2,2),c(-2,2))
set.seed(10)
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)
X = Data_test$X
results = rank_factors_est(X)
results
Tensor time series data generation.
Description
Function to generate a random sample from a time series tensor factor model, based on econometric assumptions. (See Chen and Lam (2023) for more details on the assumptions.)
Usage
tensor_data_gen(K, n, d, r, re, eta, u, heavy_tailed = FALSE, t_df = 3)
Arguments
K |
The number of modes for the tensor time series. |
n |
Length of time series. |
d |
Dimensions of each mode of the tensor, written in a vector of length |
r |
Rank of the core tensors, written in a vector of length |
re |
Rank of the cross-sectional common error core tensors, written in a vector of length |
eta |
Quantities controlling factor strengths in each factor loading matrix, written in a list of |
u |
Quantities controlling range of elements in each factor loading matrix, written in a list of |
heavy_tailed |
Whether to generate data from heavy-tailed distribution. If FALSE, generate from N(0,1); if TRUE, generate from t-distribution. Default is FALSE. |
t_df |
The degree of freedom for t-distribution if heavy_tailed = TRUE. Default is 3. |
Details
Takes the tensor dimensions and the rank of the core tensor, and returns a sample of a tensor time series generated by the factor model.
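For orientation, a stripped-down version of such a generator can be written in base R for K = 2. This is only a sketch under simplifying assumptions and does not reproduce tensor_data_gen: the core series is drawn independently over time, the cross-sectional common error component (re) and the factor strengths (eta) are ignored, the loading entries are drawn uniformly on [-2, 2], and the helper draw is hypothetical.

```r
set.seed(10)
K <- 2; n <- 100; d <- c(40, 40); r <- c(2, 2)

# Loading matrices with entries drawn uniformly from [-2, 2] (an assumed choice).
A <- lapply(seq_len(K), function(k) {
  matrix(runif(d[k] * r[k], -2, 2), d[k], r[k])
})

# Innovations: N(0, 1) by default, t-distributed when heavy_tailed = TRUE.
draw <- function(m, heavy_tailed = FALSE, t_df = 3) {
  if (heavy_tailed) rt(m, df = t_df) else rnorm(m)
}

# X_i = A_1 F_i A_2' + E_i, stored with time as the first mode.
X <- array(0, dim = c(n, d))
for (i in seq_len(n)) {
  F_i <- matrix(draw(prod(r)), r[1], r[2])  # core matrix at time i
  E_i <- matrix(draw(prod(d)), d[1], d[2])  # idiosyncratic error at time i
  X[i, , ] <- A[[1]] %*% F_i %*% t(A[[2]]) + E_i
}

dim(X)
```

Passing heavy_tailed = TRUE to draw switches both the core and the error entries to a t-distribution with t_df degrees of freedom, mirroring the description of the heavy_tailed and t_df arguments.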
Value
A list containing the following:
X: the generated tensor time series, stored in a 'Tensor' object defined in rTensor, where mode-1 is the time mode
A: a list of K factor loading matrices
F_ts: time series of core tensor, stored in a 'Tensor' object, where mode-1 is the time mode
E_ts: time series of error tensor, stored in a 'Tensor' object, where mode-1 is the time mode
Examples
set.seed(10)
K = 2
n = 100
d = c(40,40)
r = c(2,2)
re = c(2,2)
eta = list(c(0,0),c(0,0))
u = list(c(-2,2),c(-2,2))
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)
X = Data_test$X
A = Data_test$A
F_ts = Data_test$F_ts
E_ts = Data_test$E_ts
X@modes
F_ts@modes
E_ts@modes
dim(A[[1]])
Value weighted Fama-French portfolio returns data.
Description
Value-weighted Fama-French portfolio returns data, formed on size and operating profitability, as used in Chen and Lam (2023).
Format
A 576 × 10 × 10 'Tensor' object defined in package rTensor, where mode-1,2,3 correspond to time, OP levels and size levels, respectively.
Details
Stocks are categorized into 10 size levels (market equity, using NYSE market equity deciles) and 10 operating profitability (OP) levels (using NYSE OP deciles). OP is annual revenues minus cost of goods sold, interest expense, and selling, general, and administrative expenses, divided by book equity for the last fiscal year end. The stocks in each of the 10 × 10 categories form a value-weighted portfolio. We use monthly data from July 1973 to June 2021, so that T = 576 and the data tensor has size 576 × 10 × 10, with time as the first mode. Since the market factor is certainly pervasive in financial returns, we use the CAPM to remove its effects and facilitate the detection of potentially weaker factors.
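Removing the market factor with the CAPM amounts to regressing each portfolio's excess return on the market excess return and keeping the residuals. The base R sketch below shows this step on simulated data; the series market, the betas and the inclusion of an intercept are assumptions for illustration, and the exact specification used to build the data set is not stated here.

```r
set.seed(1)
n <- 576

# Simulated market excess returns and portfolio betas (not real data).
market <- rnorm(n, mean = 0.5, sd = 4)
beta <- matrix(runif(100, 0.5, 1.5), 10, 10)

# Simulated portfolio excess returns: time x OP level x size level.
excess <- array(0, dim = c(n, 10, 10))
for (i in 1:10) {
  for (j in 1:10) {
    excess[, i, j] <- beta[i, j] * market + rnorm(n)
  }
}

# Regress every portfolio on the market and keep the residuals.
resid_tensor <- apply(excess, c(2, 3), function(y) residuals(lm(y ~ market)))

# The residuals are uncorrelated with the market by construction.
max_cor <- max(abs(apply(resid_tensor, c(2, 3), function(e) cor(e, market))))
dim(resid_tensor)
max_cor
```

The residual array keeps the 576 × 10 × 10 layout, so it could be wrapped with rTensor's as.tensor before being passed to the estimation functions.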
References
Chen, W. and Lam, C. (2023). Rank and Factor Loadings Estimation in Time Series Tensor Factor Model by Pre-averaging. Manuscript.