Type: | Package |
Title: | High-Dimensional Location Testing with Normal-Reference Approaches |
Version: | 2.0.1 |
Date: | 2024-10-22 |
Author: | Pengfei Wang [aut, cre], Shuqi Luo [aut], Tianming Zhu [aut], Bu Zhou [aut] |
Maintainer: | Pengfei Wang <nie23.wp8738@e.ntu.edu.sg> |
Description: | We provide a collection of various classical tests and latest normal-reference tests for comparing high-dimensional mean vectors including two-sample and general linear hypothesis testing (GLHT) problem. Some existing tests for two-sample problem [see Bai, Zhidong, and Hewa Saranadasa.(1996) https://www.jstor.org/stable/24306018; Chen, Song Xi, and Ying-Li Qin.(2010) <doi:10.1214/09-aos716>; Srivastava, Muni S., and Meng Du.(2008) <doi:10.1016/j.jmva.2006.11.002>; Srivastava, Muni S., Shota Katayama, and Yutaka Kano.(2013)<doi:10.1016/j.jmva.2012.08.014>]. Normal-reference tests for two-sample problem [see Zhang, Jin-Ting, Jia Guo, Bu Zhou, and Ming-Yen Cheng.(2020) <doi:10.1080/01621459.2019.1604366>; Zhang, Jin-Ting, Bu Zhou, Jia Guo, and Tianming Zhu.(2021) <doi:10.1016/j.jspi.2020.11.008>; Zhang, Liang, Tianming Zhu, and Jin-Ting Zhang.(2020) <doi:10.1016/j.ecosta.2019.12.002>; Zhang, Liang, Tianming Zhu, and Jin-Ting Zhang.(2023) <doi:10.1080/02664763.2020.1834516>; Zhang, Jin-Ting, and Tianming Zhu.(2022) <doi:10.1080/10485252.2021.2015768>; Zhang, Jin-Ting, and Tianming Zhu.(2022) <doi:10.1007/s42519-021-00232-w>; Zhu, Tianming, Pengfei Wang, and Jin-Ting Zhang.(2023) <doi:10.1007/s00180-023-01433-6>]. Some existing tests for GLHT problem [see Fujikoshi, Yasunori, Tetsuto Himeno, and Hirofumi Wakaki.(2004) <doi:10.14490/jjss.34.19>; Srivastava, Muni S., and Yasunori Fujikoshi.(2006) <doi:10.1016/j.jmva.2005.08.010>; Yamada, Takayuki, and Muni S. Srivastava.(2012) <doi:10.1080/03610926.2011.581786>; Schott, James R.(2007) <doi:10.1016/j.jmva.2006.11.007>; Zhou, Bu, Jia Guo, and Jin-Ting Zhang.(2017) <doi:10.1016/j.jspi.2017.03.005>]. 
Normal-reference tests for GLHT problem [see Zhang, Jin-Ting, Jia Guo, and Bu Zhou.(2017) <doi:10.1016/j.jmva.2017.01.002>; Zhang, Jin-Ting, Bu Zhou, and Jia Guo.(2022) <doi:10.1016/j.jmva.2021.104816>; Zhu, Tianming, Liang Zhang, and Jin-Ting Zhang.(2022) <doi:10.5705/ss.202020.0362>; Zhu, Tianming, and Jin-Ting Zhang.(2022) <doi:10.1007/s00180-021-01110-6>; Zhang, Jin-Ting, and Tianming Zhu.(2022) <doi:10.1016/j.csda.2021.107385>]. |
License: | GPL (≥ 3) |
URL: | https://nie23wp8738.github.io/HDNRA/ |
BugReports: | https://github.com/nie23wp8738/HDNRA/issues |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
LinkingTo: | Rcpp, RcppArmadillo |
Imports: | expm, Rcpp, Rdpack, readr, stats, utils |
Suggests: | devtools, dplyr, knitr, rmarkdown, spelling, testthat (≥ 3.0.0), tidyr |
RdMacros: | Rdpack |
Depends: | R (≥ 4.0) |
LazyData: | true |
Language: | en-US |
Config/testthat/edition: | 3 |
NeedsCompilation: | yes |
Packaged: | 2024-10-22 06:45:12 UTC; yehe |
Repository: | CRAN |
Date/Publication: | 2024-10-22 08:20:06 UTC |
HDNRA: High-Dimensional Location Testing with Normal-Reference Approaches
Description
We provide a collection of various classical tests and latest normal-reference tests for comparing high-dimensional mean vectors including two-sample and general linear hypothesis testing (GLHT) problem. Some existing tests for two-sample problem [see Bai, Zhidong, and Hewa Saranadasa.(1996) https://www.jstor.org/stable/24306018; Chen, Song Xi, and Ying-Li Qin.(2010) doi:10.1214/09-aos716; Srivastava, Muni S., and Meng Du.(2008) doi:10.1016/j.jmva.2006.11.002; Srivastava, Muni S., Shota Katayama, and Yutaka Kano.(2013)doi:10.1016/j.jmva.2012.08.014]. Normal-reference tests for two-sample problem [see Zhang, Jin-Ting, Jia Guo, Bu Zhou, and Ming-Yen Cheng.(2020) doi:10.1080/01621459.2019.1604366; Zhang, Jin-Ting, Bu Zhou, Jia Guo, and Tianming Zhu.(2021) doi:10.1016/j.jspi.2020.11.008; Zhang, Liang, Tianming Zhu, and Jin-Ting Zhang.(2020) doi:10.1016/j.ecosta.2019.12.002; Zhang, Liang, Tianming Zhu, and Jin-Ting Zhang.(2023) doi:10.1080/02664763.2020.1834516; Zhang, Jin-Ting, and Tianming Zhu.(2022) doi:10.1080/10485252.2021.2015768; Zhang, Jin-Ting, and Tianming Zhu.(2022) doi:10.1007/s42519-021-00232-w; Zhu, Tianming, Pengfei Wang, and Jin-Ting Zhang.(2023) doi:10.1007/s00180-023-01433-6]. Some existing tests for GLHT problem [see Fujikoshi, Yasunori, Tetsuto Himeno, and Hirofumi Wakaki.(2004) doi:10.14490/jjss.34.19; Srivastava, Muni S., and Yasunori Fujikoshi.(2006) doi:10.1016/j.jmva.2005.08.010; Yamada, Takayuki, and Muni S. Srivastava.(2012) doi:10.1080/03610926.2011.581786; Schott, James R.(2007) doi:10.1016/j.jmva.2006.11.007; Zhou, Bu, Jia Guo, and Jin-Ting Zhang.(2017) doi:10.1016/j.jspi.2017.03.005]. 
Normal-reference tests for GLHT problem [see Zhang, Jin-Ting, Jia Guo, and Bu Zhou.(2017) doi:10.1016/j.jmva.2017.01.002; Zhang, Jin-Ting, Bu Zhou, and Jia Guo.(2022) doi:10.1016/j.jmva.2021.104816; Zhu, Tianming, Liang Zhang, and Jin-Ting Zhang.(2022) doi:10.5705/ss.202020.0362; Zhu, Tianming, and Jin-Ting Zhang.(2022) doi:10.1007/s00180-021-01110-6; Zhang, Jin-Ting, and Tianming Zhu.(2022) doi:10.1016/j.csda.2021.107385].
Author(s)
Maintainer: Pengfei Wang nie23.wp8738@e.ntu.edu.sg
Authors:
Shuqi Luo nie23.ls4909@e.ntu.edu.sg
Tianming Zhu tianming.zhu@nie.edu.sg
Bu Zhou bu.zhou@u.nus.edu
Normal-approximation-based test for two-sample problem proposed by Bai and Saranadasa (1996)
Description
Bai and Saranadasa (1996)'s test for the equality of two high-dimensional mean vectors, assuming that the two covariance matrices are the same.
Usage
BS1996.TS.NABT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Bai and Saranadasa (1996) proposed the following centralised $L^2$-norm-based test statistic:
T_{BS} = \frac{n_1n_2}{n} \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2-\operatorname{tr}(\hat{\boldsymbol{\Sigma}}),
where $\bar{\boldsymbol{y}}_{i},i=1,2$ are the sample mean vectors and $\hat{\boldsymbol{\Sigma}}$ is the pooled sample covariance matrix.
They showed that under the null hypothesis, $T_{BS}$ is asymptotically normally distributed.
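As an illustration only, the statistic above can be computed directly in base R. The sketch below uses simulated data with rows as observations (as in the Examples), assumes n = n_1 + n_2, and takes the pooled covariance to be ((n_1 - 1) S_1 + (n_2 - 1) S_2) / (n - 2). These conventions and all variable names are assumptions; this is not the code behind BS1996.TS.NABT.

```r
set.seed(1)
n1 <- 10; n2 <- 12; p <- 50
y1 <- matrix(rnorm(n1 * p), n1, p)   # rows are observations
y2 <- matrix(rnorm(n2 * p), n2, p)
n <- n1 + n2                          # assumption: n = n1 + n2

diff_means <- colMeans(y1) - colMeans(y2)

# The trace of the pooled covariance only needs the per-variable variances,
# so the p x p matrix is never formed.
tr_pooled <- ((n1 - 1) * sum(apply(y1, 2, var)) +
              (n2 - 1) * sum(apply(y2, 2, var))) / (n - 2)

T_BS <- n1 * n2 / n * sum(diff_means^2) - tr_pooled
T_BS
```

Working with traces rather than full covariance matrices keeps the memory cost linear in p, which matters when p is in the tens of thousands.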
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Bai Z, Saranadasa H (1996). “Effect of high dimension: by an example of a two sample problem.” Statistica Sinica, 311–329. https://www.jstor.org/stable/24306018.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
BS1996.TS.NABT(group1,group2)
HDNRA_data COVID19
Description
A COVID19 data set from NCBI with ID GSE152641. The data set profiled peripheral blood from 24 healthy controls and 62 prospectively enrolled patients with community-acquired lower respiratory tract infection by SARS-CoV-2 within the first 24 hours of hospital admission using RNA sequencing.
Usage
data(COVID19)
Format
'COVID19'
A data frame with 86 observations on the following 2 groups.
- healthy group1
row 2 to row 19, and row 82 to 87, in total 24 healthy controls
- patients group2
row 20 to 81, in total 62 prospectively enrolled patients
References
Thair SA, He YD, Hasin-Brumshtein Y, Sakaram S, Pandya R, Toh J, Rawling D, Remmel M, Coyle S, Dalekos GN, others (2021). “Transcriptomic similarities and differences in host response between SARS-CoV-2 and other viral infections.” Iscience, 24(1). doi:10.1016/j.isci.2020.101947.
Examples
library(HDNRA)
data(COVID19)
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
dim(group1)
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
dim(group2)
Normal-approximation-based test for two-sample Behrens-Fisher (BF) problem proposed by Chen and Qin (2010)
Description
Chen and Qin (2010)'s test for the equality of two high-dimensional mean vectors without assuming that the two covariance matrices are the same.
Usage
CQ2010.TSBF.NABT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Chen and Qin (2010) proposed the following test statistic:
T_{CQ} = \frac{\sum_{i \neq j}^{n_1} \boldsymbol{y}_{1i}^\top \boldsymbol{y}_{1j}}{n_1 (n_1 - 1)} + \frac{\sum_{i \neq j}^{n_2} \boldsymbol{y}_{2i}^\top \boldsymbol{y}_{2j}}{n_2 (n_2 - 1)} - 2 \frac{\sum_{i = 1}^{n_1} \sum_{j = 1}^{n_2} \boldsymbol{y}_{1i}^\top \boldsymbol{y}_{2j}}{n_1 n_2}.
They showed that under the null hypothesis, $T_{CQ}$ is asymptotically normally distributed.
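The three sums in the statistic above can be read off Gram matrices. The following base R sketch, on simulated data with rows as observations, is one way to evaluate it; the helper name off_diag_sum and all other names are hypothetical, and this is not the code behind CQ2010.TSBF.NABT.

```r
set.seed(2)
n1 <- 8; n2 <- 11; p <- 40
y1 <- matrix(rnorm(n1 * p), n1, p)
y2 <- matrix(rnorm(n2 * p, mean = 0.2), n2, p)

# Sum over i != j of y_i' y_j: all Gram entries minus the diagonal.
off_diag_sum <- function(y) {
  gram <- tcrossprod(y)          # n x n matrix of inner products
  sum(gram) - sum(diag(gram))
}

cross_sum <- sum(tcrossprod(y1, y2))   # sum over i, j of y_1i' y_2j

T_CQ <- off_diag_sum(y1) / (n1 * (n1 - 1)) +
  off_diag_sum(y2) / (n2 * (n2 - 1)) -
  2 * cross_sum / (n1 * n2)

# Algebraically equivalent form: squared distance between the sample means
# minus tr(S_1)/n_1 and tr(S_2)/n_2.
T_alt <- sum((colMeans(y1) - colMeans(y2))^2) -
  sum(apply(y1, 2, var)) / n1 - sum(apply(y2, 2, var)) / n2
```

The second form makes clear that the statistic removes the diagonal terms that would otherwise bias the squared mean difference upward.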
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Chen SX, Qin Y (2010). “A two-sample test for high-dimensional data with applications to gene-set testing.” The Annals of Statistics, 38(2). doi:10.1214/09-aos716.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
CQ2010.TSBF.NABT(group1,group2)
Normal-approximation-based test for GLHT problem proposed by Fujikoshi et al. (2004)
Description
Fujikoshi et al. (2004)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
FHW2004.GLHT.NABT(Y,X,C,n,p)
Arguments
Y |
A list of |
X |
A known |
C |
A known matrix of size |
n |
A vector of |
p |
The dimension of data. |
Details
A high-dimensional linear regression model can be expressed as
\boldsymbol{Y}=\boldsymbol{X\Theta}+\boldsymbol{\epsilon},
where $\boldsymbol{\Theta}$ is a $k\times p$ unknown parameter matrix and $\boldsymbol{\epsilon}$ is an $n\times p$ error matrix.
It is of interest to test the following GLHT problem
H_0: \boldsymbol{C\Theta}=\boldsymbol{0}, \quad \text { vs. } \quad H_1: \boldsymbol{C\Theta} \neq \boldsymbol{0}.
Fujikoshi et al. (2004) proposed the following test statistic:
T_{FHW}=\sqrt{p}\left[(n-k)\frac{\operatorname{tr}(\boldsymbol{S}_h)}{\operatorname{tr}(\boldsymbol{S}_e)}-q\right],
where $\boldsymbol{S}_h$ and $\boldsymbol{S}_e$ are the matrices of sums of squares and products due to the hypothesis and the error, respectively.
They showed that under the null hypothesis, $T_{FHW}$ is asymptotically normally distributed.
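A minimal base R sketch of the statistic above follows. The source does not spell out the hypothesis and error matrices, so the sketch assumes the usual least-squares definitions: Theta_hat = (X'X)^{-1} X'Y, S_h = (C Theta_hat)' [C (X'X)^{-1} C']^{-1} (C Theta_hat), and S_e = (Y - X Theta_hat)' (Y - X Theta_hat), with q the number of rows of C. Data are simulated and all variable names are hypothetical; this is not the code behind FHW2004.GLHT.NABT.

```r
set.seed(3)
n_i <- c(6, 7, 8); k <- length(n_i); n <- sum(n_i); p <- 30
X <- 1 * outer(rep(seq_len(k), n_i), seq_len(k), "==")  # one-way design matrix
Y <- matrix(rnorm(n * p), n, p)                          # stacked n x p responses
q <- k - 1
C <- cbind(diag(q), -rep(1, q))                          # contrasts vs. last group

XtX_inv   <- solve(crossprod(X))
Theta_hat <- XtX_inv %*% crossprod(X, Y)                 # k x p least-squares fit
CTheta    <- C %*% Theta_hat                             # q x p
M         <- solve(C %*% XtX_inv %*% t(C))               # q x q

# Only traces are needed, so S_h and S_e (both p x p) are never formed.
tr_Sh <- sum(CTheta * (M %*% CTheta))                    # tr(CTheta' M CTheta)
tr_Se <- sum((Y - X %*% Theta_hat)^2)                    # residual sum of squares

T_FHW <- sqrt(p) * ((n - k) * tr_Sh / tr_Se - q)
T_FHW
```

With this choice of X and C the hypothesis is equality of the k group means, so tr(S_h) reduces to the between-group sum of squares.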
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Fujikoshi Y, Himeno T, Wakaki H (2004). “Asymptotic results of a high dimensional MANOVA test and power comparison when the dimension is large compared to the sample size.” Journal of the Japan Statistical Society, 34(1), 19–26. doi:10.14490/jjss.34.19.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
X <- matrix(c(rep(1,n[1]),rep(0,sum(n)),rep(1,n[2]), rep(0,sum(n)),
rep(1,n[3]),rep(0,sum(n)),rep(1,n[4])),ncol=k,nrow=sum(n))
q <- k-1
C <- cbind(diag(q),-rep(1,q))
FHW2004.GLHT.NABT(Y,X,C,n,p)
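The construction of X in the example above relies on R filling matrices column by column. The sketch below rebuilds the same design matrix from explicit group indicators and checks that both agree; the group sizes are those of the four corneal groups in the example, while the names group and X_alt are hypothetical.

```r
n <- c(43, 14, 21, 72)        # group sizes from the example above
k <- length(n)

# Construction used in the example: column-major filling of ones and zeros.
X <- matrix(c(rep(1, n[1]), rep(0, sum(n)), rep(1, n[2]), rep(0, sum(n)),
              rep(1, n[3]), rep(0, sum(n)), rep(1, n[4])),
            ncol = k, nrow = sum(n))

# Same matrix from group labels: entry (r, j) is 1 if row r belongs to group j.
group <- rep(seq_len(k), times = n)
X_alt <- 1 * outer(group, seq_len(k), "==")

# Contrast matrix comparing each of the first k - 1 groups with the last one.
q <- k - 1
C <- cbind(diag(q), -rep(1, q))

isTRUE(all.equal(X, X_alt))
```

The indicator form generalises to any number of groups without having to count the runs of zeros by hand.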
S3 Class "NRtest"
Description
The "NRtest" objects provide a comprehensive summary of hypothesis test outcomes, including test statistics, p-values, parameter estimates, and confidence intervals, if applicable.
Usage
NRtest.object(
statistic,
p.value,
method,
null.value,
alternative,
parameter = NULL,
sample.size = NULL,
sample.dimension = NULL,
estimation.method = NULL,
data.name = NULL,
...
)
Arguments
statistic |
Numeric scalar containing the value of the test statistic, with a |
p.value |
Numeric scalar containing the p-value for the test. |
method |
Character string giving the name of the test. |
null.value |
Character string indicating the null hypothesis. |
alternative |
Character string indicating the alternative hypothesis. |
parameter |
Numeric vector containing the estimated approximation parameter(s) associated with the approximation method. This vector has a |
sample.size |
Numeric vector containing the number of observations in each group used for the hypothesis test. |
sample.dimension |
Numeric scalar containing the dimension of the dataset used for the hypothesis test. |
estimation.method |
Character string giving the name of the approximation approach used to approximate the null distribution of the test statistic. |
data.name |
Character string describing the data set used in the hypothesis test. |
... |
Additional optional arguments. |
Details
A class of objects returned by high-dimensional hypothesis testing functions in the HDNRA package, designed to encapsulate detailed results from statistical hypothesis tests. These objects are structured similarly to htest objects in the package EnvStats but are tailored to the needs of the HDNRA package.
Value
An object of class "NRtest"
containing both required and optional components depending on the specifics of the hypothesis test,
shown as follows:
Required Components
These components must be present in every "NRtest"
object:
statistic
Must be present.
p.value
Must be present.
null.value
Must be present.
alternative
Must be present.
method
Must be present.
Optional Components
These components are included depending on the specifics of the hypothesis test performed:
parameter
May be present.
sample.size
May be present.
sample.dimension
May be present.
estimation.method
May be present.
data.name
May be present.
Methods
The class has the following methods:
print.NRtest
Printing the contents of the NRtest object in a human-readable form.
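For readers unfamiliar with S3, the general pattern of a classed list with a print method can be sketched as follows. The class name toy_test and the constructor new_toy_test are hypothetical and exist only for this illustration; they are not part of HDNRA and do not reproduce how NRtest.object or print.NRtest are implemented.

```r
# Hypothetical constructor: a named list tagged with an S3 class.
new_toy_test <- function(statistic, p.value, method,
                         null.value, alternative, ...) {
  obj <- list(statistic = statistic, p.value = p.value, method = method,
              null.value = null.value, alternative = alternative, ...)
  class(obj) <- "toy_test"
  obj
}

# Hypothetical print method, dispatched by print() on objects of this class.
print.toy_test <- function(x, ...) {
  cat(x$method, "\n")
  cat(names(x$statistic), "=", format(x$statistic),
      ", p-value =", format(x$p.value), "\n")
  invisible(x)
}

res <- new_toy_test(statistic = c("T" = 2.208), p.value = 0.0136,
                    method = "Toy test",
                    null.value = "difference is 0",
                    alternative = "difference is not 0",
                    sample.size = c(n1 = 24, n2 = 26))
print(res)
```

Optional components are passed through `...` here, which mirrors the split between required and optional components described above.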
Examples
# Example 1: Using Bai and Saranadasa (1996)'s test (two-sample problem)
NRtest.obj1 <- NRtest.object(
statistic = c("T[BS]" = 2.208),
p.value = 0.0136,
method = "Bai and Saranadasa (1996)'s test",
data.name = "group1 and group2",
null.value = c("Difference between two mean vectors is 0"),
alternative = "Difference between two mean vectors is not 0",
parameter = NULL,
sample.size = c(n1 = 24, n2 = 26),
sample.dimension = 20460,
estimation.method = "Normal approximation"
)
print(NRtest.obj1)
# Example 2: Using Fujikoshi et al. (2004)'s test (GLHT problem)
NRtest.obj2 <- NRtest.object(
statistic = c("T[FHW]" = 6.4015),
p.value = 0,
method = "Fujikoshi et al. (2004)'s test",
data.name = "Y",
null.value = "The general linear hypothesis is true",
alternative = "The general linear hypothesis is not true",
parameter = NULL,
sample.size = c(n1 = 43, n2 = 14, n3 = 21, n4 = 72),
sample.dimension = 2000,
estimation.method = "Normal approximation"
)
print(NRtest.obj2)
Normal-approximation-based test for one-way MANOVA problem proposed by Schott (2007)
Description
Schott (2007)'s test for the one-way MANOVA problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
S2007.ks.NABT(Y, n, p)
Arguments
Y |
A list of |
n |
A vector of |
p |
The dimension of data. |
Details
Suppose we have the following $k$ independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,\ldots,k.
It is of interest to test the following one-way MANOVA problem:
H_0: \boldsymbol{\mu}_1=\cdots=\boldsymbol{\mu}_k, \quad \text { vs. }\; H_1: H_0 \;\operatorname{is \; not\; true}.
Schott (2007) proposed the following test statistic:
T_{S}=[\operatorname{tr}(\boldsymbol{H})/h-\operatorname{tr}(\boldsymbol{E})/e]/\sqrt{N-1},
where $\boldsymbol{H}=\sum_{i=1}^kn_i(\bar{\boldsymbol{y}}_i-\bar{\boldsymbol{y}})(\bar{\boldsymbol{y}}_i-\bar{\boldsymbol{y}})^\top$, $\boldsymbol{E}=\sum_{i=1}^k\sum_{j=1}^{n_i}(\boldsymbol{y}_{ij}-\bar{\boldsymbol{y}}_{i})(\boldsymbol{y}_{ij}-\bar{\boldsymbol{y}}_{i})^\top$, $h=k-1$, and $e=N-k$, with $N=n_1+\cdots+n_k$.
Schott (2007) showed that under the null hypothesis, $T_{S}$ is asymptotically normally distributed.
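The traces of H and E are ordinary between-group and within-group sums of squares, so the statistic above can be evaluated without forming any p x p matrix. The base R sketch below does this on simulated data, with Y taken to be a list of k data matrices whose rows are observations (as in the Examples); all variable names are hypothetical and this is not the code behind S2007.ks.NABT.

```r
set.seed(4)
n_i <- c(5, 6, 7, 8); k <- length(n_i); N <- sum(n_i); p <- 25
Y <- lapply(n_i, function(m) matrix(rnorm(m * p), m, p))  # list of n_i x p matrices

group_means <- t(sapply(Y, colMeans))            # k x p, row i is the i-th mean
grand_mean  <- colSums(group_means * n_i) / N    # weighted by group size

# tr(H): between-group sum of squares.
tr_H <- sum(n_i * rowSums(sweep(group_means, 2, grand_mean)^2))
# tr(E): within-group sum of squares (each group centred at its own mean).
tr_E <- sum(sapply(Y, function(y) sum(scale(y, scale = FALSE)^2)))

h <- k - 1
e <- N - k
T_S <- (tr_H / h - tr_E / e) / sqrt(N - 1)
T_S
```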
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Schott JR (2007). “Some high-dimensional tests for a one-way MANOVA.” Journal of Multivariate Analysis, 98(9), 1825–1839. doi:10.1016/j.jmva.2006.11.007.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
S2007.ks.NABT(Y, n, p)
Normal-approximation-based test for two-sample problem proposed by Srivastava and Du (2008)
Description
Srivastava and Du (2008)'s test for the equality of two high-dimensional mean vectors, assuming that the two covariance matrices are the same.
Usage
SD2008.TS.NABT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Srivastava and Du (2008) proposed the following test statistic:
T_{SD} = \frac{n^{-1}n_1n_2(\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2)^\top \boldsymbol{D}_S^{-1}(\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2) - \frac{(n-2)p}{n-4}}{\sqrt{2 \left[\operatorname{tr}(\boldsymbol{R}^2) - \frac{p^2}{n-2}\right] c_{p, n}}},
where $\bar{\boldsymbol{y}}_{i},i=1,2$ are the sample mean vectors, $\boldsymbol{D}_S$ is the diagonal matrix of sample variances, $\boldsymbol{R}$ is the sample correlation matrix, and $c_{p, n}$ is the adjustment coefficient proposed by Srivastava and Du (2008).
They showed that under the null hypothesis, $T_{SD}$ is asymptotically normally distributed.
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Srivastava MS, Du M (2008). “A test for the mean vector with fewer observations than the dimension.” Journal of Multivariate Analysis, 99(3), 386–402. doi:10.1016/j.jmva.2006.11.002.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
SD2008.TS.NABT(group1,group2)
Normal-approximation-based test for GLHT problem proposed by Srivastava and Fujikoshi (2006)
Description
Srivastava and Fujikoshi (2006)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
SF2006.GLHT.NABT(Y,X,C,n,p)
Arguments
Y |
A list of |
X |
A known |
C |
A known matrix of size |
n |
A vector of |
p |
The dimension of data. |
Details
A high-dimensional linear regression model can be expressed as
\boldsymbol{Y}=\boldsymbol{X\Theta}+\boldsymbol{\epsilon},
where $\boldsymbol{\Theta}$ is a $k\times p$ unknown parameter matrix and $\boldsymbol{\epsilon}$ is an $n\times p$ error matrix.
It is of interest to test the following GLHT problem
H_0: \boldsymbol{C\Theta}=\boldsymbol{0}, \quad \text { vs. } \quad H_1: \boldsymbol{C\Theta} \neq \boldsymbol{0}.
Srivastava and Fujikoshi (2006) proposed the following test statistic:
T_{SF}=\left[2q\hat{a}_2(1+(n-k)^{-1}q)\right]^{-1/2}\left[\frac{\operatorname{tr}(\boldsymbol{B})}{\sqrt{p}}-\frac{q}{\sqrt{n-k}}\frac{\operatorname{tr}(\boldsymbol{W})}{\sqrt{(n-k)p}}\right],
where $\boldsymbol{W}$ and $\boldsymbol{B}$ are the matrices of sums of squares and products due to the error and the hypothesis, respectively, and $\hat{a}_2=[\operatorname{tr}(\boldsymbol{W}^2)-\operatorname{tr}^2(\boldsymbol{W})/(n-k)]/[(n-k-1)(n-k+2)p]$.
They showed that under the null hypothesis, $T_{SF}$ is asymptotically normally distributed.
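The following base R sketch evaluates the statistic above in a one-way layout. It reads W as the error (within-group) and B as the hypothesis (between-group) sums of squares and products matrix, which is an assumption about the notation, and it tests equality of the k group means so that q = k - 1. Data are simulated and all variable names are hypothetical; this is not the code behind SF2006.GLHT.NABT.

```r
set.seed(5)
n_i <- c(6, 7, 8); k <- length(n_i); n <- sum(n_i); p <- 30; q <- k - 1
g <- rep(seq_len(k), n_i)                         # group label of each row
Y <- matrix(rnorm(n * p), n, p)

group_means <- rowsum(Y, g) / n_i                 # k x p group means
resid <- Y - group_means[g, ]                     # rows centred at their group mean

tr_B  <- sum(n_i * rowSums(sweep(group_means, 2, colMeans(Y))^2))
tr_W  <- sum(resid^2)
# tr(W^2) equals the squared Frobenius norm of the n x n Gram matrix of the
# residuals, which is cheaper than forming the p x p matrix W when p > n.
tr_W2 <- sum(tcrossprod(resid)^2)

a2_hat <- (tr_W2 - tr_W^2 / (n - k)) / ((n - k - 1) * (n - k + 2) * p)

T_SF <- (tr_B / sqrt(p) - q / sqrt(n - k) * tr_W / sqrt((n - k) * p)) /
  sqrt(2 * q * a2_hat * (1 + q / (n - k)))
T_SF
```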
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Srivastava MS, Fujikoshi Y (2006). “Multivariate analysis of variance with fewer observations than the dimension.” Journal of Multivariate Analysis, 97(9), 1927–1940. doi:10.1016/j.jmva.2005.08.010.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
X <- matrix(c(rep(1,n[1]),rep(0,sum(n)),rep(1,n[2]), rep(0,sum(n)),
rep(1,n[3]),rep(0,sum(n)),rep(1,n[4])),ncol=k,nrow=sum(n))
q <- k-1
C <- cbind(diag(q),-rep(1,q))
SF2006.GLHT.NABT(Y,X,C,n,p)
Normal-approximation-based test for two-sample BF problem proposed by Srivastava et al. (2013)
Description
Srivastava et al. (2013)'s test for the equality of two high-dimensional mean vectors without assuming that the two covariance matrices are the same.
Usage
SKK2013.TSBF.NABT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Srivastava et al. (2013) proposed the following test statistic:
T_{SKK} = \frac{(\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2)^\top \hat{\boldsymbol{D}}^{-1}(\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2) - p}{\sqrt{2 \widehat{\operatorname{Var}}(\hat{q}_n) c_{p,n}}},
where $\bar{\boldsymbol{y}}_{i},i=1,2$ are the sample mean vectors, and $\hat{\boldsymbol{D}}=\hat{\boldsymbol{D}}_1/n_1+\hat{\boldsymbol{D}}_2/n_2$ with $\hat{\boldsymbol{D}}_i,i=1,2$ being the diagonal matrices consisting of only the diagonal elements of the sample covariance matrices. The quantity $\widehat{\operatorname{Var}}(\hat{q}_n)$ is given by equation (1.18) in Srivastava et al. (2013), and $c_{p, n}$ is the adjustment coefficient proposed by Srivastava et al. (2013).
They showed that under the null hypothesis, $T_{SKK}$ is asymptotically normally distributed.
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Srivastava MS, Katayama S, Kano Y (2013). “A two sample test in high dimensional data.” Journal of Multivariate Analysis, 114, 349–358. doi:10.1016/j.jmva.2012.08.014.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
SKK2013.TSBF.NABT(group1,group2)
Normal-approximation-based test for GLHT problem proposed by Yamada and Srivastava (2012)
Description
Yamada and Srivastava (2012)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
YS2012.GLHT.NABT(Y,X,C,n,p)
Arguments
Y |
A list of |
X |
A known |
C |
A known matrix of size |
n |
A vector of |
p |
The dimension of data. |
Details
A high-dimensional linear regression model can be expressed as
\boldsymbol{Y}=\boldsymbol{X\Theta}+\boldsymbol{\epsilon},
where $\boldsymbol{\Theta}$ is a $k\times p$ unknown parameter matrix and $\boldsymbol{\epsilon}$ is an $n\times p$ error matrix.
It is of interest to test the following GLHT problem
H_0: \boldsymbol{C\Theta}=\boldsymbol{0}, \quad \text { vs. } H_1: \boldsymbol{C\Theta} \neq \boldsymbol{0}.
Yamada and Srivastava (2012) proposed the following test statistic:
T_{YS}=\frac{(n-k)\operatorname{tr}(\boldsymbol{S}_h\boldsymbol{D}_{\boldsymbol{S}_e}^{-1})-(n-k)pq/(n-k-2)}{\sqrt{2q[\operatorname{tr}(\boldsymbol{R}^2)-p^2/(n-k)]c_{p,n}}},
where $\boldsymbol{S}_h$ and $\boldsymbol{S}_e$ are the variation matrices due to the hypothesis and the error, respectively, $\boldsymbol{D}_{\boldsymbol{S}_e}$ is the diagonal matrix formed from the diagonal elements of $\boldsymbol{S}_e$, $\boldsymbol{R}$ is the sample correlation matrix, and $c_{p, n}$ is the adjustment coefficient proposed by Yamada and Srivastava (2012).
They showed that under the null hypothesis, $T_{YS}$ is asymptotically normally distributed.
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Yamada T, Srivastava MS (2012). “A test for multivariate analysis of variance in high dimension.” Communications in Statistics-Theory and Methods, 41(13-14), 2602–2615. doi:10.1080/03610926.2011.581786.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
X <- matrix(c(rep(1,n[1]),rep(0,sum(n)),rep(1,n[2]), rep(0,sum(n)),rep(1,n[3]),
rep(0,sum(n)),rep(1,n[4])),ncol=k,nrow=sum(n))
q <- k-1
C <- cbind(diag(q),-rep(1,q))
YS2012.GLHT.NABT(Y,X,C,n,p)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for GLHT problem proposed by Zhang et al. (2017)
Description
Zhang et al. (2017)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
ZGZ2017.GLHT.2cNRT(Y,G,n,p)
Arguments
Y |
A list of |
G |
A known full-rank coefficient matrix ( |
n |
A vector of |
p |
The dimension of data. |
Details
Suppose we have the following $k$ independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},\;i=1,\ldots,k.
It is of interest to test the following GLHT problem:
H_0: \boldsymbol{G M}=\boldsymbol{0}, \quad \text { vs. } \quad H_1: \boldsymbol{G M} \neq \boldsymbol{0},
where $\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top$ is a $k\times p$ matrix collecting the $k$ mean vectors and $\boldsymbol{G}:q\times k$ is a known full-rank coefficient matrix with $\operatorname{rank}(\boldsymbol{G})<k$.
Zhang et al. (2017) proposed the following test statistic:
T_{ZGZ}=\|\boldsymbol{C \hat{\mu}}\|^2,
where $\boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p$, and $\hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top$, with $\bar{\boldsymbol{y}}_{i},i=1,\ldots,k$ being the sample mean vectors and $\boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)$.
They showed that under the null hypothesis, $T_{ZGZ}$ and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
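The Kronecker form of the statistic above is convenient for theory but expensive to build, since C has q p rows and k p columns. The base R sketch below evaluates the statistic both ways on simulated data and can be used to confirm they agree; the inverse square root is obtained from an eigendecomposition, and all variable names are hypothetical. This is not the code behind ZGZ2017.GLHT.2cNRT.

```r
set.seed(6)
n_i <- c(5, 6, 7); k <- length(n_i); p <- 4
Y <- lapply(n_i, function(m) matrix(rnorm(m * p), m, p))  # list of n_i x p matrices
G <- cbind(diag(k - 1), rep(-1, k - 1))
D <- diag(1 / n_i)

M_hat  <- t(sapply(Y, colMeans))   # k x p, row i is the i-th sample mean
mu_hat <- as.vector(t(M_hat))      # stacked means, length k * p

# Inverse symmetric square root of G D G' via its eigendecomposition.
GDG <- G %*% D %*% t(G)
eig <- eigen(GDG, symmetric = TRUE)
inv_sqrt <- eig$vectors %*%
  diag(1 / sqrt(eig$values), nrow = length(eig$values)) %*% t(eig$vectors)

# Literal Kronecker form: C = [(G D G')^{-1/2} G] (x) I_p.
C <- kronecker(inv_sqrt %*% G, diag(p))
T_kron <- sum((C %*% mu_hat)^2)

# Equivalent form that never builds C: tr[(G M)' (G D G')^{-1} (G M)].
GM <- G %*% M_hat
T_ZGZ <- sum(GM * (solve(GDG) %*% GM))

c(T_kron, T_ZGZ)
```

The second form only involves q x q and q x p matrices, so it scales to large p.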
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Guo J, Zhou B (2017). “Linear hypothesis testing in high-dimensional one-way MANOVA.” Journal of Multivariate Analysis, 155, 200–216. doi:10.1016/j.jmva.2017.01.002.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
G <- cbind(diag(k-1),rep(-1,k-1))
ZGZ2017.GLHT.2cNRT(Y,G,n,p)
Normal-approximation-based test for GLHT problem under heteroscedasticity proposed by Zhou et al. (2017)
Description
Zhou et al. (2017)'s test for general linear hypothesis testing (GLHT) problem for high-dimensional data under heteroscedasticity.
Usage
ZGZ2017.GLHTBF.NABT(Y,G,n,p)
Arguments
Y |
A list of |
G |
A known full-rank coefficient matrix ( |
n |
A vector of |
p |
The dimension of data. |
Details
Suppose we have the following $k$ independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,\ldots,k.
It is of interest to test the following GLHT problem:
H_0: \boldsymbol{G M}=\boldsymbol{0}, \quad \text { vs. } H_1: \boldsymbol{G M} \neq \boldsymbol{0},
where $\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top$ is a $k\times p$ matrix collecting the $k$ mean vectors and $\boldsymbol{G}:q\times k$ is a known full-rank coefficient matrix with $\operatorname{rank}(\boldsymbol{G})<k$.
Let $\bar{\boldsymbol{y}}_{i},i=1,\ldots,k$ be the sample mean vectors and $\hat{\boldsymbol{\Sigma}}_i,i=1,\ldots,k$ be the sample covariance matrices.
Zhou et al. (2017) proposed the following U-statistic-based test statistic:
T_{ZGZ}=\|\boldsymbol{C \hat{\mu}}\|^2-\sum_{i=1}^k h_{ii}\operatorname{tr}(\hat{\boldsymbol{\Sigma}}_i)/n_i,
where $\boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p$, $\boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)$, and $h_{ij}$ is the $(i,j)$th entry of the $k\times k$ matrix $\boldsymbol{H}=\boldsymbol{G}^\top(\boldsymbol{G}\boldsymbol{D}\boldsymbol{G}^\top)^{-1}\boldsymbol{G}$.
They showed that under the null hypothesis, $T_{ZGZ}$ is asymptotically normally distributed.
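A base R sketch of the statistic above, on simulated groups with unequal variances, is given below. It uses the identity that the squared norm of C applied to the stacked means equals the sum over i, j of h_ij times the inner product of the i-th and j-th sample means; all variable names are hypothetical and this is not the code behind ZGZ2017.GLHTBF.NABT.

```r
set.seed(7)
n_i <- c(6, 8, 10); k <- length(n_i); p <- 20
sds <- c(1, 2, 3)                                  # heteroscedastic groups
Y <- lapply(seq_len(k), function(i)
  matrix(rnorm(n_i[i] * p, sd = sds[i]), n_i[i], p))

G <- cbind(diag(k - 1), rep(-1, k - 1))
D <- diag(1 / n_i)
H <- t(G) %*% solve(G %*% D %*% t(G)) %*% G        # k x k

M_hat    <- t(sapply(Y, colMeans))                 # k x p sample means
tr_Sigma <- sapply(Y, function(y) sum(apply(y, 2, var)))  # tr of each sample covariance

# ||C mu_hat||^2 = sum_ij h_ij * ybar_i' ybar_j
quad <- sum(H * tcrossprod(M_hat))

# Subtract the bias term sum_i h_ii tr(Sigma_hat_i) / n_i.
T_ZGZ_BF <- quad - sum(diag(H) * tr_Sigma / n_i)
T_ZGZ_BF
```

Each group contributes its own trace term, which is what allows the covariance matrices to differ across groups.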
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhou B, Guo J, Zhang J (2017). “High-dimensional general linear hypothesis testing under heteroscedasticity.” Journal of Statistical Planning and Inference, 188, 36–54. doi:10.1016/j.jspi.2017.03.005.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
G <- cbind(diag(k-1),rep(-1,k-1))
ZGZ2017.GLHTBF.NABT(Y,G,n,p)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for two-sample problem proposed by Zhang et al. (2020)
Description
Zhang et al. (2020)'s test for the equality of two-sample high-dimensional mean vectors, assuming that the two covariance matrices are the same.
Usage
ZGZC2020.TS.2cNRT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang et al. (2020) proposed the following test statistic:
T_{ZGZC} = \frac{n_1n_2}{n} \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2,
where n=n_1+n_2 is the total sample size and \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors.
They showed that under the null hypothesis, T_{ZGZC}
and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
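A two-cumulant matched \chi^2-approximation replaces the null distribution of the statistic by that of \beta\chi^2_d, with \beta and d chosen so that the mean and variance agree. The sketch below shows the idea in base R on simulated data. It uses naive plug-in estimates of \operatorname{tr}(\boldsymbol{\Sigma}) and \operatorname{tr}(\boldsymbol{\Sigma}^2) from the pooled sample covariance matrix, which is a simplification assumed for this example and is not claimed to be the estimator used by ZGZC2020.TS.2cNRT.

```r
set.seed(2)
n1 <- 15; n2 <- 20; p <- 40
n <- n1 + n2
# Illustrative data generated under the null hypothesis
y1 <- matrix(rnorm(n1 * p), n1, p)
y2 <- matrix(rnorm(n2 * p), n2, p)

d_bar <- colMeans(y1) - colMeans(y2)
T_stat <- n1 * n2 / n * sum(d_bar^2)

# Pooled sample covariance (naive plug-in; an assumption of this sketch)
S <- ((n1 - 1) * cov(y1) + (n2 - 1) * cov(y2)) / (n - 2)
trS  <- sum(diag(S))   # estimate of tr(Sigma)
trS2 <- sum(S^2)       # tr(S %*% S) for a symmetric matrix

# Match mean beta * d = tr(Sigma) and variance 2 * beta^2 * d = 2 * tr(Sigma^2)
beta <- trS2 / trS
d    <- trS^2 / trS2
p_value <- pchisq(T_stat / beta, df = d, lower.tail = FALSE)
c(statistic = T_stat, beta = beta, df = d, p.value = p_value)
```

Naive plug-in estimates of \operatorname{tr}(\boldsymbol{\Sigma}^2) are known to be biased when p is large relative to n, so this sketch should be read as an explanation of the matching step only.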
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Guo J, Zhou B, Cheng M (2020). “A simple two-sample test in high dimensions based on L2-norm.” Journal of the American Statistical Association, 115(530), 1011–1027. doi:10.1080/01621459.2019.1604366.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZGZC2020.TS.2cNRT(group1, group2)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for two-sample BF problem proposed by Zhu et al. (2023)
Description
Zhu et al. (2023)'s test for the equality of two-sample high-dimensional mean vectors, without assuming that the two covariance matrices are the same.
Usage
ZWZ2023.TSBF.2cNRT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,\; i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhu et al. (2023) proposed the following test statistic:
T_{ZWZ}=\frac{n_1n_2n^{-1}\|\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2\|^2}{\operatorname{tr}(\hat{\boldsymbol{\Omega}}_n)},
where n=n_1+n_2 is the total sample size, \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors and \hat{\boldsymbol{\Omega}}_n
is the estimator of \operatorname{Cov}[(n_1n_2/n)^{1/2}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2)]
.
They showed that under the null hypothesis, T_{ZWZ}
and an F-type mixture have the same normal or non-normal limiting distribution.
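The ratio form of T_{ZWZ} can be sketched in base R as follows. The example assumes the natural plug-in estimator \hat{\boldsymbol{\Omega}}_n=n_2\hat{\boldsymbol{\Sigma}}_1/n+n_1\hat{\boldsymbol{\Sigma}}_2/n for the covariance in the Details; the package may estimate it differently. The function name `f_type_stat` is hypothetical and the data are simulated.

```r
# Hypothetical helper: F-type ratio statistic with a plug-in trace estimate
f_type_stat <- function(y1, y2) {
  n1 <- nrow(y1); n2 <- nrow(y2); n <- n1 + n2
  numer <- n1 * n2 / n * sum((colMeans(y1) - colMeans(y2))^2)
  # tr(Omega_hat) only needs the column variances, not a p x p matrix
  tr_Omega <- (n2 * sum(apply(y1, 2, var)) + n1 * sum(apply(y2, 2, var))) / n
  numer / tr_Omega
}

set.seed(3)
p <- 60
y1 <- matrix(rnorm(12 * p), 12, p)           # group 1, unit variance
y2 <- matrix(rnorm(20 * p, sd = 2), 20, p)   # group 2, larger variance
T_ZWZ <- f_type_stat(y1, y2)
T_ZWZ
```

Because numerator and denominator both scale with the square of the data, the ratio is unchanged when all observations are multiplied by a common constant.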
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhu T, Wang P, Zhang J (2023). “Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference F-type test.” Computational Statistics, 1–24. doi:10.1007/s00180-023-01433-6.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZWZ2023.TSBF.2cNRT(group1, group2)
Normal-reference-test with three-cumulant (3-c) matched $\chi^2$-approximation for GLHT problem proposed by Zhu and Zhang (2022)
Description
Zhu and Zhang (2022)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
ZZ2022.GLHT.3cNRT(Y,G,n,p)
Arguments
Y |
A list of |
G |
A known full-rank coefficient matrix ( |
n |
A vector of |
p |
The dimension of data. |
Details
Suppose we have the following k
independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},\; i=1,\ldots,k.
It is of interest to test the following GLHT problem:
H_0: \boldsymbol{G M}=\boldsymbol{0}, \quad \text { vs. } \quad H_1: \boldsymbol{G M} \neq \boldsymbol{0},
where
\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top
is a k\times p
matrix collecting k
mean vectors and \boldsymbol{G}:q\times k
is a known full-rank coefficient matrix with \operatorname{rank}(\boldsymbol{G})<k
.
Zhu and Zhang (2022) proposed the following test statistic:
T_{ZZ}=\|\boldsymbol{C} \hat{\boldsymbol{\mu}}\|^2-q \operatorname{tr}(\hat{\boldsymbol{\Sigma}}),
where \boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p
, \boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)
, and \hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top
, with \bar{\boldsymbol{y}}_{i},i=1,\ldots,k
being the sample mean vectors and \hat{\boldsymbol{\Sigma}}
being the usual pooled sample covariance matrix of the k
samples.
They showed that under the null hypothesis, T_{ZZ}
and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
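The statistic above can be evaluated without building \boldsymbol{C} explicitly. The following base-R sketch is one way to do so, under the assumption that \boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k) as in the other GLHT help pages. The helper `glht_stat` is a hypothetical name introduced for this example and is not part of the package.

```r
# Hypothetical helper: ||C mu_hat||^2 - q * tr(pooled covariance)
glht_stat <- function(Y, G) {
  n <- vapply(Y, nrow, integer(1))
  q <- nrow(G)
  H <- t(G) %*% solve(G %*% diag(1 / n) %*% t(G)) %*% G
  Mhat <- t(sapply(Y, colMeans))             # k x p matrix of sample means
  # Trace of the pooled covariance: within-group sum of squares / (n - k)
  ss <- sum(vapply(Y, function(y) sum(scale(y, scale = FALSE)^2), numeric(1)))
  tr_pooled <- ss / (sum(n) - length(Y))
  sum(H * tcrossprod(Mhat)) - q * tr_pooled
}

set.seed(4)
p <- 30
Y <- lapply(c(9, 11, 14), function(ni) matrix(rnorm(ni * p), ni, p))
G <- cbind(diag(2), rep(-1, 2))              # tests mu_1 = mu_2 = mu_3
T_ZZ <- glht_stat(Y, G)
T_ZZ
```

With k = 2 and G = (1, -1), this expression reduces to n_1n_2/n times the squared distance between the two sample means minus the trace of the pooled covariance, which is a convenient sanity check.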
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhu T, Zhang J (2022). “Linear hypothesis testing in high-dimensional one-way MANOVA: a new normal reference approach.” Computational Statistics, 37(1), 1–27. doi:10.1007/s00180-021-01110-6.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
G <- cbind(diag(k-1),rep(-1,k-1))
ZZ2022.GLHT.3cNRT(Y,G,n,p)
Normal-reference-test with three-cumulant (3-c) matched $\chi^2$-approximation for GLHT problem under heteroscedasticity proposed by Zhang and Zhu (2022)
Description
Zhang and Zhu (2022)'s test for general linear hypothesis testing (GLHT) problem for high-dimensional data under heteroscedasticity.
Usage
ZZ2022.GLHTBF.3cNRT(Y,G,n,p)
Arguments
Y |
A list of |
G |
A known full-rank coefficient matrix ( |
n |
A vector of |
p |
The dimension of data. |
Details
Suppose we have the following k
independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,\ldots,k.
It is of interest to test the following GLHT problem:
H_0: \boldsymbol{G M}=\boldsymbol{0}, \quad \text { vs. } H_1: \boldsymbol{G M} \neq \boldsymbol{0},
where
\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top
is a k\times p
matrix collecting k
mean vectors and \boldsymbol{G}:q\times k
is a known full-rank coefficient matrix with \operatorname{rank}(\boldsymbol{G})<k
.
Let \bar{\boldsymbol{y}}_{i},i=1,\ldots,k
be the sample mean vectors and \hat{\boldsymbol{\Sigma}}_i,i=1,\ldots,k
be the sample covariance matrices.
Zhang and Zhu (2022) proposed the following U-statistic based test statistic:
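T_{ZZ}=\|\boldsymbol{C} \hat{\boldsymbol{\mu}}\|^2-\sum_{i=1}^k h_{ii}\operatorname{tr}(\hat{\boldsymbol{\Sigma}}_i)/n_i,
where \hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top
, \boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p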
, \boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)
, and h_{ij}
is the (i,j)
th entry of the k\times k
matrix \boldsymbol{H}=\boldsymbol{G}^\top(\boldsymbol{G}\boldsymbol{D}\boldsymbol{G}^\top)^{-1}\boldsymbol{G}
.
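One way to see why the statistic is called U-statistic based: for a single group, subtracting \operatorname{tr}(\hat{\boldsymbol{\Sigma}}_i)/n_i from \|\bar{\boldsymbol{y}}_i\|^2 leaves exactly the average of the inner products between distinct observations, so the within-observation terms drop out. The base-R sketch below checks this identity numerically on simulated data; all names are chosen for the example.

```r
set.seed(5)
n_i <- 9; p <- 25
y <- matrix(rnorm(n_i * p, mean = 0.5), nrow = n_i, ncol = p)

# Bias-corrected squared norm of the sample mean
ybar <- colMeans(y)
corrected <- sum(ybar^2) - sum(apply(y, 2, var)) / n_i

# U-statistic: average of y_r' y_s over all ordered pairs with r != s
gram <- tcrossprod(y)                        # n_i x n_i inner products
u_stat <- (sum(gram) - sum(diag(gram))) / (n_i * (n_i - 1))

all.equal(corrected, u_stat)
```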
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Zhu T (2022). “A new normal reference test for linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA.” Computational Statistics & Data Analysis, 168, 107385. doi:10.1016/j.csda.2021.107385.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
G <- cbind(diag(k-1),rep(-1,k-1))
ZZ2022.GLHTBF.3cNRT(Y,G,n,p)
Normal-reference-test with three-cumulant (3-c) matched $\chi^2$-approximation for two-sample problem proposed by Zhang and Zhu (2022)
Description
Zhang and Zhu (2022)'s test for the equality of two-sample high-dimensional mean vectors, assuming that the two covariance matrices are the same.
Usage
ZZ2022.TS.3cNRT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang and Zhu (2022) proposed the following test statistic:
T_{ZZ} = \frac{n_1n_2}{n} \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2-\operatorname{tr}(\hat{\boldsymbol{\Sigma}}),
where n=n_1+n_2 is the total sample size, \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors and \hat{\boldsymbol{\Sigma}}
is the pooled sample covariance matrix.
They showed that under the null hypothesis, T_{ZZ}
and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
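Under the null hypothesis with a common covariance matrix, the first term of the statistic has expectation \operatorname{tr}(\boldsymbol{\Sigma}) and the pooled trace is unbiased for \operatorname{tr}(\boldsymbol{\Sigma}), so the statistic is centred at zero. The small Monte Carlo sketch below illustrates this in base R with simulated standard normal data; the settings (sample sizes, dimension, number of replications) are arbitrary choices for the example.

```r
set.seed(6)
n1 <- 10; n2 <- 12; p <- 20
n <- n1 + n2

one_draw <- function() {
  y1 <- matrix(rnorm(n1 * p), n1, p)
  y2 <- matrix(rnorm(n2 * p), n2, p)
  # Trace of the pooled covariance from within-group sums of squares
  ss <- sum(scale(y1, scale = FALSE)^2) + sum(scale(y2, scale = FALSE)^2)
  n1 * n2 / n * sum((colMeans(y1) - colMeans(y2))^2) - ss / (n - 2)
}

sims <- replicate(2000, one_draw())
c(mean = mean(sims), sd = sd(sims))
```

The simulated mean should be close to zero relative to the spread of the statistic.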
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Zhu T (2022). “A revisit to Bai–Saranadasa's two-sample test.” Journal of Nonparametric Statistics, 34(1), 58–76. doi:10.1080/10485252.2021.2015768.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZZ2022.TS.3cNRT(group1, group2)
Normal-reference-test with three-cumulant (3-c) matched $\chi^2$-approximation for two-sample BF problem proposed by Zhang and Zhu (2022)
Description
Zhang and Zhu (2022)'s test for the equality of two-sample high-dimensional mean vectors, without assuming that the two covariance matrices are the same.
Usage
ZZ2022.TSBF.3cNRT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang and Zhu (2022) proposed the following test statistic:
T_{ZZ} = \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2-\operatorname{tr}(\hat{\boldsymbol{\Omega}}_n),
where \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors and \hat{\boldsymbol{\Omega}}_n
is the estimator of \operatorname{Cov}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2)
.
They showed that under the null hypothesis, T_{ZZ}
and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
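A minimal base-R sketch of this statistic follows, assuming the plug-in estimator \hat{\boldsymbol{\Omega}}_n=\hat{\boldsymbol{\Sigma}}_1/n_1+\hat{\boldsymbol{\Sigma}}_2/n_2 for \operatorname{Cov}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2). Only the trace is needed, so the column variances suffice and no p x p matrix has to be formed. The helper name `col_var` and the simulated data are assumptions of the example.

```r
set.seed(7)
n1 <- 12; n2 <- 18; p <- 30
y1 <- matrix(rnorm(n1 * p), n1, p)
y2 <- matrix(rnorm(n2 * p, sd = 2), n2, p)   # unequal covariances

col_var <- function(y) apply(y, 2, var)      # hypothetical helper
# tr(Omega_hat) = tr(Sigma_hat_1) / n1 + tr(Sigma_hat_2) / n2
tr_Omega <- sum(col_var(y1)) / n1 + sum(col_var(y2)) / n2
T_ZZ <- sum((colMeans(y1) - colMeans(y2))^2) - tr_Omega
T_ZZ
```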
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Zhu T (2022). “A further study on Chen-Qin’s test for two-sample Behrens–Fisher problems for high-dimensional data.” Journal of Statistical Theory and Practice, 16(1), 1. doi:10.1007/s42519-021-00232-w.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZZ2022.TSBF.3cNRT(group1, group2)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for GLHT problem under heteroscedasticity proposed by Zhang et al. (2022)
Description
Zhang et al. (2022)'s test for general linear hypothesis testing (GLHT) problem for high-dimensional data under heteroscedasticity.
Usage
ZZG2022.GLHTBF.2cNRT(Y,G,n,p)
Arguments
Y |
A list of |
G |
A known full-rank coefficient matrix ( |
n |
A vector of |
p |
The dimension of data. |
Details
Suppose we have the following k
independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,\ldots,k.
It is of interest to test the following GLHT problem:
H_0: \boldsymbol{G M}=\boldsymbol{0}, \quad \text { vs. } \; H_1: \boldsymbol{G M} \neq \boldsymbol{0},
where
\boldsymbol{M}=(\boldsymbol{\mu}_1,\ldots,\boldsymbol{\mu}_k)^\top
is a k\times p
matrix collecting k
mean vectors and \boldsymbol{G}:q\times k
is a known full-rank coefficient matrix with \operatorname{rank}(\boldsymbol{G})<k
.
Zhang et al. (2022) proposed the following test statistic:
T_{ZZG}=\|\boldsymbol{C} \hat{\boldsymbol{\mu}}\|^2,
where \boldsymbol{C}=[(\boldsymbol{G D G}^\top)^{-1/2}\boldsymbol{G}]\otimes\boldsymbol{I}_p
with \boldsymbol{D}=\operatorname{diag}(1/n_1,\ldots,1/n_k)
, and \hat{\boldsymbol{\mu}}=(\bar{\boldsymbol{y}}_1^\top,\ldots,\bar{\boldsymbol{y}}_k^\top)^\top
with \bar{\boldsymbol{y}}_{i},i=1,\ldots,k
being the sample mean vectors.
They showed that under the null hypothesis, T_{ZZG}
and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
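To make the role of \boldsymbol{C} concrete, the sketch below builds it explicitly with a Kronecker product for a small dimension and evaluates T_{ZZG}. This is meant as an illustration of the definition, not as an efficient or official implementation. The inverse square root is obtained from an eigendecomposition, and all variable names and the simulated data are assumptions of the example.

```r
set.seed(8)
k <- 3; p <- 4
n <- c(6, 7, 8)
Y <- lapply(n, function(ni) matrix(rnorm(ni * p), ni, p))

G <- cbind(diag(k - 1), rep(-1, k - 1))
D <- diag(1 / n)

# (G D G')^{-1/2} via the eigendecomposition of the symmetric matrix G D G'
e <- eigen(G %*% D %*% t(G), symmetric = TRUE)
inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)

C <- kronecker(inv_sqrt %*% G, diag(p))      # (q p) x (k p) matrix
mu_hat <- unlist(lapply(Y, colMeans))        # stacked sample means, length k p
T_ZZG <- sum((C %*% mu_hat)^2)
T_ZZG
```

For realistic p, the same value can be obtained from the k x k matrix \boldsymbol{G}^\top(\boldsymbol{G}\boldsymbol{D}\boldsymbol{G}^\top)^{-1}\boldsymbol{G} and the inner products of the sample means, which avoids the large Kronecker product.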
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Zhou B, Guo J (2022). “Linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA: A normal reference L^2-norm based test.” Journal of Multivariate Analysis, 187, 104816. doi:10.1016/j.jmva.2021.104816.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
G <- cbind(diag(k-1),rep(-1,k-1))
ZZG2022.GLHTBF.2cNRT(Y,G,n,p)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for two-sample BF problem proposed by Zhang et al. (2021)
Description
Zhang et al. (2021)'s test for the equality of two-sample high-dimensional mean vectors, without assuming that the two covariance matrices are the same.
Usage
ZZGZ2021.TSBF.2cNRT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang et al. (2021) proposed the following test statistic:
T_{ZZGZ} = \frac{n_1n_2}{n} \|\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2\|^2,
where n=n_1+n_2 is the total sample size and \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors.
They showed that under the null hypothesis, T_{ZZGZ}
and a chi-squared-type mixture have the same normal or non-normal limiting distribution.
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang J, Zhou B, Guo J, Zhu T (2021). “Two-sample Behrens-Fisher problems for high-dimensional data: A normal reference approach.” Journal of Statistical Planning and Inference, 213, 142–161. doi:10.1016/j.jspi.2020.11.008.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZZGZ2021.TSBF.2cNRT(group1, group2)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for two-sample problem proposed by Zhang et al. (2020)
Description
Zhang et al. (2020)'s test for the equality of two-sample high-dimensional mean vectors, assuming that the two covariance matrices are the same.
Usage
ZZZ2020.TS.2cNRT(y1, y2)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma},i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang et al. (2020) proposed the following test statistic:
T_{ZZZ} = \frac{n_1n_2}{np}(\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2)^\top \hat{\boldsymbol{D}}^{-1}(\bar{\boldsymbol{y}}_1 - \bar{\boldsymbol{y}}_2),
where n=n_1+n_2 is the total sample size, \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors, and \hat{\boldsymbol{D}}
is the diagonal matrix formed from the diagonal elements of the sample covariance matrix.
They showed that under the null hypothesis, T_{ZZZ}
and a chi-squared-type mixture have the same limiting distribution.
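Standardizing each coordinate by its estimated variance makes the statistic invariant to the measurement units of the individual variables. The base-R sketch below illustrates this, assuming that \hat{\boldsymbol{D}} holds the pooled sample variances of the p variables. The function name `scale_inv_stat` is hypothetical and the data are simulated.

```r
# Hypothetical helper: scale-invariant two-sample statistic
scale_inv_stat <- function(y1, y2) {
  n1 <- nrow(y1); n2 <- nrow(y2); n <- n1 + n2; p <- ncol(y1)
  # Pooled variance of each variable (diagonal of the pooled covariance)
  pooled_var <- ((n1 - 1) * apply(y1, 2, var) +
                 (n2 - 1) * apply(y2, 2, var)) / (n - 2)
  d <- colMeans(y1) - colMeans(y2)
  n1 * n2 / (n * p) * sum(d^2 / pooled_var)
}

set.seed(9)
y1 <- matrix(rnorm(10 * 15), 10, 15)
y2 <- matrix(rnorm(14 * 15), 14, 15)
s <- runif(15, 0.1, 10)                      # arbitrary per-variable rescaling

t_raw    <- scale_inv_stat(y1, y2)
t_scaled <- scale_inv_stat(sweep(y1, 2, s, "*"), sweep(y2, 2, s, "*"))
all.equal(t_raw, t_scaled)
```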
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang L, Zhu T, Zhang J (2020). “A simple scale-invariant two-sample test for high-dimensional data.” Econometrics and Statistics, 14, 131–144. doi:10.1016/j.ecosta.2019.12.002.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZZZ2020.TS.2cNRT(group1,group2)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for GLHT problem proposed by Zhu et al. (2022)
Description
Zhu et al. (2022)'s test for the general linear hypothesis testing (GLHT) problem for high-dimensional data, assuming that the underlying covariance matrices are the same.
Usage
ZZZ2022.GLHT.2cNRT(Y,X,C,n,p)
Arguments
Y |
A list of |
X |
A known |
C |
A known matrix of size |
n |
A vector of |
p |
The dimension of data. |
Details
A high-dimensional linear regression model can be expressed as
\boldsymbol{Y}=\boldsymbol{X\Theta}+\boldsymbol{\epsilon},
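where \boldsymbol{Y}
is the n\times p
response matrix, \boldsymbol{X}
is the known n\times k
design matrix, \boldsymbol{\Theta}
is a k\times p
unknown parameter matrix and \boldsymbol{\epsilon}
is an n\times p
error matrix.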
It is of interest to test the following GLHT problem:
H_0: \boldsymbol{C\Theta}=\boldsymbol{0}, \quad \text { vs. } H_1: \boldsymbol{C\Theta} \neq \boldsymbol{0}.
Zhu et al. (2022) proposed the following test statistic:
T_{ZZZ}=\frac{(n-k-2)}{(n-k)pq}\operatorname{tr}(\boldsymbol{S}_h\boldsymbol{D}^{-1}),
where q
is the number of rows of \boldsymbol{C}
, \boldsymbol{S}_h
and \boldsymbol{S}_e
are the variation matrices due to the hypothesis and error, respectively, and \boldsymbol{D}
is the diagonal matrix with the diagonal elements of \boldsymbol{S}_e/(n-k)
.
They showed that under the null hypothesis, T_{ZZZ}
and a chi-squared-type mixture have the same limiting distribution.
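The following base-R sketch evaluates the statistic under the usual multivariate linear model definitions, which are assumed here because the Details do not spell them out: \hat{\boldsymbol{\Theta}}=(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{X}^\top\boldsymbol{Y}, \boldsymbol{S}_h=(\boldsymbol{C}\hat{\boldsymbol{\Theta}})^\top[\boldsymbol{C}(\boldsymbol{X}^\top\boldsymbol{X})^{-1}\boldsymbol{C}^\top]^{-1}(\boldsymbol{C}\hat{\boldsymbol{\Theta}}) and \boldsymbol{S}_e equal to the residual cross-product matrix. The function `glht_reg_stat` is a hypothetical name, the coefficient matrix is called `Cmat` to avoid clashing with other objects, and the data are simulated.

```r
# Hypothetical helper; only the diagonals of S_h and S_e are needed
glht_reg_stat <- function(Y, X, Cmat) {
  n <- nrow(X); k <- ncol(X); p <- ncol(Y); q <- nrow(Cmat)
  XtX_inv <- solve(crossprod(X))
  Theta_hat <- XtX_inv %*% crossprod(X, Y)       # k x p least-squares fit
  CT <- Cmat %*% Theta_hat                       # q x p
  W <- solve(Cmat %*% XtX_inv %*% t(Cmat))
  diag_Sh <- colSums(CT * (W %*% CT))            # diagonal of S_h
  diag_Se <- colSums((Y - X %*% Theta_hat)^2)    # diagonal of S_e
  (n - k - 2) / ((n - k) * p * q) * sum(diag_Sh / (diag_Se / (n - k)))
}

set.seed(10)
n_i <- c(8, 10, 12); k <- length(n_i); p <- 30
group <- rep(seq_len(k), times = n_i)
X <- outer(group, seq_len(k), "==") * 1          # n x k group-indicator design
Y <- matrix(rnorm(sum(n_i) * p), sum(n_i), p)
Cmat <- cbind(diag(k - 1), -rep(1, k - 1))       # tests equality of the k rows
T_ZZZ <- glht_reg_stat(Y, X, Cmat)
T_ZZZ
```

Building the indicator design from a group label, as done here with `outer`, yields the same kind of block design matrix as the `matrix(c(rep(...)))` construction in the Examples.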
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhu T, Zhang L, Zhang J (2023). “Hypothesis Testing in High-Dimensional Linear Regression: A Normal Reference Scale-Invariant Test.” Statistica Sinica. doi:10.5705/ss.202020.0362.
Examples
library("HDNRA")
data("corneal")
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- dim(corneal)[2]
Y <- list()
k <- 4
Y[[1]] <- group1
Y[[2]] <- group2
Y[[3]] <- group3
Y[[4]] <- group4
n <- c(nrow(Y[[1]]),nrow(Y[[2]]),nrow(Y[[3]]),nrow(Y[[4]]))
X <- matrix(c(rep(1,n[1]),rep(0,sum(n)),rep(1,n[2]), rep(0,sum(n)),
rep(1,n[3]),rep(0,sum(n)),rep(1,n[4])),ncol=k,nrow=sum(n))
q <- k-1
C <- cbind(diag(q),-rep(1,q))
ZZZ2022.GLHT.2cNRT(Y,X,C,n,p)
Normal-reference-test with two-cumulant (2-c) matched $\chi^2$-approximation for two-sample BF problem proposed by Zhang et al. (2023)
Description
Zhang et al. (2023)'s test for the equality of two-sample high-dimensional mean vectors, without assuming that the two covariance matrices are the same.
Usage
ZZZ2023.TSBF.2cNRT(y1, y2, cutoff)
Arguments
y1 |
The data matrix ( |
y2 |
The data matrix ( |
cutoff |
An empirical criterion for applying the adjustment coefficient |
Details
Suppose we have two independent high-dimensional samples:
\boldsymbol{y}_{i1},\ldots,\boldsymbol{y}_{in_i}, \;\operatorname{are \; i.i.d. \; with}\; \operatorname{E}(\boldsymbol{y}_{i1})=\boldsymbol{\mu}_i,\; \operatorname{Cov}(\boldsymbol{y}_{i1})=\boldsymbol{\Sigma}_i,i=1,2.
The primary objective is to test
H_{0}: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2\; \operatorname{versus}\; H_{1}: \boldsymbol{\mu}_1 \neq \boldsymbol{\mu}_2.
Zhang et al. (2023) proposed the following test statistic:
T_{ZZZ}=\frac{n_1 n_2}{np}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2)^{\top} \hat{\boldsymbol{D}}_n^{-1}(\bar{\boldsymbol{y}}_1-\bar{\boldsymbol{y}}_2),
where \bar{\boldsymbol{y}}_{i},i=1,2
are the sample mean vectors, and \hat{\boldsymbol{D}}_n=\operatorname{diag}(\hat{\boldsymbol{\Sigma}}_1/n+\hat{\boldsymbol{\Sigma}}_2/n)
with n=n_1+n_2
.
They showed that under the null hypothesis, T_{ZZZ}
and a chi-squared-type mixture have the same limiting distribution.
Value
A list of class "NRtest"
containing the results of the hypothesis test. See the help file for NRtest.object
for details.
References
Zhang L, Zhu T, Zhang J (2023). “Two-sample Behrens–Fisher problems for high-dimensional data: a normal reference scale-invariant test.” Journal of Applied Statistics, 50(3), 456–476. doi:10.1080/02664763.2020.1834516.
Examples
library("HDNRA")
data("COVID19")
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
ZZZ2023.TSBF.2cNRT(group1,group2,cutoff=1.2)
HDNRA_data corneal
Description
This dataset was acquired during a keratoconus study, a collaborative project involving Ms. Nancy Tripoli and Dr. Kenneth L. Cohen of the Department of Ophthalmology at the University of North Carolina, Chapel Hill. The fitted feature vectors for the complete corneal surface dataset are collected into a feature matrix with dimensions 150 × 2000.
Usage
data(corneal)
Format
'corneal'
A data frame with 150 observations from the following 4 groups.
- normal group1
Rows 1 to 43 (43 rows in total) of the feature matrix correspond to observations from the normal group.
- unilateral suspect group2
Rows 44 to 57 (14 rows in total) of the feature matrix correspond to observations from the unilateral suspect group.
- suspect map group3
Rows 58 to 78 (21 rows in total) of the feature matrix correspond to observations from the suspect map group.
- clinical keratoconus group4
Rows 79 to 150 (72 rows in total) of the feature matrix correspond to observations from the clinical keratoconus group.
References
Smaga Ł, Zhang J (2019). “Linear hypothesis testing with functional data.” Technometrics, 61(1), 99–110. doi:10.1080/00401706.2018.1456976.
Examples
library(HDNRA)
data(corneal)
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
dim(group1)
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
dim(group2)
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
dim(group3)
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
dim(group4)
Print Method for S3 Class "NRtest"
Description
Prints the details of the NRtest object in a user-friendly manner. This method provides a clear and concise presentation of the test results contained within the NRtest object, including all relevant statistical metrics and test details.
Usage
## S3 method for class 'NRtest'
print(x, ...)
Arguments
x |
an NRtest object. |
... |
further arguments passed to or from other methods. |
Details
The print.NRtest
function formats and presents the contents of the NRtest
object, which includes statistical test results and related parameters. This
function is designed to provide a user-friendly display of the object's
contents, making it easier to understand the results of the analysis.
Value
Invisibly returns the input x
.
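The pattern described above (format the contents, then return the input invisibly) can be sketched for a toy S3 class as follows. The class name `toyTest` and its fields are hypothetical and chosen only for illustration; they do not describe the actual structure of an NRtest object.

```r
# Print method for a hypothetical S3 class "toyTest"
print.toyTest <- function(x, ...) {
  cat("Toy test result\n")
  cat("statistic =", format(x$statistic), "\n")
  cat("p-value   =", format(x$p.value), "\n")
  invisible(x)   # return the input without auto-printing it again
}

res <- structure(list(statistic = 1.5, p.value = 0.22), class = "toyTest")
out <- withVisible(print(res))   # prints the three lines above
out$visible                      # FALSE: the return value is invisible
```

Returning the object invisibly lets a call to print() be used inside larger expressions or pipelines without printing the result a second time.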
Author(s)
Pengfei Wang <nie23.wp8738@e.ntu.edu.sg>