% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/na.test.R
\name{na.test}
\alias{na.test}
\title{Little's Missing Completely at Random (MCAR) Test}
\usage{
na.test(x, digits = 2, p.digits = 3, as.na = NULL, check = TRUE, output = TRUE)
}
\arguments{
\item{x}{a matrix or data frame with incomplete data, where missing
values are coded as \code{NA}.}

\item{digits}{an integer value indicating the number of decimal places to
be used for displaying results.}

\item{p.digits}{an integer value indicating the number of decimal places to be
used for displaying the \emph{p}-value.}

\item{as.na}{a numeric vector indicating user-defined missing values, i.e.,
these values are converted to \code{NA} before conducting the analysis.}

\item{check}{logical: if \code{TRUE}, argument specification is checked.}

\item{output}{logical: if \code{TRUE}, output is shown.}
}
\value{
Returns an object of class \code{misty.object}, which is a list with the following
entries:
  \item{\code{call}}{function call}
  \item{\code{type}}{type of analysis}
  \item{\code{data}}{matrix or data frame specified in \code{x}}
  \item{\code{args}}{specification of function arguments}
  \item{\code{result}}{result table}
}
\description{
This function performs Little's Missing Completely at Random (MCAR) test.
}
\details{
Little (1988) proposed a multivariate test of Missing Completely at Random (MCAR)
that tests for mean differences on every variable in the data set across subgroups
that share the same missing data pattern by comparing the observed variable means
for each pattern of missing data with the expected population means estimated using
the expectation-maximization (EM) algorithm (i.e., EM maximum likelihood estimates).
The test statistic is the sum of the squared standardized differences between the
subsample means and the expected population means weighted by the estimated
variance-covariance matrix and the number of observations within each subgroup
(Enders, 2010). Under the null hypothesis that data are MCAR, the test statistic
asymptotically follows a chi-square distribution with \eqn{\sum k_j - k} degrees
of freedom, where \eqn{k_j} is the number of complete variables for missing data
pattern \eqn{j}, and \eqn{k} is the total number of variables. A statistically
significant result provides evidence against MCAR.
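
As a sketch of the statistic described above, using notation introduced here
for illustration only (the symbols \eqn{J}, \eqn{n_j}, \eqn{\bar{y}_j}{ybar_j},
\eqn{\hat{\mu}_j}{mu_j}, and \eqn{\hat{\Sigma}_j}{Sigma_j} are assumptions and
are not used elsewhere in this documentation), the test statistic can be written as

\deqn{d^2 = \sum_{j=1}^{J} n_j (\bar{y}_j - \hat{\mu}_j)^{\top} \hat{\Sigma}_j^{-1} (\bar{y}_j - \hat{\mu}_j)}{d^2 = sum_j n_j (ybar_j - mu_j)' Sigma_j^(-1) (ybar_j - mu_j)}

where \eqn{J} is the number of missing data patterns, \eqn{n_j} is the number of
observations in pattern \eqn{j}, \eqn{\bar{y}_j}{ybar_j} is the vector of observed
means of the \eqn{k_j} complete variables in pattern \eqn{j}, and
\eqn{\hat{\mu}_j}{mu_j} and \eqn{\hat{\Sigma}_j}{Sigma_j} are the corresponding
elements of the EM estimates of the mean vector and the variance-covariance matrix.
For example, with \eqn{k = 3} variables and two missing data patterns, one with
all three variables observed (\eqn{k_1 = 3}) and one with two variables observed
(\eqn{k_2 = 2}), the degrees of freedom are \eqn{3 + 2 - 3 = 2}.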

Note that Little's MCAR test has a number of problems (see Enders, 2010).
\strong{First}, the test does not identify the specific variables that violate
MCAR, i.e., it does not identify potential correlates of missingness (i.e.,
auxiliary variables). \strong{Second}, the test assumes multivariate normality,
i.e., under departures from normality the test might be unreliable unless the
sample size is large, and it is not suitable for categorical variables.
\strong{Third}, the test investigates mean differences assuming that the missing
data patterns share a common covariance matrix, i.e., the test cannot detect
covariance-based deviations from MCAR stemming from a Missing at Random (MAR)
or Missing Not at Random (MNAR) mechanism because MAR and MNAR mechanisms can
also produce missing data subgroups with equal means. \strong{Fourth}, simulation
studies suggest that Little's MCAR test suffers from low statistical power,
particularly when the number of variables that violate MCAR is small, the
relationship between the data and missingness is weak, or the data are MNAR
(Thoemmes & Enders, 2007). \strong{Fifth}, the test can only reject, but cannot
prove, the MCAR assumption, i.e., a statistically nonsignificant result that fails
to reject the null hypothesis does not prove that the data are MCAR.
\strong{Finally}, a statistically nonsignificant result is consistent with data
that are MCAR or MNAR, while a statistically significant result indicates that
missing data are MAR or MNAR, i.e., MNAR cannot be ruled out regardless of the
result of the test.

This function is based on the \code{prelim.norm} function in the \pkg{norm}
package, which can handle about 30 variables. With more than 30 variables
specified in the argument \code{x}, the \code{prelim.norm} function might run
into numerical problems, leading to results that are not trustworthy. In this
case, it is recommended to reduce the number of variables specified in the argument
\code{x}. If the number of variables cannot be reduced, it is recommended to
use the \code{LittleMCAR} function in the \pkg{BaylorEdPsych} package, which can
deal with up to 50 variables. However, this package was removed from the CRAN
repository and needs to be obtained from the archive along with the \pkg{mvnmle}
package, which is required by the \code{LittleMCAR} function. Note that the
\code{mcar_test} function in the \pkg{naniar} package is also based on the
\code{prelim.norm} function, so its results are not trustworthy whenever the warning
message \code{In norm::prelim.norm(data) : NAs introduced by coercion to integer range}
is printed on the console.
}
\note{
Code is adapted from the R function by Eric Stemmler:
tinyurl.com/r-function-for-MCAR-test
}
\examples{
na.test(airquality)
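
# A hedged sketch of further usage, based only on the arguments and return
# value documented above. The object name 'res' is illustrative.

# Suppress the printed output and store the returned misty.object
res <- na.test(airquality, output = FALSE)

# Inspect the result table stored in the 'result' entry
res$result

# Display results with three decimal places and the p-value with four
na.test(airquality, digits = 3, p.digits = 4)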
}
\references{
Enders, C. K. (2010). \emph{Applied missing data analysis}. Guilford Press.

Little, R. J. A. (1988). A test of Missing Completely at Random for multivariate
data with missing values. \emph{Journal of the American Statistical Association, 83},
1198-1202. https://doi.org/10.2307/2290157

Thoemmes, F., & Enders, C. K. (2007, April). \emph{A structural equation model for
testing whether data are missing completely at random}. Paper presented at the
annual meeting of the American Educational Research Association, Chicago, IL.
}
\seealso{
\code{\link{as.na}}, \code{\link{na.as}}, \code{\link{na.auxiliary}},
\code{\link{na.coverage}}, \code{\link{na.descript}}, \code{\link{na.indicator}},
\code{\link{na.pattern}}, \code{\link{na.prop}}.
}
\author{
Takuya Yanagida \email{takuya.yanagida@univie.ac.at}
}
