% Generated by roxygen2 (4.1.1): do not edit by hand
% Please edit documentation in R/EarlyConservationTest.R
\name{EarlyConservationTest}
\alias{EarlyConservationTest}
\title{Perform the Reductive Early Conservation Test}
\usage{
EarlyConservationTest(ExpressionSet, modules = NULL, permutations = 1000,
  lillie.test = FALSE, plotHistogram = FALSE, runs = 10,
  parallel = FALSE, gof.warning = FALSE, custom.perm.matrix = NULL)
}
\arguments{
\item{ExpressionSet}{a standard PhyloExpressionSet or DivergenceExpressionSet object.}

\item{modules}{a list storing three elements: early, mid, and late. Each element expects a numeric
vector specifying the developmental stages or experiments that correspond to each module.
For example, \code{modules = list(early = 1:2, mid = 3:5, late = 6:7)} divides a dataset
storing seven developmental stages into 3 modules.}

\item{permutations}{a numeric value specifying the number of permutations to be performed for the \code{EarlyConservationTest}.}

\item{lillie.test}{a boolean value specifying whether the Lilliefors Kolmogorov-Smirnov Test shall be performed to quantify the goodness of fit.}

\item{plotHistogram}{a boolean value specifying whether a \emph{Lilliefors Kolmogorov-Smirnov test}
shall be performed to test the goodness of fit of the approximated distribution, and whether additional plots quantifying the significance
of the observed phylotranscriptomic pattern shall be drawn.}

\item{runs}{a numeric value specifying the number of runs to be performed for goodness of fit computations, in case \code{plotHistogram = TRUE}.
In most cases \code{runs} = 100 is a reasonable choice. Default is \code{runs} = 10 (because it takes less computation time for demonstration purposes).}

\item{parallel}{a boolean value specifying whether the \code{runs} shall be performed in parallel (uses all cores of a multicore machine).}

\item{gof.warning}{a logical value indicating whether non-significant goodness of fit results should be reported as a warning. Default is \code{gof.warning = FALSE}.}

\item{custom.perm.matrix}{a custom \code{\link{bootMatrix}} (permutation matrix) on which the underlying test statistic is computed. Default is \code{custom.perm.matrix = NULL}.}
}
\value{
a list object containing the list elements:

\code{p.value} : the p-value quantifying the statistical significance (low-high-high pattern) of the given phylotranscriptomics pattern.

\code{std.dev} : the standard deviation of the N sampled phylotranscriptomics patterns for each developmental stage S.

\code{lillie.test} : a boolean value specifying whether the \emph{Lilliefors KS-Test} returned a p-value > 0.05,
which indicates that fitting the permuted scores with a normal distribution seems plausible.
}
\description{
The \emph{Reductive Early Conservation Test} aims to statistically evaluate the
existence of a monotonically increasing phylotranscriptomic pattern based on \code{\link{TAI}} or \code{\link{TDI}} computations.
The corresponding p-value quantifies the probability that a given TAI or TDI pattern (or any phylotranscriptomics pattern)
does not follow an early-conservation-like pattern. A p-value < 0.05 indicates that the corresponding phylotranscriptomics pattern does
indeed follow an early conservation (low-high-high) shape.
}
\details{
The \emph{reductive early conservation test} is a permutation test based on the following test statistic.

(1) A set of developmental stages is partitioned into three modules - early, mid, and late - based on prior biological knowledge.

(2) The mean \code{\link{TAI}} or \code{\link{TDI}} values of the three modules, T_early, T_mid, and T_late, are computed.

(3) The two differences D1 = T_mid - T_early and D2 = T_late - T_early are calculated.

(4) The minimum D_min of D1 and D2 is computed as the final test statistic of the reductive early conservation test.


In order to determine the statistical significance of an observed minimum difference D_min
the following permutation test is performed. Based on the \code{\link{bootMatrix}}, D_min
is calculated from each of the permuted \code{\link{TAI}} or \code{\link{TDI}} profiles,
approximated by a Gaussian distribution with method of moments estimated parameters returned by \code{\link[fitdistrplus]{fitdist}},
and the corresponding p-value is computed by \code{\link{pnorm}} given the estimated parameters of the Gaussian distribution.
The \emph{goodness of fit} for the random vector \emph{D_min} is statistically quantified by a Lilliefors (Kolmogorov-Smirnov) test
for normality.


In case the parameter \emph{plotHistogram = TRUE}, a multi-plot is generated showing:

(1) A Cullen and Frey skewness-kurtosis plot generated by \code{\link[fitdistrplus]{descdist}}.
This plot illustrates which distributions seem plausible to fit the resulting permutation vector D_min.
In the case of the \emph{reductive early conservation test} a normal distribution seemed plausible.

(2) A histogram of D_min combined with the density plot is plotted. D_min is then fitted by a normal distribution.
The corresponding parameters are estimated by \emph{moment matching estimation} using the \code{\link[fitdistrplus]{fitdist}} function.

(3) A plot showing the p-values for N independent runs to check whether a specific p-value is biased by a specific permutation order.

(4) A barplot showing the number of cases in which the underlying goodness of fit (returned by the Lilliefors (Kolmogorov-Smirnov) test
for normality) was significant (\code{TRUE}) or not significant (\code{FALSE}).
This allows quantifying the permutation bias and its implications for the goodness of fit.
}
\examples{
data(PhyloExpressionSetExample)

# perform the early conservation test for a PhyloExpressionSet
# here the prior biological knowledge is that stages 1-2 correspond to module 1 = early,
# stages 3-5 to module 2 = mid (phylotypic module), and stages 6-7 correspond to
# module 3 = late
EarlyConservationTest(PhyloExpressionSetExample,
                       modules = list(early = 1:2, mid = 3:5, late = 6:7),
                       permutations = 1000)


# use your own permutation matrix based on which p-values (EarlyConservationTest)
# shall be computed
custom_perm_matrix <- bootMatrix(PhyloExpressionSetExample, 100)

EarlyConservationTest(PhyloExpressionSetExample,
                       modules = list(early = 1:2, mid = 3:5, late = 6:7),
                       custom.perm.matrix = custom_perm_matrix)
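

# A minimal sketch of the test statistic described in the Details section,
# computed by hand in base R. It is only an illustration, not the package's
# internal implementation.
# Assumption: `tai_profile` is a made-up vector of seven stage-wise TAI values
# used purely for illustration; the names `tai_profile`, `module_means`,
# `perm_D_min`, and `p_value` are hypothetical helpers.
tai_profile <- c(3.2, 3.3, 3.6, 3.7, 3.6, 3.8, 3.9)
modules <- list(early = 1:2, mid = 3:5, late = 6:7)

# (2) mean value of each module: T_early, T_mid, T_late
module_means <- sapply(modules, function(idx) mean(tai_profile[idx]))

# (3) differences of the mid and late modules to the early module
D1 <- module_means[["mid"]] - module_means[["early"]]
D2 <- module_means[["late"]] - module_means[["early"]]

# (4) the test statistic is the smaller of the two differences
D_min <- min(D1, D2)
D_min

# Assumption: `perm_D_min` stands in for the D_min values of permuted profiles.
# Here it is simulated with rnorm() only to show the Gaussian approximation;
# using the upper tail of pnorm() is likewise an assumption of this sketch.
set.seed(123)
perm_D_min <- rnorm(1000, mean = 0, sd = 0.1)
p_value <- pnorm(D_min, mean = mean(perm_D_min), sd = sd(perm_D_min),
                 lower.tail = FALSE)
p_value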
}
\author{
Hajk-Georg Drost
}
\references{
Drost HG et al. (2015) \emph{Evidence for Active Maintenance of Phylotranscriptomic Hourglass Patterns in Animal and Plant Embryogenesis}. Mol Biol Evol. 32(5): 1221-1231. doi:10.1093/molbev/msv012.

Quint M et al. (2012) \emph{A transcriptomic hourglass in plant embryogenesis}. Nature 490: 98-101.

Piasecka B, Lichocki P, Moretti S, et al. (2013) \emph{The hourglass and the early conservation models co-existing
patterns of developmental constraints in vertebrates}. PLoS Genet. 9(4): e1003476.
}
\seealso{
\code{\link{ecScore}}, \code{\link{bootMatrix}}, \code{\link{FlatLineTest}}, \code{\link{ReductiveHourglassTest}}, \code{\link{PlotPattern}}
}

