% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/initial_estimators.R
\name{saturated_init}
\alias{saturated_init}
\title{Saturated 2SLS (split-sample initial estimator)}
\usage{
saturated_init(data, formula, cutoff, shuffle, shuffle_seed, split = 0.5)
}
\arguments{
\item{data}{A data frame.}

\item{formula}{A formula in the format \code{y ~ x1 + x2 | x1 + z2} where
\code{y} is the dependent variable, \code{x1} are the exogenous regressors,
\code{x2} the endogenous regressors, and \code{z2} the outside instruments.}

\item{cutoff}{A numeric cutoff value used to judge whether an observation
is an outlier. If the absolute value of its standardised residual is larger
than the cutoff value, the observation is classified as an outlier.}

\item{shuffle}{A logical value (\code{TRUE} or \code{FALSE}) indicating
whether the sample should be split into sub-samples randomly. If
\code{FALSE}, the sample is simply cut into two parts using the original
order of the supplied data set.}

\item{shuffle_seed}{A numeric value that sets the seed for shuffling the
data set before splitting it. Only used if \code{shuffle == TRUE}.}

\item{split}{A numeric value strictly between 0 and 1 that determines the
proportions in which the sample is split. The default of 0.5 gives two
sub-samples of (approximately) equal size.}
}
\value{
\code{saturated_init} returns a list with five elements. The first
four are vectors whose length equals the number of observations in the data
set. Unlike the residuals stored in a model object (usually accessible via
\code{model$residuals}), these vectors do not drop observations where any of
y, x or z are missing. Instead, such observations are kept and marked as
missing (\code{NA}, or -1 in the fourth vector).

The first element is a double vector containing the residuals for each
observation based on the model estimates. The second element contains the
standardised residuals. The third element is a logical vector that is
\code{TRUE} if the observation is judged as not outlying, \code{FALSE} if it
is an outlier, and \code{NA} if any of y, x, or z are missing. The fourth
element is an integer vector with three values: 0 if the observation is
judged to be an outlier, 1 if not, and -1 if missing. The fifth and last
element is a list with the two initial \code{\link[ivreg]{ivreg}} model
objects based on the two different sub-samples.
}
\description{
\code{saturated_init} splits the sample into two sub-samples. The 2SLS model
is estimated on each sub-sample, and the estimates from one sub-sample are
used to calculate the residuals, and hence to classify outliers, in the other
sub-sample.
}
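\details{
The split-sample idea from the description can be sketched in base R. This is
an illustrative sketch only, not the package's implementation. The simulated
data, the helper names \code{tsls_coef} and \code{tsls_resid}, the allocation
of the first \code{floor(n * split)} rows to the first sub-sample, and the
standardisation by a degrees-of-freedom-corrected residual standard deviation
from the estimating sub-sample are all assumptions made for illustration.

\preformatted{
# Simulated data (hypothetical): x1 exogenous, x2 endogenous, z2 instrument
set.seed(1)
n  <- 200
z2 <- rnorm(n)
x1 <- rnorm(n)
v  <- rnorm(n)
u  <- 0.5 * v + rnorm(n)   # structural error, correlated with x2 through v
x2 <- z2 + x1 + v
y  <- 1 + x1 + x2 + u
dat <- data.frame(y, x1, x2, z2)

# 2SLS coefficients obtained from two explicit least-squares stages
tsls_coef <- function(d) {
  d$x2_hat <- fitted(lm(x2 ~ x1 + z2, data = d))
  coef(lm(y ~ x1 + x2_hat, data = d))
}

# Structural residuals use the observed x2, not the first-stage fit
tsls_resid <- function(d, b) {
  d$y - (b[[1]] + b[[2]] * d$x1 + b[[3]] * d$x2)
}

split  <- 0.5
cutoff <- 1.96
n1   <- floor(n * split)           # assumed allocation rule
idx1 <- seq_len(n1)
idx2 <- setdiff(seq_len(n), idx1)

# Estimate the model separately on each sub-sample
b1 <- tsls_coef(dat[idx1, ])
b2 <- tsls_coef(dat[idx2, ])

# Residual scale of each fit on its own sub-sample (assumed standardisation)
s1 <- sqrt(sum(tsls_resid(dat[idx1, ], b1)^2) / (length(idx1) - 3))
s2 <- sqrt(sum(tsls_resid(dat[idx2, ], b2)^2) / (length(idx2) - 3))

# Cross-application: estimates from one sub-sample judge the other one
res <- numeric(n)
std <- numeric(n)
res[idx1] <- tsls_resid(dat[idx1, ], b2)
std[idx1] <- res[idx1] / s2
res[idx2] <- tsls_resid(dat[idx2, ], b1)
std[idx2] <- res[idx2] / s1

sel  <- abs(std) <= cutoff         # TRUE if judged not outlying
type <- ifelse(sel, 1L, 0L)        # 1 = not outlying, 0 = outlier
}

The vectors \code{res}, \code{std}, \code{sel} and \code{type} mirror the
first four elements described under Value, except that the simulated data
contain no missing values, so the codes \code{NA} and -1 never occur here.
}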
\section{Warning}{

The estimator may have poor properties if \code{split} is far from 0.5 and
the sample size is small.
}
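\examples{
# A hedged usage sketch. It assumes that saturated_init() is available in the
# session (for instance after attaching the package that provides it). The
# choice of mtcars, of the variables in the formula, and of cutoff = 1.96 is
# purely illustrative and is not taken from the package.
out <- saturated_init(
  data = mtcars,
  formula = mpg ~ cyl + disp | cyl + wt,  # y ~ x1 + x2 | x1 + z2
  cutoff = 1.96,
  shuffle = TRUE,
  shuffle_seed = 42,
  split = 0.5
)

# The elements are accessed by position, following the order given under Value
length(out)      # number of list elements
head(out[[1]])   # residuals
table(out[[4]])  # counts of the codes 0 (outlier), 1 (not outlying), -1 (missing)
}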

