% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/agnostic_quantile.R
\name{drQuantile}
\alias{drQuantile}
\title{Sample Quantiles for 'ddf' Objects}
\usage{
drQuantile(x, var, by = NULL, probs = seq(0, 1, 0.005), preTransFn = NULL,
  varTransFn = identity, varRange = NULL, nBins = 10000, tails = 100,
  params = NULL, packages = NULL, control = NULL, ...)
}
\arguments{
\item{x}{a 'ddf' object}

\item{var}{the name of the variable to compute quantiles for}

\item{by}{an optional variable name or vector of variable names by which to group quantile computations}

\item{probs}{numeric vector of probabilities with values in [0, 1]}

\item{preTransFn}{an optional transformation function to be applied to each subset prior to computing quantiles (it may be useful for adding a "by" variable that is not present). Note that this transformation should not modify \code{var} (use \code{varTransFn} for that). This argument is deprecated: use \code{\link{addTransform}} prior to calling divide instead.}

\item{varTransFn}{a transformation function to apply to \code{var} prior to computing quantiles}

\item{varRange}{range of the variable \code{var} (can be left as \code{NULL} if summaries have been computed)}

\item{nBins}{how many bins should the range of the variable be split into?}

\item{tails}{how many exact values at each tail should be retained?}

\item{params}{a named list of objects external to the input data that are needed in the distributed computing (most should be taken care of automatically such that this is rarely necessary to specify)}

\item{packages}{a vector of R package names that contain functions used in \code{preTransFn} or \code{varTransFn} (most should be taken care of automatically such that this is rarely necessary to specify)}

\item{control}{parameters specifying how the backend should handle things (most likely parameters to \code{rhwatch} in RHIPE) - see \code{\link{rhipeControl}} and \code{\link{localDiskControl}}}

\item{\ldots}{additional arguments}
}
\value{
A data frame of quantiles \code{q} and their associated f-values \code{fval}.  If \code{by} is specified, the data frame also contains a variable \code{group}.
}
\description{
Compute sample quantiles for 'ddf' objects
}
\details{
This division-agnostic quantile calculation algorithm takes the range of the variable of interest, splits it into \code{nBins} bins, tabulates counts for those bins, and reconstructs a quantile approximation from the counts.  Larger values of \code{nBins} give a more accurate approximation, but \code{nBins} should not be made too large.  If \code{tails} is positive, the first and last \code{tails} ordered values are attached to the quantile estimate. This is useful for long-tailed distributions or distributions with outliers for which you would like more detail in the tails.
}
\examples{
# break the iris data into k/v pairs
irisSplit <- list(
  list("1", iris[1:10,]), list("2", iris[11:110,]), list("3", iris[111:150,])
)
# represent it as ddf
irisSplit <- ddf(irisSplit, update = TRUE)

# approximate quantiles over the divided data set
probs <- seq(0, 1, 0.005)
iq <- drQuantile(irisSplit, var = "Sepal.Length", tails = 0, probs = probs)
plot(iq$fval, iq$q)

# compare to the all-data quantile "type 1" result
plot(probs, quantile(iris$Sepal.Length, probs = probs, type = 1))
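
# The binning idea described in the Details section can be sketched in base R.
# This is only a rough illustration under stated assumptions, not the
# implementation used by drQuantile: approxQuantile is a hypothetical helper,
# it works on a plain in-memory vector, ignores the tails handling, and
# reports bin midpoints as the quantile values.
approxQuantile <- function(x, probs, nBins = 10000) {
  rng <- range(x)
  # nBins equal-width bins spanning the range of the variable
  breaks <- seq(rng[1], rng[2], length.out = nBins + 1)
  # tabulate counts per bin (all.inside keeps the maximum in the last bin)
  counts <- tabulate(findInterval(x, breaks, all.inside = TRUE), nbins = nBins)
  # cumulative fraction of observations up to and including each bin
  cumFrac <- cumsum(counts) / length(x)
  mids <- (breaks[-1] + breaks[-(nBins + 1)]) / 2
  # for each probability, take the first bin whose cumulative fraction reaches it
  idx <- vapply(probs, function(p) min(which(cumFrac >= p), nBins), numeric(1))
  mids[idx]
}
sketchProbs <- seq(0, 1, 0.05)
aq <- approxQuantile(iris$Sepal.Length, sketchProbs, nBins = 1000)
# largest deviation of the sketch from the exact "type 1" quantiles
max(abs(aq - quantile(iris$Sepal.Length, probs = sketchProbs, type = 1)))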

}
\author{
Ryan Hafen
}
\seealso{
\code{\link{updateAttributes}}
}

