% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/diff_groups.R
\name{diff_groups}
\alias{diff_groups}
\title{Difference and Sum Groups}
\usage{
diff_groups(
  x,
  ...,
  hiddenNA = TRUE,
  sep_common = "_=_",
  sep_diff = "_-_",
  sep_sum = c("_=_", "_+_"),
  outputNA = "NA",
  diff_extra = FALSE
)
}
\arguments{
\item{x}{A data frame with exactly two columns.}

\item{...}{Additional arguments passed to \code{RowGroups()}.}

\item{hiddenNA}{Logical. When \code{TRUE} (default), missing codes (\code{NA}) are treated as
hidden categories: they are not available for computing difference and sum groups.
See \emph{Note} for details on how this differs from the \code{NAomit} parameter in \code{RowGroups()}.}

\item{sep_common}{A character string used in the \code{common} column to separate codes
that are identical across the two input columns.}

\item{sep_diff}{A character string used in the \code{diff_1_2} and \code{diff_2_1} columns to
indicate difference groups. Each constructed string starts with the parent code, followed by
one or more child codes from the other column that are subtracted.}

\item{sep_sum}{A character vector of one or two elements used in the \code{sum_1_2} and
\code{sum_2_1} columns to describe relationships where a code in one column represents the
sum of several codes in the other. The first element (\code{sep_sum[1]}) acts as an equality
sign, and the second element (\code{sep_sum[2]}) acts as a plus sign. If \code{sep_sum} has
length 1, the same value is used for both positions.}

\item{outputNA}{Character string used to represent \code{NA} values within the newly
constructed text strings in the additional output columns.
Only relevant when \code{hiddenNA = FALSE}.}

\item{diff_extra}{Logical. When \code{TRUE}, additional difference-group variables are included
in the output if any are found. Defaults to \code{FALSE}.}
}
\value{
A list (as returned by \code{RowGroups()}), where the \code{groups} data frame is
extended with additional descriptive columns indicating common, difference, and sum
relationships between the two code columns.
}
\description{
This function is a wrapper around \code{\link[=RowGroups]{RowGroups()}} for the specific case where the input
contains two columns. It calls \code{RowGroups()} with \code{returnGroups = TRUE}, and extends
the resulting data frame of unique code combinations with additional information
about common groups, difference groups, and sum groups.
}
\details{
The returned list contains the same elements as the list returned by \code{\link[=RowGroups]{RowGroups()}},
but with an extended \code{groups} data frame. The added columns describe relationships between
the two input columns as follows:
\itemize{
\item \strong{is_common}: \code{TRUE} when the two codes on the row are identical.
\item \strong{is_child_1}, \strong{is_child_2}: \code{TRUE} when the code in that column is a subset or
subgroup of a code in the other column.
\item \strong{common}: identical code pairs, formatted using \code{sep_common}.
\item \strong{diff_1_2}, \strong{diff_2_1}: difference groups. The first element is the parent
from the source column, followed by one or more child codes from the opposite
column, joined using \code{sep_diff}.
\item \strong{sum_1_2}, \strong{sum_2_1}: sum groups, where a parent code in one column equals the
sum of several codes in the other, formatted using \code{sep_sum}.
}
}
\note{
The parameter \code{NAomit} from \code{RowGroups()} can still be set via \code{...}, but using it
will remove rows containing \code{NA} before processing. The relationships found will then
reflect the reduced data, which is usually not the intended behaviour when identifying
relationships between code sets.
}
\examples{

df <- SSBtoolsData("code_pairs")

df

diff_groups(df)
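
# Sketch: request the extra difference-group variables described under
# 'diff_extra'. Extra columns appear only if such groups are found in the
# data, so the output may match the default call above.
diff_groups(df, diff_extra = TRUE)$groups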


d2 <- SSBtoolsData("d2")
diff_groups(d2[1:2])$groups
diff_groups(d2[2:3])$groups
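
# Sketch: custom separators change only the text in the constructed
# common, diff and sum columns. The separator strings here are
# arbitrary illustrations, not recommended values.
diff_groups(d2[1:2],
            sep_common = " = ",
            sep_diff = " - ",
            sep_sum = c(" = ", " + "))$groups

# Sketch: treat NA as an ordinary code instead of a hidden category.
# 'outputNA' controls how NA is written inside the constructed strings;
# this only has a visible effect if the input actually contains NA codes.
diff_groups(df, hiddenNA = FALSE, outputNA = "missing")$groups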
                           

}
\seealso{
\code{\link[=data_diff_groups]{data_diff_groups()}} for adding the results back as new columns in the data frame.
}
