% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/MRFcov.R
\name{MRFcov}
\alias{MRFcov}
\title{Markov Random Fields with covariates}
\usage{
MRFcov(data, symmetrise, prep_covariates, n_nodes, n_cores, n_covariates,
  family, bootstrap = FALSE)
}
\arguments{
\item{data}{A \code{data.frame}. The input data, where the \code{n_nodes}
left-most columns are the variables to be represented by nodes in the graph}

\item{symmetrise}{The method to use for symmetrising corresponding parameter estimates
(which are taken from separate regressions). Options are \code{min} (take the coefficient with the
smallest absolute value), \code{max} (take the coefficient with the largest absolute value)
or \code{mean} (take the mean of the two coefficients). Default is \code{mean}}

\item{prep_covariates}{Logical. If \code{TRUE}, covariate columns will be cross-multiplied
with nodes to prep the dataset for MRF models. Note this is only useful when additional
covariates are provided. Therefore, if \code{n_nodes < ncol(data)},
default is \code{TRUE}. Otherwise, default is \code{FALSE}. See
\code{\link{prep_MRF_covariates}} for more information}

\item{n_nodes}{Positive integer. The index of the last column in \code{data}
which is represented by a node in the final graph. Columns with index
greater than \code{n_nodes} are taken as covariates. Default is the number of
columns in \code{data}, corresponding to no additional covariates}

\item{n_cores}{Positive integer. The number of cores to spread the job across using
\code{\link[parallel]{makePSOCKcluster}}. Default is 1 (no parallelisation)}

\item{n_covariates}{Positive integer. The number of covariates in \code{data}, before cross-multiplication.
Default is \code{ncol(data) - n_nodes}}

\item{family}{The response type. Responses can be quantitative continuous (\code{family = "gaussian"}),
non-negative counts (\code{family = "poisson"}) or binomial 1s and 0s (\code{family = "binomial"}).
If using \code{family = "binomial"}, please note that nodes occurring in fewer than 5 percent
of observations make it difficult to
estimate occurrence probabilities (in extreme cases, this can result in intercept-only
models being fitted for the nodes in question). The function will issue a warning in this case.
If nodes occur in more than 95 percent of observations, the function will return an error as the cross-validation
step will generally be unable to proceed.}

\item{bootstrap}{Logical. Used by \code{\link{bootstrap_MRF}} to reduce memory usage}
}
\value{
A \code{list} containing:
\itemize{
   \item \code{graph}: Estimated parameter matrix of interaction effects
   \item \code{intercepts}: Estimated parameter vector of node intercepts
   \item \code{indirect_coefs}: \code{list} containing matrices of indirect effects of
   each covariate on node interactions
   \item \code{direct_coefs}: \code{matrix} of direct covariate effects on
   node occurrence probabilities
   \item \code{param_names}: Character string of covariate parameter names
   \item \code{mod_type}: A character string stating the type of model that was fit
   (used in other functions)
   \item \code{mod_family}: A character string stating the family of model that was fit
   (used in other functions)
   \item \code{poiss_sc_factors}: A vector of the square-root mean scaling factors
   used to standardise \code{poisson} variables (only returned if \code{family = "poisson"})
   }
}
\description{
This function is the workhorse of the \code{MRFcov} package, running
separate penalized regressions for each node to estimate parameters of
Markov Random Field (MRF) graphs. Covariates can be included
(a class of models known as Conditional Random Fields; CRF), to estimate
how interactions between nodes vary across covariate magnitudes.
}
\details{
Separate penalized regressions are used to approximate
MRF parameters, where the regression for node \code{j} includes an
intercept and beta coefficients for the abundance (families \code{gaussian} or \code{poisson})
or presence-absence (family \code{binomial}) of all other
nodes (\code{/j}) in \code{data}. If covariates are included, beta coefficients
are also estimated for the effect of the covariate on \code{j} and the
effects of the covariate on interactions between \code{j} and all other species
(\code{/j}). Note that coefficients must be estimated on the same scale in order
for the resulting models to be unified into a Markov Random Field. Counts for \code{poisson}
variables will therefore be standardised using the square root mean transformation
\code{x = x / sqrt(mean(x ^ 2))} so that they cover similar ranges. These transformed counts
will then be used in a \code{family = "gaussian"} model and their respective scaling factors
will be returned so that coefficients can be unscaled before interpretation (this unscaling is
performed automatically by other functions including \code{\link{predict_MRF}}
and \code{\link{cv_MRF_diag}}). Gaussian variables are not automatically transformed, so
if they cover quite different ranges and scales, it is recommended to scale them prior to fitting
models.
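\cr
\cr
As a minimal sketch of the square root mean transformation described above, in base R
(the \code{counts} vector and the \code{sc_factor} name are hypothetical and used for
illustration only; this is not necessarily how the package implements the step internally):
\preformatted{
# Hypothetical counts for a single node (illustration only)
counts <- c(0, 3, 1, 7, 2, 0, 5)

# Scaling factor: square root of the mean of the squared counts
sc_factor <- sqrt(mean(counts ^ 2))

# Square root mean transformation, as in x / sqrt(mean(x ^ 2))
scaled <- counts / sc_factor

# The scaled variable has a root mean square of 1 by construction
sqrt(mean(scaled ^ 2))

# Multiplying by the stored factor recovers the original counts
all.equal(scaled * sc_factor, counts)
}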
\cr
\cr
Note that since the number of parameters quickly increases with increasing
numbers of species and covariates, LASSO penalization is used to regularize
regressions based on values of the regularization parameter \code{lambda1}.
This can be done either by minimising the cross-validated
mean error for each node separately (using \code{\link[glmnet]{cv.glmnet}}) or by
running all regressions at a single \code{lambda1} value. The latter approach may be
useful for optimising all nodes as part of a joint graphical model, while the former
is likely to be more appropriate for maximising the log-likelihood of each node
separately before unifying the nodes into a graph. See \code{\link[penalized]{penalized}}
and \code{\link[glmnet]{cv.glmnet}} for further details.
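\cr
\cr
The node-wise approach can be sketched for a single node as follows. This assumes
the \code{glmnet} package is installed and uses simulated binary data (the
\code{nodes} matrix is hypothetical); it illustrates the general technique rather
than the exact code run by \code{MRFcov}:
\preformatted{
library(glmnet)
set.seed(1)

# Simulated presence-absence data: 200 observations of 4 nodes (hypothetical)
nodes <- matrix(rbinom(200 * 4, size = 1, prob = 0.4), ncol = 4)

# Regress node 1 on all other nodes, choosing the LASSO penalty
# by cross-validation (alpha = 1 is the LASSO)
cvfit <- cv.glmnet(x = nodes[, -1], y = nodes[, 1],
                   family = "binomial", alpha = 1)

# Coefficients (intercept plus one per remaining node) at the
# penalty that minimises the cross-validated error
coef(cvfit, s = "lambda.min")
}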
}
\examples{
data("Bird.parasites")
CRFmod <- MRFcov(data = Bird.parasites, n_nodes = 4, family = 'binomial')
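
# A minimal base R sketch of what the 'symmetrise' options mean, using a
# hypothetical matrix of node-wise coefficients (not output from MRFcov).
# Entry [i, j] is the coefficient of node j in the regression for node i
coefs <- matrix(c( 0.0, 0.8, -0.2,
                   0.4, 0.0,  0.6,
                  -0.5, 0.1,  0.0), nrow = 3, byrow = TRUE)

# 'mean': average the two estimates for each pair of nodes
sym_mean <- (coefs + t(coefs)) / 2

# 'min': keep the estimate with the smaller absolute value
sym_min <- ifelse(abs(coefs) <= abs(t(coefs)), coefs, t(coefs))

# 'max': keep the estimate with the larger absolute value
sym_max <- ifelse(abs(coefs) >= abs(t(coefs)), coefs, t(coefs))

# Each result is a symmetric matrix
isSymmetric(sym_mean)
isSymmetric(sym_min)
isSymmetric(sym_max)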

}
\references{
Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus.
Zeitschrift für Physik A Hadrons and Nuclei, 31, 253-258.\cr\cr
Cheng, J., Levina, E., Wang, P. & Zhu, J. (2014).
A sparse Ising model with covariates. Biometrics, 70, 943-953.\cr\cr
Clark, N.J., Wells, K. & Lindberg, O. (2018).
Unravelling changing interspecific interactions across environmental gradients
using Markov random fields. Ecology. doi: 10.1002/ecy.2221
\href{http://nicholasjclark.weebly.com/uploads/4/4/9/4/44946407/clark_et_al-2018-ecology.pdf}{Full text here}.\cr\cr
Sutton, C. & McCallum, A. (2012). An introduction to conditional random fields.
Foundations and Trends in Machine Learning, 4, 267-373.
}
\seealso{
Cheng et al. (2014), Sutton & McCallum (2012) and Clark et al. (2018)
for overviews of Conditional Random Fields. See \code{\link[glmnet]{cv.glmnet}} for
details of cross-validated optimization using the LASSO penalty
}
