\name{LSMfit}
\alias{LSMfit}

\title{
  Fitting Latent Space Item Response Models using Joint Maximum Likelihood Estimation}
\description{
  This function fits a Latent Space Item Response Model (LSIRM) with an \code{R}-dimensional latent space to observed binary or ordinal item scores, using either penalized Joint Maximum Likelihood (pJML) or constrained Joint Maximum Likelihood (cJML). }

\usage{
LSMfit(X, ndim_z, penalty=NULL, C=NULL, starts=NULL,
       tol=.1e-2, silent=FALSE)

}
\arguments{

  \item{X}{A matrix of size \code{N} by \code{n} containing the binary or ordinal item scores, where \code{N} is
        the number of subjects and \code{n} is the number of items. The number of item score categories can differ across items as long as the lowest score is coded 0 for all items. Missing values (\code{NA}) are allowed.}

  \item{ndim_z}{Number of dimensions of the latent space, \code{R}.}

  \item{penalty}{The weight for the L2 penalty of pJML. If both \code{penalty} and \code{C} (see below) are
        \code{NULL} (the default), pJML is used with a weight of 1 (i.e., a standard normal prior on all parameters).}

  \item{C}{The maximum size of the norm of the item parameter vectors. The resulting maximum norm for the person parameter vectors is \code{1/2*C}.}

  \item{starts}{Either a list containing starting values for the model parameters (see \bold{details}) or a character string indicating the method used to calculate the starting values. The options are:
        \describe{
              \item{"wls"}{the starting values are determined by fitting an \code{R+1} dimensional item factor model to the polychoric correlation matrix of \code{X} using weighted least squares}
              \item{"ml"}{the starting values are determined by fitting an \code{R+1} dimensional linear factor model to the polychoric correlation matrix of \code{X} using normal theory maximum likelihood (default)}
              \item{"random"}{starting values are drawn from normal distributions.}
        }}

  \item{tol}{Convergence criterion: Iterations stop if the difference in loglikelihoods between two subsequent iterations is smaller than this number. Default is .001.}

  \item{silent}{Logical. If \code{FALSE}, iteration details are printed to the screen during estimation.}

  }

\details{
  \code{LSMfit} optimizes the joint likelihood function of the LSIRM described in Molenaar and Jeon (in press) using a variant of the alternating optimization algorithm by Chen et al. (2019), combined with either an L2 regularization penalty similar to Bergner et al. (2022) or a constraint on the norms of the parameter vectors similar to Chen et al. (2019).
  For binary \eqn{X_{pi}}, the LSIRM by Molenaar and Jeon is given by:
    \deqn{logit(E(X_{pi})) = \theta_p + b_i - (\Sigma_{r=1}^{R} (z_{pr}-w_{ir})^2)^{1/2}}

  where \eqn{\theta_p} is a person intercept, \eqn{b_i} is an item intercept, \eqn{R} denotes the dimension of the latent space, and \eqn{z_{pr}} and \eqn{w_{ir}} are respectively the person and item coordinates in the latent space. The matrix \eqn{\bold{W}} containing the \eqn{w_{ir}} parameters is constrained to an echelon structure (i.e., all elements from the upper triangle of submatrix \eqn{\bold{W}_{1:(R-1),1:(R-1)}} are fixed to 0). Next, cJML estimation involves constraining the Euclidean norm of the person parameter vector \eqn{\tau_{1p}=[\theta_p,z_{p1},z_{p2},...,z_{pR}]} to be equal to \code{C}, and the Euclidean norm of the item parameter vector \eqn{\tau_{2i}=[b_i,w_{i1},w_{i2},...,w_{iR}]} to be equal to \code{1/2*C}. In contrast, pJML estimation involves adding an L2 regularization penalty for all parameters to the joint likelihood function in such a way that the \code{penalty} parameter can be interpreted as the precision of a 0-centered normal prior on the parameters.
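
  As a sketch of what this prior interpretation implies (the symbols \eqn{\lambda} for the \code{penalty} weight and \eqn{\ell} for the joint loglikelihood are introduced here for illustration only, and the exact scaling used internally may differ): a 0-centered normal prior with precision \eqn{\lambda} contributes \eqn{-\lambda x^2/2} to the log-density of a parameter \eqn{x}, up to an additive constant, so the penalized objective would take the form
    \deqn{\ell(\theta, b, z, w) - (\lambda/2) (\Sigma_{p} \theta_p^2 + \Sigma_{i} b_i^2 + \Sigma_{p} \Sigma_{r} z_{pr}^2 + \Sigma_{i} \Sigma_{r} w_{ir}^2)}
  With \eqn{\lambda = 1} this corresponds to the standard normal prior that is used when both \code{penalty} and \code{C} are \code{NULL}.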

  Using the \code{starts} argument, starting values can be provided in a list containing entries:
  \describe{
  \item{\code{z0}}{an \code{N} by \code{R} matrix with starting values for \eqn{z_{pr}}}
  \item{\code{w0}}{an \code{n} by \code{R} matrix with starting values for \eqn{w_{ir}}}
  \item{\code{b0}}{an \code{n} by \code{1} matrix with starting values for \eqn{b_i}}
  \item{\code{theta0}}{an \code{N} by \code{1} matrix with starting values for \eqn{\theta_p}}
  }
  Alternatively, starting values can be determined automatically by \code{LSMfit}. To this end, the following \code{R+1} factor model will be fitted (omitting the item intercept):

  \deqn{g(E(X_{pi}))=\eta_{p0}+\Sigma_{r=1}^{R} \lambda_{ir} \eta_{pr}}

  where the \code{n} by \code{R} matrix of \eqn{\lambda_{ir}} parameters follows the echelon structure above, and \eqn{g(.)} is either the identity link or the probit link (see below). The model above is fit to the polychoric correlation matrix of \eqn{\bold{X}} using either weighted least squares (WLS) estimation with a probit link for \eqn{g(.)} or normal theory maximum likelihood (ML) estimation with an identity link for \eqn{g(.)} (i.e., the polychoric correlation matrix is treated as a Pearson product-moment correlation matrix). In both cases, the thresholds of the polychoric correlation matrix are taken as the basis for the starting values of \eqn{b_i}, the factor score estimates of \eqn{\eta_{p0}} are taken as starting values for \eqn{\theta_p}, the estimates of \eqn{\eta_{pr}} are taken as starting values for \eqn{z_{pr}}, and the estimates of \eqn{\lambda_{ir}} are taken as a basis for the starting values of \eqn{w_{ir}}. The WLS approach is statistically the most rigorous but can be time consuming, while the ML approach is ad hoc but fast and turns out to work well in practice. The ML approach is the default for obtaining starting values if \code{starts=NULL}. Fitting the factor model above may fail, especially for models with \code{R>2}. In that case, \code{LSMfit} automatically switches to random starts.

  Ordinal items are internally accommodated by dummy coding the items with more than 2 score levels into \code{C-1} binary variables, where \code{C} here denotes the number of score levels of the item (not the \code{C} argument), using a cumulative binary coding scheme (see Molenaar & Jeon, in press). Next, the dummy coded variables are submitted to the binary LSIRM above with the \eqn{w_{ir}} parameters equated for dummy coded variables that correspond to the same original item. In the resulting model, the estimates of \eqn{b_i} correspond to the category parameters of a sequential IRT model (Tutz, 1990), which are generally close to those of a graded response IRT model. The number of score levels can differ across items as long as the lowest score is coded 0 for all items.



}
\value{
  An object of class \code{LSMfit} with values
   \item{ theta}{ \eqn{\theta_p} estimates}
   \item{ b}{ \eqn{b_i} estimates}
   \item{ z}{ \eqn{z_{pr}} estimates}
   \item{ w}{ \eqn{w_{ir}} estimates}
   \item{ logL }{ value of the loglikelihood at convergence}
   \item{ starts}{ the starting values used}
   \item{ as_starts}{ a list containing the parameter estimates, suitable to be used as argument for \code{starts} in a new run }
   \item{ internal}{ various matrices used internally }
}
\references{

Bergner, Y., Halpin, P., & Vie, J. J. (2022). Multidimensional Item Response Theory in the Style of Collaborative Filtering. \emph{Psychometrika}, \bold{87(1)}, 266-288.
https://doi.org/10.1007/s11336-021-09788-9

Chen, Y., Li, X., & Zhang, S. (2019). Joint maximum likelihood estimation for high-dimensional
exploratory item factor analysis. \emph{Psychometrika}, \bold{84(1)}, 124-146.
https://doi.org/10.1007/s11336-018-9646-5

Molenaar, D., & Jeon, M.J. (in press). Joint maximum likelihood estimation of latent space item response models. \emph{Psychometrika}.

Tutz, G. (1990). Sequential item response models with an ordered response. \emph{British Journal of Mathematical and Statistical Psychology}, \bold{43(1)}, 39-55.
https://doi.org/10.1111/j.2044-8317.1990.tb00925.x

}
\author{ Dylan Molenaar \email{d.molenaar@uva.nl}}

\seealso{
\code{\link{LSMselect}} for selecting the number of latent space dimensions using cross-validation.
\code{\link{LSMsim}} for simulating data according to the LSIRM.
\code{\link{LSMrotate}} for rotating item and person coordinates.

}
\examples{
 #
 # only binary items
 #

 # data sim with 1000 subjects and 20 binary items
 # according to 2 dimensional latent space model (R=2)
 set.seed(1111)
 N=1000
 nit=20
 ndim_z=2
 dat_obj=LSMsim(N,nit,ndim_z)
 X=dat_obj$X
 zt=dat_obj$par$zt      # rotated true z, see ?LSMsim and ?LSMrotate
 wt=dat_obj$par$wt      # rotated true w

 #fit model
 results=LSMfit(X,2)

 #plot the parameter recovery results
 oldpar=par(mfrow=c(2,2))

 s_p=sign(cor(results$z,zt))          # to correct for sign switches in the plots
 s_i=sign(cor(results$w,wt))

 plot(s_p[1,1]*zt[,1],results$z[,1]); abline(0,1)
 plot(s_p[2,2]*zt[,2],results$z[,2]); abline(0,1)
 plot(s_i[1,1]*wt[,1],results$w[,1]); abline(0,1)
 plot(s_i[2,2]*wt[,2],results$w[,2]); abline(0,1)

 par(oldpar)
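
 #
 # model-implied probabilities for the binary model in Details
 #
 # Illustrative sketch only: the *_toy objects are made-up values
 # (not LSMfit output) and base R is used throughout. It evaluates
 # logit(E(X_pi)) = theta_p + b_i - Euclidean distance(z_p, w_i).
 theta_toy=c(-0.5,0.3,1.2)                # person intercepts (3 persons)
 b_toy=c(0.4,-0.2)                        # item intercepts (2 items)
 z_toy=matrix(c(0,1,-1,0.5,0,1),3,2)      # person coordinates (R=2)
 w_toy=matrix(c(0.5,-0.5,0,1),2,2)        # item coordinates (R=2)

 # 3 by 2 matrix of distances between every person and every item
 d_toy=sapply(1:nrow(w_toy),
              function(i) sqrt(rowSums(sweep(z_toy,2,w_toy[i,])^2)))

 # inverse logit of the linear predictor gives P(X_pi=1)
 p_toy=plogis(outer(theta_toy,b_toy,"+")-d_toy)
 round(p_toy,3)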

\donttest{
 #
 # mixed scale items
 #

 # data sim with 1000 subjects and 20 mixed scale items
 # according to 2 dimensional latent space model (R=2)
 set.seed(1111)
 N=1000
 nit=20
 ndim_z=2
 nc=rpois(nit,2)+2   # number of response categories
                     # (between 2 and 7 for this seed)
 dat_obj=LSMsim(N,nit,ndim_z,nc=nc)
 X=dat_obj$X
 zt=dat_obj$par$zt      # rotated true z, see ?LSMsim and ?LSMrotate
 wt=dat_obj$par$wt      # rotated true w

 #fit model
 results=LSMfit(X,2)

 #plot the parameter recovery results
 oldpar=par(mfrow=c(2,2))

 s_p=sign(cor(results$z,zt))          # to correct for sign switches in the plots
 s_i=sign(cor(results$w,wt))

 plot(s_p[1,1]*zt[,1],results$z[,1]); abline(0,1)
 plot(s_p[2,2]*zt[,2],results$z[,2]); abline(0,1)
 plot(s_i[1,1]*wt[,1],results$w[,1]); abline(0,1)
 plot(s_i[2,2]*wt[,2],results$w[,2]); abline(0,1)

 par(oldpar)
 }
}
\keyword{models}
\keyword{multivariate}
