
<!-- README.md is generated from README.Rmd. Please edit that file -->

# pedmut

<!-- badges: start -->

[![CRAN
status](https://www.r-pkg.org/badges/version/pedmut)](https://CRAN.R-project.org/package=pedmut)
[![](https://cranlogs.r-pkg.org/badges/grand-total/pedmut?color=yellow)](https://cran.r-project.org/package=pedmut)
[![](https://cranlogs.r-pkg.org/badges/last-month/pedmut?color=yellow)](https://cran.r-project.org/package=pedmut)
<!-- badges: end -->

## Introduction

The aim of **pedmut** to provide a framework for modelling mutations in
pedigree computations. It is a core member of the [**ped
suite**](https://magnusdv.github.io/pedsuite/) collection of packages
for pedigree analysis in R. Although **pedmut** is self-contained, its
main purpose is to be imported by other **ped suite** packages, in
particular [pedprobr](https://github.com/magnusdv/pedprobr) (marker
probabilities and pedigree likelihoods) and
[forrel](https://github.com/magnusdv/forrel) (forensic pedigree
analysis).

For the theoretical background of mutation models and their properties
(stationarity, reversibility, lumpability), I recommend Chapter 5 of
[Pedigree analysis in
R](https://www.elsevier.com/books/pedigree-analysis-in-r/vigeland/978-0-12-824430-2),
and the references therein.

## Installation

``` r
# The easiest way to get `pedmut` is to install the entire `ped suite`:
install.packages("pedsuite")

# Alternatively, you can install just `pedmut`:
install.packages("pedmut")

# If you need the latest development version, install it from GitHub:
# install.packages("devtools")
devtools::install_github("magnusdv/pedmut")
```

## A simple likelihood example

The examples to follow require the packages **pedmut**, **pedprobr** and
**pedtools**. Since all of these are core **ped suite** packages, they
are all loaded by:

``` r
library(pedsuite)
```

The figure below shows a father and son who are homozygous for different
alleles. We assume that the locus is an autosomal marker with 4 alleles,
labelled 1, 2, 3, and 4. The data clearly constitutes a *Mendelian
error*, and would give a likelihood of 0 unless mutations are modelled.

<img src="man/figures/README-ex1-ped-1.png" style="display: block; margin: auto;" />

The following code specifies a “proportional” mutation model (see below
for details) for this marker and compute the pedigree likelihood under
this model.

``` r
# Create pedigree
x = nuclearPed(father = "fa", mother = "mo", child = "boy")

# Create marker with mutation model
m = marker(x, fa = "1/1", boy = "2/2", alleles = 1:4, mutmod = "prop", rate = 0.1)

# Plot
plot(x, marker = m)

# Compute likelihood
likelihood(x, m)
#> [1] 0.0005208333
```

Although invisible to the end user, the **pedmut** package is involved
twice in the above code: first in `marker()`, translating the arguments
`mutmod = "prop"` and `rate = 0.1` into a complete mutation model. And
secondly inside `likelihood()`, by processing the mutation model in
setting up the likelihood calculation.

To see details about the mutation model attached to a marker, we can use
the `mutmod()` accessor:

``` r
mutmod(m)
#> Unisex mutation matrix:
#>            1          2          3          4
#> 1 0.90000000 0.03333333 0.03333333 0.03333333
#> 2 0.03333333 0.90000000 0.03333333 0.03333333
#> 3 0.03333333 0.03333333 0.90000000 0.03333333
#> 4 0.03333333 0.03333333 0.03333333 0.90000000
#> 
#> Model: proportional 
#> Rate: 0.1 
#> Frequencies: 0.25, 0.25, 0.25, 0.25 
#> 
#> Stationary: Yes 
#> Reversible: Yes 
#> Lumpable: Always
```

## Mutation models

A mutation matrix, in **pedmut**, is a stochastic matrix with each row
summing to 1, where the rows and columns are named with allele labels.

Two central functions of package are `mutationMatrix()` and
`mutationModel()`. The former of these constructs a single mutation
matrix according to various model specifications. The latter is a
shortcut for producing what is typically required in practical
applications, namely a list of *two* mutation matrices, named “male” and
“female”.

The mutations models currently implemented in **pedmut** are:

-   `equal`: All mutations equally likely; probability `1-rate` of no
    mutation. Parameters: `rate`.

-   `proportional`: Mutation probabilities are proportional to the
    target allele frequencies. Parameters: `rate`, `afreq`.

-   `random`: This produces a matrix of random numbers, each row
    normalised to have sum 1. Parameters: `seed`.

-   `custom`: Allows any valid mutation matrix to be provided by the
    user. Parameters: `matrix`.

-   `onestep`: Applicable if all alleles are integers. Mutations are
    allowed only to the nearest integer neighbour. Parameters: `rate`.

-   `stepwise`: For this model alleles must be integers or decimal
    numbers with a single decimal, such as ‘17.1’, indicating a
    microvariant. Mutation rates depend on whether transitions are
    within the same group or not, i.e., between integer alleles and
    microvariants in the latter case. Mutations also depend on the size
    of the mutation as modelled by the parameter `range`, the relative
    probability of mutating n+1 steps versus mutating n steps.
    Parameters: `rate`, `rate2`, `range`.

-   `trivial`: Diagonal mutation matrix with 1 on the diagonal.
    Parameters: None.

## Model properties

Certain properties of mutation models are of particular interest - both
theoretical and practical - for likelihood computations. The **pedmut**
package provides utility functions for quickly checking whether a given
model these properties:

-   `isStationary(M, afreq)`: Checks if `afreq` is a right eigenvector
    of the mutation matrix `M`

-   `isReversible(M, afreq)`: Checks if `M` together with `afreq` form a
    *reversible* Markov chain, i.e., that they satisfy the [detailed
    balance](https://en.wikipedia.org/wiki/Detailed_balance) criterion

-   `isLumpable(M, lump)`: Checks if `M` allows clustering (“lumping”)
    of a given subset of alleles. This implements the necessary and
    sufficient condition of *strong lumpability* of Kemeny and Snell
    (*Finite Markov Chains*, 1976)

-   `alwaysLumpable(M)`: Checks if `M` allows lumping of any allele
    subset

## Further examples

An `equal` model with rate 0.1:

``` r
mutationMatrix("equal", rate = 0.1, alleles = 1:3)
#>      1    2    3
#> 1 0.90 0.05 0.05
#> 2 0.05 0.90 0.05
#> 3 0.05 0.05 0.90
#> 
#> Model: equal 
#> Rate: 0.1 
#> 
#> Lumpable: Always
```

To illustrate the `stepwise` model, we recreate the mutation matrix in
Section 2.1.3 of Simonsson and Mostad (FSI:Genetics, 2015). This is done
as follows:

``` r
mutationMatrix(model = "stepwise",
               alleles = c("16", "17", "18", "16.1", "17.1"),
               rate = 0.003, rate2 = 0.001, range = 0.5)
#>                16           17           18         16.1         17.1
#> 16   0.9960000000 0.0020000000 0.0010000000 0.0005000000 0.0005000000
#> 17   0.0015000000 0.9960000000 0.0015000000 0.0005000000 0.0005000000
#> 18   0.0010000000 0.0020000000 0.9960000000 0.0005000000 0.0005000000
#> 16.1 0.0003333333 0.0003333333 0.0003333333 0.9960000000 0.0030000000
#> 17.1 0.0003333333 0.0003333333 0.0003333333 0.0030000000 0.9960000000
#> 
#> Model: stepwise 
#> Rate: 0.003 
#> Rate2: 0.001 
#> Range: 0.5 
#> 
#> Lumpable: Not always
```

A simpler version of the `stepwise` model above, is the `onestep` model,
in which only the immediate neighbouring integers are reachable by
mutation. This model is only applicable when all alleles are integers.

``` r
mutationMatrix(model = "onestep",
               alleles = c("16", "17", "18"),
               rate = 0.04)
#>      16   17   18
#> 16 0.96 0.04 0.00
#> 17 0.02 0.96 0.02
#> 18 0.00 0.04 0.96
#> 
#> Model: onestep 
#> Rate: 0.04 
#> 
#> Lumpable: Not always
```
