Maintainer: | Laura Marshall <lhm@st-andrews.ac.uk> |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Title: | Density Surface Modelling of Distance Sampling Data |
LazyLoad: | yes |
Author: | David L. Miller, Eric Rexstad, Louise Burt, Mark V. Bravington, Sharon Hedley, Megan Ferguson, Natalie Kelly. |
Description: | Density surface modelling of line transect data. A Generalized Additive Model-based approach is used to calculate spatially-explicit estimates of animal abundance from distance sampling (also presence/absence and strip transect) data. Several utility functions are provided for model checking, plotting and variance estimation. |
Version: | 2.3.3 |
Language: | en-GB |
Encoding: | UTF-8 |
URL: | https://github.com/DistanceDevelopment/dsm |
BugReports: | https://github.com/DistanceDevelopment/dsm/issues |
Depends: | R (≥ 3.5.0), mgcv (≥ 1.8-23), mrds (≥ 2.1.16), numDeriv |
Imports: | nlme, ggplot2, plyr, statmod |
Suggests: | Distance, sp, tweedie, testthat |
RoxygenNote: | 7.2.1 |
NeedsCompilation: | no |
Packaged: | 2022-08-20 21:42:22 UTC; laura |
Repository: | CRAN |
Date/Publication: | 2022-08-20 22:10:13 UTC |
Density surface modelling
Description
dsm
implements spatial models for distance sampling data. Models for
detectability can be fitted using packages mrds
or Distance
. dsm
fits
generalized additive models to spatially-referenced data. See Miller et al
(2013) for an introduction.
Details
Further information on distance sampling methods and example code is available at http://distancesampling.org/R/.
For help with distance sampling and this package, there is a Google Group https://groups.google.com/forum/#!forum/distance-sampling.
Example analyses are available at http://examples.distancesampling.org.
References
Hedley, S. and S. T. Buckland. 2004. Spatial models for line transect sampling. JABES 9:181-199.
Miller, D. L., Burt, M. L., Rexstad, E. A., Thomas, L. (2013), Spatial models for distance sampling data: recent developments and future directions. Methods in Ecology and Evolution, 4: 1001-1010. doi: 10.1111/2041-210X.12105 (Open Access)
Wood, S.N. 2006. Generalized Additive Models: An Introduction with R. CRC/Chapman & Hall.
Find the block information
Description
Takes the transect data and works out how many blocks of a given size (in segment terms) fit into each.
Usage
block.info.per.su(block.size, data, name.su)
Arguments
block.size |
number of segments per block |
data |
data used to build the model |
name.su |
names of the sampling units (i.e., transects) |
Value
a data.frame with the following columns:
- name: the sample unit name (e.g. transect label)
- num.seg: number of segments in that transect
- num.block: number of blocks available
- start.block: block number of the first block
- end.block: block number of the last block
- num.req: number of blocks needed for the unit
Check column names exist
Description
Internal function to check that supplied data.frames
have the correct
columns and checks that sample labels are all unique.
Usage
check.cols(ddf.obj, segment.data, observation.data, segment.area)
Arguments
ddf.obj |
a |
segment.data |
segment data as defined in |
observation.data |
observation data as defined in |
segment.area |
area of segments |
Value
nothing, but throws an error if something went wrong
Author(s)
David Lawrence Miller
Fit a density surface model to segment-specific estimates of abundance or density.
Description
Fits a density surface model (DSM) to detection adjusted counts from a
spatially-referenced distance sampling analysis. dsm
takes observations of
animals, allocates them to segments of line (or strip transects) and
optionally adjusts the counts based on detectability using a supplied
detection function model. A generalized additive model, generalized additive mixed model or generalized linear model is then used to model these adjusted counts based on a formula involving environmental covariates.
Usage
dsm(
formula,
ddf.obj,
segment.data,
observation.data,
engine = "gam",
convert.units = 1,
family = quasipoisson(link = "log"),
group = FALSE,
control = list(keepData = TRUE),
availability = 1,
segment.area = NULL,
weights = NULL,
method = "REML",
...
)
Arguments
formula |
formula for the surface. This should be a valid formula. See "Details", below, for how to define the response. |
ddf.obj |
result from call to |
segment.data |
segment data, see |
observation.data |
observation data, see |
engine |
which fitting engine should be used for the DSM
( |
convert.units |
conversion factor to multiply the area of the segments by. See 'Units' below. |
family |
response distribution (popular choices include
|
group |
if |
control |
the usual |
availability |
an estimate of availability bias. For count models used
to multiply the effective strip width (must be a vector of length 1 or
length the number of rows in |
segment.area |
if |
weights |
weights for each observation used in model fitting. The
default, |
method |
The smoothing parameter estimation method. Default is
|
... |
anything else to be passed straight to |
Details
The response (LHS of formula) can be one of the following (with restrictions outlined below):
- count: count in each segment
- abundance.est: estimated abundance per segment, estimation is via a Horvitz-Thompson estimator
- density.est: density per segment
The offset used in the model depends on the response:
- count: area of the segment multiplied by the average probability of detection in the segment
- abundance.est: area of the segment
- density.est: zero
The count
response can only be used when detection function covariates
only vary between segments/points (not within). For example, weather
conditions (like visibility or sea state) or foliage cover are usually
acceptable as they do not change within the segment, but animal sex or
behaviour will not work. The abundance.est
response can be used with any
covariates in the detection function.
In the density case, observations can be weighted by segment areas via the weights= argument. By default (weights=NULL), when density is estimated the weights are set to the segment areas (using segment.area or calculated from the detection function object metadata and Effort data). Alternatively, weights=1 will set all weights to be equal. A third alternative is to pass in a vector of length equal to the number of segments, containing appropriate weights.
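As a rough sketch (assuming a fitted detection function hr.model and the mexdolphins segdata and obsdata objects used in the Examples below), the weighting options look like this:
# density response with the default area weights
mod_dens <- dsm(density.est ~ s(x, y), hr.model, segdata, obsdata,
                family = tw())
# equal weights for every segment instead
mod_dens_eq <- dsm(density.est ~ s(x, y), hr.model, segdata, obsdata,
                   family = tw(), weights = 1)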
Example analyses are available at http://examples.distancesampling.org.
Value
a glm
, gam
, gamm
or
bam
object, with an additional element, $ddf
which holds the
detection function object.
Units
It is often the case that distances are collected in metres and segment
lengths are recorded in kilometres. dsm
allows you to provide a conversion
factor (convert.units
) to multiply the areas by. For example: if distances
are in metres and segment lengths are in kilometres setting
convert.units=1000
will lead to the analysis being in metres. Setting
convert.units=1/1000
will lead to the analysis being in kilometres. The
conversion factor will be applied to segment.area
if that is specified.
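For instance, a sketch of the metres/kilometres case above (again assuming the objects from the Examples below):
# distances recorded in metres, Effort recorded in kilometres;
# multiply segment lengths by 1000 so everything is in metres
mod_m <- dsm(count ~ s(x, y), hr.model, segdata, obsdata,
             convert.units = 1000)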
Large models
For large models, engine="bam" with method="fREML" may be useful. Models for bam should be specified as they would be for gam. Read the documentation for bam before using this option; it is considered EXPERIMENTAL at the moment. In particular, note that the default basis choice (thin plate regression splines) will be slow and that, in general, fitting is less stable than when using gam. For a negative binomial response, theta must be specified when using bam.
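A minimal sketch of this option, assuming the objects from the Examples below; the tensor product basis and k values are illustrative choices only:
# large-model fitting via mgcv::bam with fast REML; a tensor product of
# cubic regression splines is used here rather than the slower thin plate default
mod_big <- dsm(count ~ te(x, y, bs = "cr", k = c(10, 10)),
               hr.model, segdata, obsdata,
               engine = "bam", method = "fREML")
# for a negative binomial response, theta must be fixed when using bam,
# e.g. family = negbin(theta = 2) (the value 2 is purely illustrative)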
Author(s)
David L. Miller
References
Hedley, S. and S. T. Buckland. 2004. Spatial models for line transect sampling. JABES 9:181-199.
Miller, D. L., Burt, M. L., Rexstad, E. A., Thomas, L. (2013), Spatial models for distance sampling data: recent developments and future directions. Methods in Ecology and Evolution, 4: 1001-1010. doi: 10.1111/2041-210X.12105 (Open Access)
Wood, S.N. 2006. Generalized Additive Models: An Introduction with R. CRC/Chapman & Hall.
Examples
## Not run:
library(Distance)
library(dsm)
# load the Gulf of Mexico dolphin data (see ?mexdolphins)
data(mexdolphins)
# fit a detection function and look at the summary
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
summary(hr.model)
# fit a simple smooth of x and y to counts
mod1 <- dsm(count~s(x,y), hr.model, segdata, obsdata)
summary(mod1)
# predict over a grid
mod1.pred <- predict(mod1, preddata, preddata$area)
# calculate the predicted abundance over the grid
sum(mod1.pred)
# plot the smooth
plot(mod1)
## End(Not run)
Data format for DSM
Description
Two data.frame
s must be provided to dsm
. They are referred to as
observation.data
and segment.data
.
Details
The segment.data
table has the sample identifiers which define the
segments, the corresponding effort (line length) expended and the
environmental covariates that will be used to model abundance/density.
observation.data
provides a link table between the observations used in
the detection function and the samples (segments), so that we can aggregate
the observations to the segments (i.e., observation.data
is a "look-up
table" between the observations and the segments).
observation.data
- the observation data.frame
must have (at least) the
following columns:
- object: unique object identifier
- Sample.Label: the identifier for the segment where the observation occurred
- size: the size of each observed group (e.g., 1 if all animals occurred individually)
- distance: distance to observation
One can often also use observation.data
to fit a detection function (so
additional columns for detection function covariates are allowed in this
table).
segment.data
: the segment data.frame
must have (at least) the following
columns:
- Effort: the effort (in terms of length of the segment)
- Sample.Label: identifier for the segment (unique!)
- environmental covariates, for example location (projected latitude and longitude), and other relevant covariates (sea surface temperature, foliage type, altitude, bathymetry etc.)
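A minimal sketch of the two tables, with the column names described above and values invented purely for illustration:
observation.data <- data.frame(object       = 1:3,
                               Sample.Label = c("T1-1", "T1-2", "T2-1"),
                               size         = c(1, 2, 1),
                               distance     = c(120, 340, 85))
segment.data <- data.frame(Sample.Label = c("T1-1", "T1-2", "T2-1"),
                           Effort       = c(10, 10, 12),
                           x            = c(447, 453, 462),
                           y            = c(3045, 3050, 3061),
                           depth        = c(120, 200, 350))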
Multiple detection functions
If multiple detection functions are to be used, then a column named ddfobj
must be included in observation.data
and segment.data
. This lets the
model know which detection function each observation is from. These are
numeric and ordered as the ddf.obj
argument to dsm
, e.g.,
ddf.obj=list(ship_ddf, aerial_ddf)
means ship detections have ddfobj=1
and aerial detections have ddfobj=2
in the observation data.
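A sketch of setting up the ddfobj column; ship_ddf and aerial_ddf stand in for two fitted detection functions and the platform column is hypothetical:
# tag each observation and segment with the detection function it belongs to
observation.data$ddfobj <- ifelse(observation.data$platform == "ship", 1, 2)
segment.data$ddfobj     <- ifelse(segment.data$platform == "ship", 1, 2)
mod <- dsm(count ~ s(x, y), ddf.obj = list(ship_ddf, aerial_ddf),
           segment.data = segment.data, observation.data = observation.data)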
Mark-recapture distance sampling models
When using mrds models that include mark-recapture components (currently independent observer and trial modes are supported), the format of the observation data needs to be checked to ensure that observations are not duplicated. The observer column is also required in observation.data.
- Independent observer mode: only unique observations (unique object IDs) are required.
- Trial mode: only observations made by observer 1 are required.
Check for autocorrelation in residuals
Description
This function is deprecated; use dsm_cor instead.
Usage
dsm.cor(
dsm.obj,
Transect.Label = "Transect.Label",
Segment.Label = "Segment.Label",
max.lag = 10,
resid.type = "scaled.pearson",
fun = cor,
ylim = c(0, 1),
subset = "all",
...
)
Arguments
dsm.obj |
a fitted dsm object. |
Transect.Label |
label for the transect (default: |
Segment.Label |
label for the segments (default: |
max.lag |
maximum lag to calculate at. |
resid.type |
the type of residuals used, see
|
fun |
the function to use, by default |
ylim |
user defined limits in y direction. |
subset |
which subset of the data should the correlation function be calculated on? |
... |
other options to pass to |
Prediction variance estimation assuming independence
Description
This function is deprecated; use dsm_var_gam instead.
Usage
dsm.var.gam(
dsm.obj,
pred.data,
off.set,
seglen.varname = "Effort",
type.pred = "response"
)
Arguments
dsm.obj |
a model object fitted by |
pred.data |
either: a single prediction grid or list of prediction
grids. Each grid should be a |
off.set |
a vector or list of vectors with as many elements as there
are in |
seglen.varname |
name for the column which holds the segment length
(default value |
type.pred |
should the predictions be on the "response" or "link"
scale? (default |
Variance estimation via parametric moving block bootstrap
Description
This function is deprecated; use dsm_var_movblk instead.
Usage
dsm.var.movblk(
dsm.object,
pred.data,
n.boot,
block.size,
off.set,
ds.uncertainty = FALSE,
samp.unit.name = "Transect.Label",
progress.file = NULL,
bs.file = NULL,
bar = TRUE
)
Arguments
dsm.object |
object returned from |
pred.data |
either: a single prediction grid or list of prediction
grids. Each grid should be a |
n.boot |
number of bootstrap resamples. |
block.size |
number of segments in each block. |
off.set |
a vector or list of vectors with as many elements as there
are in |
ds.uncertainty |
incorporate uncertainty in the detection function? See Details, below. Note that this feature is EXPERIMENTAL at the moment. |
samp.unit.name |
name sampling unit to resample (default 'Transect.Label'). |
progress.file |
path to a file to be used (usually by Distance) to
generate a progress bar (default |
bs.file |
path to a file to store each bootstrap round. This stores all
of the bootstrap results rather than just the summaries, enabling
outliers to be detected and removed. (Default |
bar |
should a progress bar be printed to screen? (Default |
Prediction variance propagation for DSMs
Description
This function is deprecated; use dsm_var_prop instead.
Usage
dsm.var.prop(
dsm.obj,
pred.data,
off.set,
seglen.varname = "Effort",
type.pred = "response"
)
Arguments
dsm.obj |
a model object fitted by |
pred.data |
either: a single prediction grid or list of prediction
grids. Each grid should be a |
off.set |
a vector or list of vectors with as many elements as there
are in |
seglen.varname |
name for the column which holds the segment length
(default value |
type.pred |
should the predictions be on the "response" or "link"
scale? (default |
Check for autocorrelation in residuals
Description
Once a DSM has been fitted to data, this function can be used to check for autocorrelation in the residuals.
Usage
dsm_cor(
dsm.obj,
Transect.Label = "Transect.Label",
Segment.Label = "Segment.Label",
max.lag = 10,
resid.type = "scaled.pearson",
fun = cor,
ylim = c(0, 1),
subset = "all",
...
)
Arguments
dsm.obj |
a fitted dsm object. |
Transect.Label |
label for the transect (default: |
Segment.Label |
label for the segments (default: |
max.lag |
maximum lag to calculate at. |
resid.type |
the type of residuals used, see
|
fun |
the function to use, by default |
ylim |
user defined limits in y direction. |
subset |
which subset of the data should the correlation function be calculated on? |
... |
other options to pass to |
Value
a plot or a vector of fun
applied at the lags.
Details
Within each Transect.Label
, segments will be sorted
according to their Segment.Label
s. This may require some time to get right
for your particular data. If one has multiple surveys where transects are
revisited, for example, one may want to make Transect.Label
a unique
transect-survey identifier. Neither label needs to be included in the model; they must simply be present in the $data field of the model object. This usually means that they have to be in the segment data passed to dsm.
The current iteration of this function only plots the correlations; for anything beyond that, assign the result to an object to obtain the underlying values.
If there are NA values in the residuals then the correlogram will not be
calculated. This usually occurs due to NA
values in the covariates (so the
smoother will not have fitted values there). Code like
any(is.na(dsm.obj$data))
might be helpful.
Author(s)
David L. Miller
Examples
library(Distance)
library(dsm)
# load the data, see ?mexdolphins
data(mexdolphins)
# fit a model
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
mod1 <- dsm(count~s(x,y), hr.model, segdata, obsdata)
# look at lag 1 differences up to a maximum of lag 9, using deviance
# residuals
dsm_cor(mod1, resid.type="deviance", max.lag=9,
Segment.Label="Sample.Label")
Prediction variance estimation assuming independence
Description
If one is willing to assume that the detection function and spatial model are independent, this function will produce estimates of the variance of predicted abundance, using the result that squared coefficients of variation add.
Usage
dsm_var_gam(
dsm.obj,
pred.data,
off.set,
seglen.varname = "Effort",
type.pred = "response"
)
Arguments
dsm.obj |
a model object fitted by |
pred.data |
either: a single prediction grid or list of prediction
grids. Each grid should be a |
off.set |
a vector or list of vectors with as many elements as there
are in |
seglen.varname |
name for the column which holds the segment length
(default value |
type.pred |
should the predictions be on the "response" or "link"
scale? (default |
Value
a list with elements:
- model: the fitted model object
- pred.var: variance of the regions given in pred.data
- bootstrap: logical, always FALSE
- model: the fitted model with the extra term
- dsm.object: the original model (dsm.obj above)
Author(s)
David L. Miller
Examples
## Not run:
library(Distance)
library(dsm)
# load the Gulf of Mexico dolphin data (see ?mexdolphins)
data(mexdolphins)
# fit a detection function and look at the summary
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
summary(hr.model)
# fit a simple smooth of x and y
mod1 <- dsm(count~s(x, y), hr.model, segdata, obsdata)
# Calculate the variance
# this will give a summary over the whole area in mexdolphins$preddata
mod1.var <- dsm_var_gam(mod1, preddata, off.set=preddata$area)
## End(Not run)
Variance estimation via parametric moving block bootstrap
Description
Estimate the variance in abundance over an area using a moving block bootstrap. Two procedures are implemented: one incorporating detection function uncertainty and one not.
Usage
dsm_var_movblk(
dsm.object,
pred.data,
n.boot,
block.size,
off.set,
ds.uncertainty = FALSE,
samp.unit.name = "Transect.Label",
progress.file = NULL,
bs.file = NULL,
bar = TRUE
)
Arguments
dsm.object |
object returned from |
pred.data |
either: a single prediction grid or list of prediction
grids. Each grid should be a |
n.boot |
number of bootstrap resamples. |
block.size |
number of segments in each block. |
off.set |
a vector or list of vectors with as many elements as there
are in |
ds.uncertainty |
incorporate uncertainty in the detection function? See Details, below. Note that this feature is EXPERIMENTAL at the moment. |
samp.unit.name |
name sampling unit to resample (default 'Transect.Label'). |
progress.file |
path to a file to be used (usually by Distance) to
generate a progress bar (default |
bs.file |
path to a file to store each bootstrap round. This stores all
of the bootstrap results rather than just the summaries, enabling
outliers to be detected and removed. (Default |
bar |
should a progress bar be printed to screen? (Default |
Details
Setting ds.uncertainty=TRUE
will incorporate detection function
uncertainty directly into the bootstrap. This is done by generating
observations from the fitted detection function and then re-fitting a new
detection function (of the same form), then calculating a new effective
strip width. Rejection sampling is used to generate the observations
(except in the half-normal case) so the procedure can be rather slow. Note
that this is currently not supported with covariates in the detection
function.
Setting ds.uncertainty=FALSE
will incorporate detection function
uncertainty using the delta method. This assumes that the detection
function and the spatial model are INDEPENDENT. This is probably not
reasonable.
Examples
## Not run:
library(Distance)
library(dsm)
# load the Gulf of Mexico dolphin data (see ?mexdolphins)
data(mexdolphins)
# fit a detection function and look at the summary
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
summary(hr.model)
# fit a simple smooth of x and y
mod1 <- dsm(count~s(x, y), hr.model, segdata, obsdata)
summary(mod1)
# calculate the variance by 500 moving block bootstraps
mod1.movblk <- dsm_var_movblk(mod1, preddata, n.boot = 500,
block.size = 3, samp.unit.name = "Transect.Label",
off.set = preddata$area,
bar = TRUE, bs.file = "mexico-bs.csv", ds.uncertainty = TRUE)
## End(Not run)
Prediction variance propagation for DSMs
Description
To ensure that uncertainty from the detection function is correctly propagated to the final variance estimate of abundance, this function uses the method first detailed in Williams et al. (2011); further explanation is given in Bravington et al. (2021).
Usage
dsm_var_prop(
dsm.obj,
pred.data,
off.set,
seglen.varname = "Effort",
type.pred = "response"
)
Arguments
dsm.obj |
a model object fitted by |
pred.data |
either: a single prediction grid or list of prediction
grids. Each grid should be a |
off.set |
a vector or list of vectors with as many elements as there
are in |
seglen.varname |
name for the column which holds the segment length
(default value |
type.pred |
should the predictions be on the "response" or "link"
scale? (default |
Details
The idea is to refit the spatial model but include an extra random effect. This random effect has zero mean and hence no effect on point estimates. Its variance is given by the Hessian of the detection function. Variance estimates then incorporate detection function uncertainty. Further mathematical details are given in the references below.
Many prediction grids can be supplied by supplying a list of data.frame
s
to the function.
Note that this routine simply calls dsm_varprop
. If you
don't require multiple prediction grids, the other routine will probably be
faster.
This routine is only useful if a detection function with covariates has been used in the DSM.
Value
a list with elements:
- model: the fitted model object
- pred.var: variance of each region given in pred.data
- bootstrap: logical, always FALSE
- pred.data: as above
- off.set: as above
- model: the fitted model with the extra term
- dsm.object: the original model, as above
- model.check: simple check of subtracting the coefficients of the two models to see if there is a large difference
- deriv: numerically calculated Hessian of the offset
Diagnostics
The summary output from the function includes a simple diagnostic that shows
the average probability of detection from the "original" fitted model (the
model supplied to this function; column Fitted.model
) and the probability
of detection from the refitted model (used for variance propagation; column
Refitted.model
) along with the standard error of the probability of
detection from the fitted model (Fitted.model.se
), at the unique values of
any factor covariates used in the detection function (for continuous
covariates the 5%, 50% and 95% quantiles are shown). If there are large
differences between the probabilities of detection then there are
potentially problems with the fitted model, the variance propagation or
both. This can be because the fitted model does not account for enough of the variability in the data and, on refitting, the variance model accounts for this via the random effect.
Limitations
Note that this routine is only useful if a detection function has been used
in the DSM. It cannot be used when the abundance.est
or density.est
responses are used. Importantly this requires that if the detection function
has covariates, then these do not vary within a segment (so, for example
covariates like sex cannot be used).
Author(s)
Mark V. Bravington, Sharon L. Hedley. Bugs added by David L. Miller.
References
Bravington, M. V., Miller, D. L., & Hedley, S. L. (2021). Variance Propagation for Density Surface Models. Journal of Agricultural, Biological and Environmental Statistics. https://doi.org/10.1007/s13253-021-00438-2
Williams, R., Hedley, S.L., Branch, T.A., Bravington, M.V., Zerbini, A.N. and Findlay, K.P. (2011). Chilean Blue Whales as a Case Study to Illustrate Methods to Estimate Abundance and Evaluate Conservation Status of Rare Species. Conservation Biology 25(3), 526-535.
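Examples
A minimal sketch, assuming the mexdolphins objects (distdata, segdata, obsdata, preddata) used elsewhere in this manual and that distdata contains a beaufort column to serve as a detection function covariate.
## Not run:
library(Distance)
library(dsm)
# load the Gulf of Mexico dolphin data (see ?mexdolphins)
data(mexdolphins)
# detection function with a segment-level covariate, so that variance
# propagation is worthwhile
hr.beau <- ds(distdata, truncation=6000, key = "hr",
              adjustment = NULL, formula = ~beaufort)
mod1 <- dsm(count~s(x,y), hr.beau, segdata, obsdata)
# propagate detection function uncertainty into the abundance variance
mod1.varprop <- dsm_var_prop(mod1, preddata, off.set = preddata$area)
summary(mod1.varprop)
## End(Not run)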
Variance propagation for density surface models
Description
Calculate the uncertainty in predictions from a fitted DSM, including uncertainty from the detection function.
Usage
dsm_varprop(
model,
newdata = NULL,
trace = FALSE,
var.type = "Vp",
var_type = NULL
)
Arguments
model |
a fitted |
newdata |
the prediction grid. Set to |
trace |
for debugging, see how the scale parameter estimation is going. |
var.type |
which variance-covariance matrix should be used ( |
var_type |
deprecated, use |
Details
When we make predictions from a spatial model, we also want to know the uncertainty about that abundance estimate. Since density surface models are 2 (or more) stage models, we need to incorporate the uncertainty from the earlier stages (i.e. the detection function) into our "final" uncertainty estimate.
This function will refit the spatial model but include the Hessian of the offset as an extra term. Variance estimates using this new model can then be used to calculate the variance of predicted abundance estimates which incorporate detection function uncertainty. Importantly this requires that if the detection function has covariates, then these do not vary within a segment (so, for example covariates like sex cannot be used).
For more information on how to construct the prediction grid data.frame
,
newdata
, see predict.dsm
.
This routine is only useful if a detection function with covariates has been used in the DSM.
Note that we can use var.type="Vc"
here (see gamObject
), which is the
variance-covariance matrix for the spatial model, corrected for smoothing
parameter uncertainty. See Wood, Pya and Säfken (2016) for more information.
Models with fixed scale parameters (e.g., negative binomial) do not require an extra round of optimisation.
Value
a list with elements:
- old_model: the fitted model supplied to the function as model
- refit: refitted model object, with the extra term
- pred: point estimates of predictions at newdata
- var: total variance calculated over all of newdata
- ses: standard error for each prediction cell in newdata
If newdata=NULL then the last three entries are NA.
Diagnostics
The summary output from the function includes a simple diagnostic that shows
the average probability of detection from the "original" fitted model (the
model supplied to this function; column Fitted.model
) and the probability
of detection from the refitted model (used for variance propagation; column
Refitted.model
) along with the standard error of the probability of
detection from the fitted model (Fitted.model.se
), at the unique values of
any factor covariates used in the detection function (for continuous
covariates the 5%, 50% and 95% quantiles are shown). If there are large
differences between the probabilities of detection then there are
potentially problems with the fitted model, the variance propagation or
both. This can be because the fitted model does not account for enough of the variability in the data and, on refitting, the variance model accounts for this via the random effect.
Author(s)
David L. Miller, based on code from Mark V. Bravington and Sharon L. Hedley.
References
Bravington, M. V., Miller, D. L., & Hedley, S. L. (2021). Variance Propagation for Density Surface Models. Journal of Agricultural, Biological and Environmental Statistics. https://doi.org/10.1007/s13253-021-00438-2
Williams, R., Hedley, S.L., Branch, T.A., Bravington, M.V., Zerbini, A.N. and Findlay, K.P. (2011). Chilean Blue Whales as a Case Study to Illustrate Methods to Estimate Abundance and Evaluate Conservation Status of Rare Species. Conservation Biology 25(3), 526-535.
Wood, S.N., Pya, N. and Säfken, B. (2016) Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association, 1-45.
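Examples
A minimal sketch, assuming the mexdolphins objects used elsewhere in this manual, a detection function with a covariate (here beaufort, assumed segment-level) and that prediction cell areas are stored in preddata$area.
## Not run:
library(Distance)
library(dsm)
data(mexdolphins)
hr.beau <- ds(distdata, truncation=6000, key = "hr",
              adjustment = NULL, formula = ~beaufort)
mod1 <- dsm(count~s(x,y), hr.beau, segdata, obsdata)
# the prediction grid needs an off.set column giving cell areas
preddata$off.set <- preddata$area
# refit with the variance propagation correction and summarise
vp <- dsm_varprop(mod1, newdata = preddata)
summary(vp)
## End(Not run)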
Detection function objects when detection is certain
Description
Create a detection function object for strip/plot surveys for use in density surface models.
Usage
dummy_ddf(object, size = 1, width, left = 0, transect = "line")
Arguments
object |
numeric vector of object identifiers, relating to the |
size |
group size for each observation (default all groups size 1) |
width |
right truncation |
left |
left truncation (default 0, no left truncation) |
transect |
|
Author(s)
David L Miller
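Examples
A minimal sketch, assuming a strip half-width of 500 (in the same units as the other distances) and the obsdata/segdata objects used elsewhere in this manual.
## Not run:
# build a "certain detection" object for a strip transect survey
strip <- dummy_ddf(object = obsdata$object, size = obsdata$size, width = 500)
# use it in place of a fitted detection function
mod_strip <- dsm(count ~ s(x, y), strip, segdata, obsdata)
## End(Not run)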
Generate data from a fitted detection function
Description
When dsm.var.movblk is used with ds.uncertainty=TRUE, this procedure generates data from the fitted detection function (assuming that it is correct).
Usage
generate.ds.uncertainty(ds.object)
Arguments
ds.object |
a fitted detection function object (as returned by a call
to |
Note
This function changes the random number generator seed. To avoid any
potential side-effects, use something like: seed <- get(".Random.seed",envir=.GlobalEnv)
before running code and
assign(".Random.seed",seed,envir=.GlobalEnv)
after.
Author(s)
David L. Miller
Moving block bootstrap sampler
Description
Not usually used on its own, called from within
dsm.var.movblk
.
Usage
generate.mb.sample(
num.blocks.required,
block.size,
which.blocks,
dsm.data,
unit.info,
n.units
)
Arguments
num.blocks.required |
number of blocks that we need. |
block.size |
number of segments per block. |
which.blocks |
which blocks should be sampled. |
dsm.data |
the |
unit.info |
result of calling |
n.units |
number of sampling units. |
Value
vector of log-residuals
Convert latitude and longitude to Northings and Eastings
Description
Convert longitude and latitude co-ordinates to kilometres west-east and
south-north from axes through (lon0
,lat0
) using the
"spherical law of cosines".
Usage
latlong2km(lon, lat, lon0 = sum(range(lon))/2, lat0 = sum(range(lat))/2)
Arguments
lon |
longitude |
lat |
latitude |
lon0 |
longitude reference point (defaults to the midpoint of the longitude range) |
lat0 |
latitude reference point (defaults to the midpoint of the latitude range) |
Details
WARNING: This is an approximate procedure for converting between latitude/
longitude and Northing/Easting. Consider using projection conversions
available in packages sp
, sf
and rgdal
for better results.
Value
list with elements km.e
and km.n
.
Author(s)
Simon N. Wood
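Examples
A quick sketch with made-up positions in the Gulf of Mexico; by default the reference point is the centre of the supplied coordinates.
lons <- c(-88.5, -88.1, -87.9)
lats <- c(27.5, 27.8, 28.1)
# kilometres east and north of the reference point
latlong2km(lons, lats)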
Create a knot grid for the internal part of a soap film smoother.
Description
This routine simply creates a grid of knots (in the correct format) to be used in the "internal" part of the soap film smoother
Usage
make.soapgrid(bnd, n.grid)
Arguments
bnd |
|
n.grid |
either one number giving the number of points along the |
Value
a list with elements x
and y
, containing the knot locations.
Author(s)
David L Miller
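Examples
A small sketch, assuming the boundary can be supplied as a list with x and y elements (check the bnd argument for your version): a unit-square boundary with a 10 by 10 candidate knot grid.
bnd <- list(x = c(0, 1, 1, 0, 0), y = c(0, 0, 1, 1, 0))
knots <- make.soapgrid(bnd, n.grid = 10)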
Pan-tropical spotted dolphins in the Gulf of Mexico
Description
Data from a combination of several NOAA shipboard surveys conducted on pan-tropical spotted dolphins in the Gulf of Mexico. The data consist of 47 observations of groups of dolphins. The group size was recorded, as well as the Beaufort sea state at the time of the observation. Coordinates for each observation and bathymetry data were also available as covariates for the analysis. A complete example analysis (and description of the data) is provided at http://distancesampling.org/R/vignettes/mexico-analysis.html.
References
Halpin, P.N., A.J. Read, E. Fujioka, B.D. Best, B. Donnelly, L.J. Hazen, C. Kot, K. Urian, E. LaBrecque, A. Dimatteo, J. Cleary, C. Good, L.B. Crowder, and K.D. Hyrenbach. 2009. OBIS-SEAMAP: The world data center for marine mammal, sea bird, and sea turtle distributions. Oceanography 22(2):104-115
NOAA Southeast Fisheries Science Center. 1996. Report of a Cetacean Survey of Oceanic and Selected Continental Shelf Waters of the Northern Gulf of Mexico aboard NOAA Ship Oregon II (Cruise 220)
Observed versus expected diagnostics for fitted DSMs
Description
Given a covariate, calculate the observed and expected counts for each unique value of the covariate. This can be a useful goodness of fit check for DSMs.
Usage
obs_exp(model, covar, cut = NULL)
Arguments
model |
a fitted |
covar |
covariate to aggregate by ( |
cut |
vector of cut points to aggregate at. If not supplied, the unique
values of |
Details
One strategy for model checking is to calculate observed and expected counts at different aggregations of the variable. If these match well then the model fit is good.
Value
data.frame
with values of observed and expected counts.
Author(s)
David L Miller, on the suggestion of Mark Bravington.
Examples
## Not run:
library(Distance)
library(dsm)
# example with the Gulf of Mexico dolphin data
data(mexdolphins)
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
mod1 <- dsm(count~s(x,y), hr.model, segdata, obsdata)
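# observed vs expected counts aggregated by depth (a sketch; this assumes
# segdata contains a depth column, as in the mexdolphins data, and the
# cut points below are illustrative)
obs_exp(mod1, "depth", cut = c(0, 100, 200, 500, 1000, max(segdata$depth)))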
## End(Not run)
Plot a density surface model.
Description
See plot.gam
.
Usage
## S3 method for class 'dsm'
plot(x, ...)
Arguments
x |
a model fitted by |
... |
other arguments passed to |
Value
a plot!
Author(s)
David L. Miller
See Also
Create plots of abundance uncertainty
Description
Note that the prediction data set must have x
and y
columns even if
these were not used in the model.
Usage
## S3 method for class 'dsm.var'
plot(
x,
poly = NULL,
limits = NULL,
breaks = NULL,
legend.breaks = NULL,
xlab = "x",
ylab = "y",
observations = TRUE,
plot = TRUE,
boxplot.coef = 1.5,
x.name = "x",
y.name = "y",
gg.grad = NULL,
...
)
Arguments
x |
a |
poly |
a |
limits |
limits for the fill colours |
breaks |
breaks for the colour fill |
legend.breaks |
breaks as they should be displayed |
xlab |
label for the |
ylab |
label for the |
observations |
should observations be plotted? |
plot |
actually plot the map, or just return a |
boxplot.coef |
control trimming (as in
|
x.name |
name of the variable to plot as the |
y.name |
name of the variable to plot as the |
gg.grad |
optional |
... |
any other arguments |
Value
a plot
Details
In order to get plotting to work with dsm_var_prop
and
dsm_var_gam
, one must first format the data correctly
since these functions are designed to compute very general summaries. One
summary is calculated for each element of the list pred
supplied to
dsm_var_prop
and dsm_var_gam
.
For a plot of uncertainty over a prediction grid, pred
(a data.frame
),
say, we can create the correct format by simply using pred.new <- split(pred,1:nrow(pred))
.
Author(s)
David L. Miller
See Also
dsm_var_prop
, dsm_var_gam
,
dsm_var_movblk
Spatially plot predictions per model term
Description
Plot the effect of each smooth term in the model spatially. Plots are made on the same scale, so that the relative influence of each smooth can be seen.
Usage
plot_pred_by_term(dsm.obj, data, location.cov = c("x", "y"))
Arguments
dsm.obj |
fitted |
data |
data to use to plot (often the same as the prediction grid), data
should also include |
location.cov |
deprecated, use |
Value
a ggplot2
plot
Author(s)
David L Miller (idea taken from inlabru
)
Examples
## Not run:
library(Distance)
library(dsm)
# load the Gulf of Mexico dolphin data and fit a model
data(mexdolphins)
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
mod1 <- dsm(count~s(x,y) + s(depth), hr.model, segdata, obsdata)
preddata$width <- preddata$height <- sqrt(preddata$area)
# make the plot
plot_pred_by_term(mod1, preddata, c("x","y"))
# better plot would be
# library(viridis)
# plot_pred_by_term(mod1, preddata, c("x","y")) + scale_fill_viridis()
## End(Not run)
Predict from a fitted density surface model
Description
Make predictions of density or abundance outside (or inside) the covered area.
Usage
## S3 method for class 'dsm'
predict(object, newdata = NULL, off.set = NULL, type = "response", ...)
Arguments
object |
a fitted |
newdata |
spatially referenced covariates e.g. altitude, depth,
distance to shore, etc. Covariates in the |
off.set |
area of each of the cells in the prediction grid. Should be
in the same units as the segments/distances given to |
type |
what scale should the results be on. The default is
|
... |
any other arguments passed to |
Details
If newdata
is not supplied, predictions are made for the data that built
the model. Note that the order of the results will not necessarily be the
same as the segdata
(segment data) data.frame
that was supplied to
dsm
.
The area off.set
is used if that argument is supplied, otherwise it will
look for the areas in the column named off.set
in the newdata
. Either
way the link function (usually log
) will be applied to the offsets, so
there is no need to log them before passing them to this function.
Value
predicted values on the response scale by default (unless type
is
specified, in which case see predict.gam
).
Author(s)
David L. Miller
See Also
predict.gam
, dsm_var_gam
,
dsm_var_prop
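Examples
A sketch, continuing from the model mod1 fitted in the dsm examples and using the mexdolphins preddata prediction grid.
## Not run:
# per-cell predicted abundance on the response scale
mod1.pred <- predict(mod1, preddata, off.set = preddata$area)
# total abundance over the prediction grid
sum(mod1.pred)
# predictions for the data used to fit the model
fitted_vals <- predict(mod1)
## End(Not run)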
Prediction for fake detection functions
Description
Prediction function for dummy detection functions. The function returns as
many 1s as there are rows in newdata
. If esw=TRUE
then the
strip width is returned.
Usage
## S3 method for class 'fake_ddf'
predict(
object,
newdata = NULL,
compute = FALSE,
int.range = NULL,
esw = FALSE,
...
)
Arguments
object |
model object |
newdata |
how many 1s should we return? |
compute |
unused, compatibility with |
int.range |
unused, compatibility with |
esw |
should the strip width be returned? |
... |
for S3 consistency |
Author(s)
David L Miller
Print a description of a density surface model object
Description
This method just gives a short description of the fitted model. Use the
summary.dsm
method for more information.
Usage
## S3 method for class 'dsm'
print(x, ...)
Arguments
x |
a model fitted by |
... |
unspecified and unused arguments for S3 consistency |
Value
NULL
Author(s)
David L. Miller
Print a description of a density surface model variance object
Description
This method only provides a short summary; use the summary.dsm.var method for more information.
Usage
## S3 method for class 'dsm.var'
print(x, ...)
Arguments
x |
a |
... |
unspecified and unused arguments for S3 consistency |
Value
NULL
Author(s)
David L. Miller
See Also
Print a description of a density surface model variance object
Description
This method only provides a short summary; see summary.dsm_varprop.
Usage
## S3 method for class 'dsm_varprop'
print(x, ...)
Arguments
x |
a |
... |
unspecified and unused arguments for S3 consistency |
Author(s)
David L. Miller
See Also
Print summary of density surface model variance object
Description
See summary.dsm.var
for information.
Usage
## S3 method for class 'summary.dsm.var'
print(x, ...)
Arguments
x |
a summary of |
... |
unspecified and unused arguments for S3 consistency |
Value
NULL
Author(s)
David L. Miller
See Also
Print summary of density surface model variance object
Description
See summary.dsm_varprop
for information.
Usage
## S3 method for class 'summary.dsm_varprop'
print(x, ...)
Arguments
x |
a summary of |
... |
unspecified and unused arguments for S3 consistency |
Value
NULL
Author(s)
David L. Miller
See Also
Randomised quantile residuals check plot for GAMs/DSMs
Description
This function is deprecated; use rqgam_check instead.
Usage
rqgam.check(gam.obj, ...)
Arguments
gam.obj |
|
... |
arguments passed on to all plotting functions |
Randomised quantile residuals check plot for GAMs/DSMs
Description
Reproduces the "Resids vs. linear pred" plot from
gam.check
but using randomised quantile residuals, a la
Dunn and Smyth (1996). Check for heteroskedasticity as usual, looking for "funnel"-type structures in the points; this is much easier with randomised quantile residuals than with deviance residuals when your model uses a count distribution as the response.
Usage
rqgam_check(gam.obj, ...)
Arguments
gam.obj |
|
... |
arguments passed on to all plotting functions |
Details
Note that this function only works with negative binomial and Tweedie response distributions.
Earlier versions of this function produced the full
gam.check
output, but this was confusing as only one of
the plots was really useful. Checks of k are not computed; these need to be done using gam.check.
Value
just plots!
Author(s)
Based on code by Natalie Kelly, bugs added by Dave Miller
Examples
library(Distance)
library(dsm)
library(tweedie)
# load the Gulf of Mexico dolphin data (see ?mexdolphins)
data(mexdolphins)
# fit a detection function and look at the summary
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
# fit a simple smooth of x and y with a Tweedie response with estimated
# p parameter
mod1 <- dsm(count~s(x, y), hr.model, segdata, obsdata, family=tw())
rqgam_check(mod1)
Summarize a fitted density surface model
Description
Gives a brief summary of a fitted dsm
object.
Usage
## S3 method for class 'dsm'
summary(object, ...)
Arguments
object |
a model fitted by |
... |
other arguments passed to |
Value
a summary object
Author(s)
David L. Miller
See Also
Summarize the variance of a density surface model
Description
Gives a brief summary of a fitted dsm
variance object.
Usage
## S3 method for class 'dsm.var'
summary(
object,
alpha = 0.05,
boxplot.coef = 1.5,
bootstrap.subregions = NULL,
...
)
Arguments
object |
a |
alpha |
alpha level for confidence intervals (default 0.05 to give a 95% confidence interval) |
boxplot.coef |
the value of |
bootstrap.subregions |
list of vectors of logicals or indices for
subregions for which variances need to be calculated (only for bootstraps
(see |
... |
unused arguments for S3 compatibility |
Value
a summary object
Author(s)
David L. Miller
See Also
Summarize the variance of a density surface model
Description
Gives a brief summary of a fitted dsm_varprop
variance
object.
Usage
## S3 method for class 'dsm_varprop'
summary(object, alpha = 0.05, ...)
Arguments
object |
a |
alpha |
alpha level for confidence intervals (default 0.05 to give a 95% confidence interval) |
... |
unused arguments for S3 compatibility |
Value
a summary object
Author(s)
David L. Miller
See Also
Trimmed variance
Description
Trim the variance estimates from the bootstrap. The trimming percentage is defined as the amount necessary to bring the median and trimmed mean within 8% of each other; the values removed by this trimming are defined as 'outliers'.
Usage
trim.var(untrimmed.bootstraps, boxplot.coef = 1.5)
Arguments
untrimmed.bootstraps |
(usually the |
boxplot.coef |
the value of |
Value
trimmed variance
Author(s)
Louise Burt
Visualise concurvity between terms in a GAM
Description
This function is deprecated; use vis_concurvity instead.
Usage
vis.concurvity(model, type = "estimate")
Arguments
model |
fitted model |
type |
concurvity measure to plot, see |
Visualise concurvity between terms in a GAM
Description
Plot measures of how much one term in the model could be explained by another. When values are high, one should consider re-running variable selection with one of the offending variables removed to check for stability in term selection.
Usage
vis_concurvity(model, type = "estimate")
Arguments
model |
fitted model |
type |
concurvity measure to plot, see |
Details
These methods are considered somewhat experimental at this time. Consult
concurvity
for more information on how concurvity
measures are calculated.
Author(s)
David L Miller
Examples
## Not run:
library(Distance)
library(dsm)
# load the Gulf of Mexico dolphin data (see ?mexdolphins)
data(mexdolphins)
# fit a detection function and look at the summary
hr.model <- ds(distdata, truncation=6000,
key = "hr", adjustment = NULL)
# fit a simple smooth of x and y to counts
mod1 <- dsm(count~s(x,y)+s(depth), hr.model, segdata, obsdata)
# visualise concurvity using the "estimate" metric
vis_concurvity(mod1)
## End(Not run)