Help for package iNEXT.4steps

Type:

Package

Title:

Four-Step Biodiversity Analysis Based on 'iNEXT'

Version:

1.0.1

Author:

Anne Chao [aut, cre], KaiHsiang Hu [ctb]

Maintainer:

Anne Chao <chao@stat.nthu.edu.tw>

URL:

https://sites.google.com/view/chao-lab-website/software/inext-4steps/

Description:

Expands 'iNEXT' to include the estimation of sample completeness and evenness. The package provides simple functions to perform the following four-step biodiversity analysis: STEP 1: Assessment of sample completeness profiles. STEP 2a: Analysis of size-based rarefaction and extrapolation sampling curves to determine whether the asymptotic diversity can be accurately estimated. STEP 2b: Comparison of the observed and the estimated asymptotic diversity profiles. STEP 3: Analysis of non-asymptotic coverage-based rarefaction and extrapolation sampling curves. STEP 4: Assessment of evenness profiles. The analyses in STEPs 2a, 2b and STEP 3 are mainly based on the previous 'iNEXT' package. Refer to the 'iNEXT' package for details. This package is mainly focusing on the computation for STEPs 1 and 4. See Chao et al. (2020) <doi:10.1111/1440-1703.12102> for statistical background.

License:

GPL (≥ 3)

Depends:

R (≥ 4.0)

Imports:

ggplot2, reshape2, dplyr, stats, ggpubr, purrr, iNEXT.3D

Suggests:

testthat, knitr, rmarkdown

Encoding:

UTF-8

BugReports:

https://github.com/KaiHsiangHu/iNEXT.4steps/issues

LazyData:

true

RoxygenNote:

7.2.3

VignetteBuilder:

knitr

ByteCompile:

true

NeedsCompilation:

Packaged:

2024-06-17 15:07:38 UTC; stat-pc

Repository:

CRAN

Date/Publication:

2024-06-18 09:10:02 UTC

Four-step biodiversity analysis based on iNEXT

Description

This package expands iNEXT (Chao et al. 2014) to include the estimation of sample completeness and evenness under a unified framework of Hill numbers. iNEXT.4steps links sample completeness, diversity estimation, interpolation and extrapolation (iNEXT), and evenness in a fully integrated approach. An online version of iNEXT.4steps is also available for users without an R background: https://chao.shinyapps.io/iNEXT_4steps/.
The pertinent background for the four-step methodology is provided in Chao et al. (2020). The four-step procedures are described in the following:

STEP 1. Assessment of sample completeness profile

Before performing biodiversity analysis, it is important to first quantify the sample completeness of a biological survey. Chao et al. (2020) generalized the conventional sample completeness to a class of measures parametrized by an order q ≥ 0. When q = 0, sample completeness reduces to the conventional measure of completeness, i.e., the ratio of the observed species richness to the true richness (observed plus undetected). When q = 1, the measure reduces to the sample coverage (the proportion of the total number of individuals in the entire assemblage that belong to detected species), a concept originally developed by Alan Turing in his cryptographic analysis during WWII. When q = 2, it represents a generalized sample coverage with each species being proportionally weighted by its squared species abundance (i.e., each individual being proportionally weighted by its species abundance); this measure is thus disproportionately sensitive to highly abundant species. For a general order q ≥ 0 (not necessarily an integer), the sample completeness of order q quantifies the proportion of the assemblage's individuals belonging to detected species, with each individual being proportionally weighted by the (q-1)th power of its abundance. The sample completeness profile depicts this estimate with respect to order q ≥ 0; the profile fully characterizes the sample completeness of a biological survey.

iNEXT.4steps features the estimated sample-completeness profile for all orders of q ≥ 0 based on the methodology developed in Chao et al. (2020). All estimates are theoretically between 0 and 1. If the estimated sample completeness profile is a horizontal line at the level of unity for all orders of q ≥ 0, then the survey is complete, implying there is no undetected diversity. In most applications, the estimated profile increases with order q, revealing the existence of undetected diversity. The sample completeness estimate for q = 0 provides an upper bound for the proportion of observed species; its complement represents a lower bound for the proportion of undetected species. This interpretation arises because data typically do not contain sufficient information to accurately estimate species richness, and only a lower bound of species richness can be well estimated. By contrast, when data are not sparse, the sample completeness for q ≥ 1 can be estimated very accurately. The values for q ≥ 2 are typically very close to unity, signifying that almost all highly abundant species (for abundance data) or highly frequent species (for incidence data) have been detected in the reference sample.
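
To make the q = 0 and q = 1 cases concrete, the following base-R sketch computes both from a made-up abundance vector. It is not the package's internal code: it assumes the q = 0 estimate is the observed richness divided by the Chao1 richness estimate, and the q = 1 estimate is the standard sample coverage estimator built from singleton and doubleton counts. The vector x and all variable names are hypothetical.

```r
# Sketch only: textbook estimators, not the iNEXT.4steps implementation.
# x is a hypothetical vector of species abundances in a reference sample.
x <- c(50, 30, 10, 5, 2, 2, 1, 1, 1, 1)

n     <- sum(x)        # reference sample size
S_obs <- sum(x > 0)    # observed species richness
f1    <- sum(x == 1)   # number of singletons
f2    <- sum(x == 2)   # number of doubletons

# Chao1 lower bound of true richness (this form assumes f2 > 0)
S_chao1 <- S_obs + (n - 1) / n * f1^2 / (2 * f2)

# Order q = 0: observed richness relative to estimated richness
SC_q0 <- S_obs / S_chao1

# Order q = 1: estimated sample coverage (this form assumes f2 > 0)
SC_q1 <- 1 - (f1 / n) * ((n - 1) * f1 / ((n - 1) * f1 + 2 * f2))

round(c(q0 = SC_q0, q1 = SC_q1), 3)
```

In this toy sample the q = 0 value is well below the q = 1 value, matching the typical pattern of a profile that increases with q.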

STEP 2. Analysis of the size-based rarefaction and extrapolation sampling curves, and the asymptotic diversity profile for q between 0 and 2

(STEP 2a). For each dataset, first examine the pattern of the size-based rarefaction and extrapolation sampling curve up to double the reference sample size for q = 0, 1 and 2. If the curve stays at a fixed level (this often occurs for the measures of q = 1 and 2), then our asymptotic estimate presented in Step 2b can be used to accurately infer the true diversity of the entire assemblage. Otherwise, our asymptotic diversity estimate represents only a lower bound (this often occurs for the measures of q = 0).

(STEP 2b). When the true diversity can be accurately inferred, the extent of undetected diversity within each dataset is obtained by comparing the estimated asymptotic diversity profile and empirical profile; the difference in diversity between any two assemblages can be evaluated and tested for significance.
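
The empirical (observed) diversity profile mentioned in STEP 2b is the Hill number of the sample proportions evaluated over a grid of orders q. The base-R sketch below illustrates that calculation only; the asymptotic estimates themselves come from the package and are not reproduced here. The function name hill_number and the data vector are hypothetical.

```r
# Sketch only: observed Hill numbers, not the asymptotic estimators.
hill_number <- function(x, q) {
  p <- x[x > 0] / sum(x)                  # relative abundances of detected species
  if (abs(q - 1) < 1e-8) {
    exp(-sum(p * log(p)))                 # q = 1: exponential of Shannon entropy
  } else {
    sum(p^q)^(1 / (1 - q))                # general order q
  }
}

x      <- c(50, 30, 10, 5, 2, 2, 1, 1, 1, 1)   # hypothetical abundances
q_grid <- seq(0, 2, by = 0.2)                   # same grid as the package default
profile <- sapply(q_grid, function(q) hill_number(x, q))

data.frame(Order.q = q_grid, TD_obs = round(profile, 3))
```

The profile starts at the observed richness for q = 0 and never increases with q, because higher orders place more weight on abundant species.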

STEP 3. Analysis of non-asymptotic coverage-based rarefaction and extrapolation analysis for orders q = 0, 1 and 2

When sampling data do not contain sufficient information to accurately infer true diversity, fair comparisons of diversity across multiple assemblages should be made by standardizing the sample coverage (i.e., comparing diversity for a standardized fraction of an assemblage's individuals). This comparison can be done based on seamless integration of coverage-based rarefaction and extrapolation sampling curves up to a maximum coverage (Cmax = the minimum sample coverage among all samples extrapolated to double reference sizes).
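
The definition of Cmax can be sketched in a few lines of base R. The sketch assumes the commonly used extrapolation formula for sample coverage, in which the coverage deficit of the reference sample shrinks geometrically with the number of additional individuals; the package may handle edge cases differently. The function coverage_at and both data vectors are hypothetical.

```r
# Sketch only: estimated coverage after adding m_extra individuals
# to a reference sample x (assumes f1 > 0 or f2 > 0).
coverage_at <- function(x, m_extra = 0) {
  n  <- sum(x)
  f1 <- sum(x == 1)
  f2 <- sum(x == 2)
  a  <- (n - 1) * f1 / ((n - 1) * f1 + 2 * f2)
  1 - (f1 / n) * a^(m_extra + 1)
}

# Two hypothetical assemblages
assemblages <- list(
  Open   = c(50, 30, 10, 5, 2, 2, 1, 1, 1, 1),
  Closed = c(40, 20, 8, 3, 2, 1, 1)
)

# Coverage of each reference sample and of each sample extrapolated to double size
C_ref    <- sapply(assemblages, coverage_at)
C_double <- sapply(assemblages, function(x) coverage_at(x, m_extra = sum(x)))

# Cmax: the minimum coverage among samples extrapolated to double reference size
Cmax <- min(C_double)
```

Comparing diversities at any coverage up to Cmax keeps every assemblage within its own reliable extrapolation range.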

STEP 4. Assessment of evenness profiles

Chao and Ricotta (2019) developed five classes of evenness measures parameterized by an order q ≥ 0, the same order that is used to index sample completeness. All classes of evenness measures are functions of diversity and species richness, and all are standardized to the range of [0, 1] to adjust for the effect of differing species richness. An evenness profile depicts the evenness estimate with respect to order q ≥ 0. Because true species richness typically cannot be accurately estimated, an evenness profile can typically be measured accurately only when both diversity and richness are computed at a fixed level of sample coverage up to the maximum coverage Cmax defined in Step 3. iNEXT.4steps shows, by default, the relevant statistics and plot for only one class of evenness measure (based on the normalized slope of a diversity profile), but all five classes are optionally featured.
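
As a rough illustration of how an evenness measure is built from diversity and richness, the base-R sketch below computes one measure on observed data. It assumes the normalized-slope class takes the form (D - 1) / (S - 1), where D is the Hill number of order q and S is the richness, and it also computes Pielou's J' for comparison. It works on raw observed values, not on values standardized to a coverage level, so it does not reproduce the package's "Estimated" output. All function names and data are hypothetical.

```r
# Sketch only: observed evenness, with no coverage standardization.
hill_number <- function(x, q) {
  p <- x[x > 0] / sum(x)
  if (abs(q - 1) < 1e-8) exp(-sum(p * log(p))) else sum(p^q)^(1 / (1 - q))
}

# Assumed form of the normalized-slope class: (D - 1) / (S - 1); needs S > 1
evenness_slope <- function(x, q) {
  S <- sum(x > 0)
  (hill_number(x, q) - 1) / (S - 1)
}

# Pielou's J': Shannon entropy divided by log richness
pielou_J <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p)) / log(length(p))
}

uneven <- c(50, 30, 10, 5, 2, 2, 1, 1, 1, 1)   # hypothetical, dominated by two species
even   <- rep(10, 5)                            # hypothetical, perfectly even

sapply(c(1, 2), function(q) evenness_slope(uneven, q))
sapply(c(1, 2), function(q) evenness_slope(even, q))
pielou_J(uneven)
```

For the perfectly even vector the measure equals 1 at every order, while the dominated vector gives values well below 1.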

NOTE 1: Sufficient data are required to perform the 4-step analysis. If there are only a few species in users' data, it is likely that data are too sparse to use iNEXT.4steps.

NOTE 2: The analyses in STEP 2 and STEP 3 are mainly based on package iNEXT available from CRAN. Thus, iNEXT.4steps expands iNEXT to include the estimation of sample completeness and evenness.

NOTE 3: As with iNEXT, iNEXT.4steps only deals with taxonomic/species diversity. Researchers who are interested in phylogenetic diversity and functional diversity should use package iNEXT.3D available from CRAN and see the relevant paper (Chao et al. 2021) for methodology.

NOTE 4: iNEXT.4steps aims to compare within-assemblage diversity. If the goal is to assess the extent of differentiation among assemblages or to infer species compositional shift and abundance changes, users should use iNEXT.beta3D available from CRAN and see the relevant paper (Chao et al. 2023) for methodology.

There are five main functions in iNEXT.4steps:

1. iNEXT4steps computes all statistics in the complete 4-step analysis and visualizes the output. It computes sample completeness, observed and asymptotic diversity, size-based and coverage-based standardized diversity, and evenness.

2. Completeness computes sample completeness estimates of order q = 0 to q = 2 in increments of 0.2 (by default). This function is specifically for users who only require sample completeness estimates.

3. ggCompleteness visualizes the output obtained from the function Completeness.

4. Evenness computes standardized (or observed) evenness of order q = 0 to q = 2 in increments of 0.2 (by default) based on five classes of evenness measures. This function is specifically for users who only require evenness estimates.

5. ggEvenness visualizes the output obtained from the function Evenness.

Author(s)

Anne Chao, Kai-Hsiang Hu

Maintainer: Anne Chao <chao@stat.nthu.edu.tw>

References

Chao, A., Gotelli, N. G., Hsieh, T. C., Sander, E. L., Ma, K. H., Colwell, R. K. and Ellison, A. M. (2014). Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species biodiversity studies. Ecological Monographs, 84, 45-67.

Chao, A., Henderson, P. A., Chiu, C.-H., Moyes, F., Hu, K.-H., Dornelas, M and Magurran, A. E. (2021). Measuring temporal change in alpha diversity: a framework integrating taxonomic, phylogenetic and functional diversity and the iNEXT.3D standardization. Methods in Ecology and Evolution, 12, 1926-1940.

Chao, A., Kubota, Y., Zeleny, D., Chiu, C.-H., Li, C.-F., Kusumoto, B., Yasuhara, M., Thorn, S., Wei, C.-L., Costello, M. J. and Colwell, R. K. (2020). Quantifying sample completeness and comparing diversities among assemblages. Ecological Research, 35, 292-314.

Chao, A. and Ricotta, C. (2019). Quantifying evenness and linking it to diversity, beta diversity, and similarity. Ecology, 100(12), e02852.

Chao, A., Thorn, S., Chiu, C.-H., Moyes, F., Hu, K.-H., Chazdon, R. L., Wu, J., Magnago, L. F. S., Dornelas, M., Zeleny, D., Colwell, R. K., and Magurran, A. E. (2023). Rarefaction and extrapolation with beta diversity under a framework of Hill numbers: the iNEXT.beta3D standardization. Ecological Monographs, e1588.

Main function for STEP 1: Assessment of sample completeness

Description

Completeness computes sample completeness estimates of orders q = 0 to 2 in increments of 0.2 (by default).

Usage

Completeness(
  data,
  q = seq(0, 2, 0.2),
  datatype = "abundance",
  nboot = 30,
  conf = 0.95,
  nT = NULL
)

Arguments

data

(a) For datatype = "abundance", data can be input as a vector of species abundances (for a single assemblage), matrix/data.frame (species by assemblages), or a list of species abundance vectors.
(b) For datatype = "incidence_raw", data can be input as a list of matrix/data.frame (species by sampling units); data can also be input as a matrix/data.frame by merging all sampling units across assemblages based on species identity; in this case, the number of sampling units (nT, see below for this argument) must be input.

q

a numerical vector specifying the orders of sample completeness. Default is seq(0, 2, by = 0.2).

datatype

data type of input data: individual-based abundance data (datatype = "abundance") or species by sampling-units incidence matrix (datatype = "incidence_raw") with all entries being 0 (non-detection) or 1 (detection).

nboot

a positive integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Enter 0 to skip the bootstrap procedures. Default is 30.

conf

a positive number < 1 specifying the level of confidence interval. Default is 0.95.

nT

(required only when datatype = "incidence_raw" and input data in a single matrix/data.frame) a vector of positive integers specifying the number of sampling units in each assemblage. If assemblage names are not specified (i.e., names(nT) = NULL), then assemblages are automatically named as "Assemblage1", "Assemblage2",..., etc.
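
The two accepted layouts for incidence raw data can be sketched with small made-up matrices. The sketch uses base R only; the species names, site names and detections are hypothetical, and the call to Completeness is left as a comment so that the block runs without the package installed.

```r
# Sketch only: two ways to arrange incidence raw data (toy values).
species <- paste0("sp", 1:4)

# Species-by-sampling-unit matrices, one per assemblage (1 = detection, 0 = non-detection)
site_A <- matrix(c(1, 0, 1, 1,
                   1, 1, 0, 0,
                   0, 1, 0, 0), nrow = 4,
                 dimnames = list(species, paste0("A", 1:3)))
site_B <- matrix(c(1, 1, 0, 0,
                   0, 1, 0, 1), nrow = 4,
                 dimnames = list(species, paste0("B", 1:2)))

# Layout 1: a list of matrices; nT is not needed
as_list <- list(Site_A = site_A, Site_B = site_B)

# Layout 2: one merged matrix (rows matched by species identity) plus nT,
# the number of sampling units contributed by each assemblage
merged <- cbind(site_A, site_B)
nT     <- c(Site_A = ncol(site_A), Site_B = ncol(site_B))

# With the package installed, either layout could be passed on, for example:
# Completeness(data = as_list, datatype = "incidence_raw")
# Completeness(data = merged,  datatype = "incidence_raw", nT = nT)
```

Naming the elements of nT carries the assemblage names into the output; unnamed entries fall back to the automatic "Assemblage1", "Assemblage2" labels described above.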

Value

a matrix of estimated sample completeness of order q:

Order.q

the order of sample completeness.

Estimate.SC

the estimated sample completeness of order q.

s.e.

standard error of sample completeness estimate.

SC.LCL, SC.UCL

the bootstrap lower and upper confidence limits for the sample completeness of order q at the specified level (with a default value of 0.95).

Assemblage

the assemblage name.
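
To show how nboot and conf relate to the s.e., SC.LCL and SC.UCL columns, the base-R sketch below runs a naive bootstrap for the q = 1 estimate. It is an assumption-laden simplification: it resamples from the observed proportions and uses a normal-approximation interval, whereas the package's own bootstrap procedure may differ (for example, in how undetected species are handled). The function coverage and the data are hypothetical.

```r
# Sketch only: naive multinomial bootstrap, not the package's procedure.
set.seed(123)
x <- c(50, 30, 10, 5, 2, 2, 1, 1, 1, 1)   # hypothetical abundances

# Estimated sample coverage (q = 1); returns 1 when there are no singletons
coverage <- function(x) {
  n  <- sum(x)
  f1 <- sum(x == 1)
  f2 <- sum(x == 2)
  if (f1 == 0) return(1)
  1 - (f1 / n) * ((n - 1) * f1 / ((n - 1) * f1 + 2 * f2))
}

nboot <- 30     # number of bootstrap replications
conf  <- 0.95   # confidence level

est  <- coverage(x)
boot <- replicate(nboot, coverage(as.vector(rmultinom(1, sum(x), x / sum(x)))))

se  <- sd(boot)                       # bootstrap standard error
z   <- qnorm(1 - (1 - conf) / 2)      # normal quantile for the chosen level
LCL <- max(est - z * se, 0)           # limits truncated to [0, 1]
UCL <- min(est + z * se, 1)

round(c(Estimate = est, s.e. = se, LCL = LCL, UCL = UCL), 3)
```

A small nboot such as the default of 30 keeps run time short; larger values give more stable standard errors at the cost of computation.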

Examples


## Sample completeness for abundance data
data(Data_spider)
SC_out1 <- Completeness(data = Data_spider, datatype = "abundance")
SC_out1


## Sample completeness for incidence raw data
data(Data_woody_plant)
SC_out2 <- Completeness(data = Data_woody_plant, datatype = "incidence_raw")
SC_out2

Spider abundance data

Description

These data were sampled in a mountain forest ecosystem in the Bavarian Forest National Park, Germany (Thorn et al. 2016, 2017). A total of 12 experimental plots were established in "closed forest" stands (6 plots) and "open forest" stands with naturally occurring gaps and edges (6 plots) to assess the effects of microclimate on communities of epigeal (ground-dwelling) spiders.
Epigeal spiders were sampled over three years with four pitfall traps in each plot, yielding a total of 3171 individuals belonging to 85 species recorded in the pooled habitat. In the open forest, there were 1760 individuals representing 74 species, whereas in the closed forest, there were 1411 individuals representing 44 species.

Usage

data("Data_spider")

Format

Data_spider is a species-by-assemblages data.frame with 85 species in two sites.
$ Open : int 350 325 237 102 91 72 68 61 53 50 ...
$ Closed: int 10 55 502 1 3 4 171 140 180 24 ...

Source

Thorn, S., Bassler, C., Svoboda, M., & Muller, J. (2017). Effects of natural disturbances and salvage logging on biodiversity - lessons from the Bohemian Forest. Forest Ecology and Management, 388, 113-119. https://doi.org/10.1016/j.foreco.2016.06.006

Thorn, S., Bußler, H., Fritze, M.-A., Goeder, P., Muller, J., Weiß, I., & Seibold, S. (2016). Canopy closure determines arthropod assemblages in microhabitats created by windstorms and salvage logging. Forest Ecology and Management, 381, 188-195. https://doi.org/10.1016/j.foreco.2016.09.029

Examples

  data(Data_spider)

Incidence raw data

Description

This dataset was taken from The National Vegetation Database of Taiwan, sampled between 2003 and 2007 (Chiou et al. 2009). Only data in plots (each 20x20-m in area) belonging to two vegetation types (monsoon forest and upper cloud forest) were used (Li et al. 2013). All woody plant individuals taller than 2 meters were recorded in each plot. In the monsoon forest, 329 species and 6814 incidences were recorded in 191 plots. In the upper cloud forest, 239 species and 3371 incidences were recorded in 153 plots (each plot is regarded as a sampling unit).

Usage

data(Data_woody_plant)

Format

Data_woody_plant is a list of two species-by-sampling-unit data frames. Each entry in a data frame is 1 for a detection and 0 for a non-detection.
A list of 2:
$ Monsoon : num [1:329, 1:191] 0 0 0 0 0 0 0 0 0 0 ...
$ Upper_cloud: num [1:239, 1:153] 0 0 0 0 0 0 0 0 0 0 ...

Source

The National Vegetation Database of Taiwan (AS-TW-001) by Chiou et al. (2009).

References

Chiou, C.-R., Hsieh, C.-F., Wang, J.-C., Chen, M.-Y., Liu, H.-Y., Yeh, C.-L., ... Song, M. G.-Z. (2009). The first national vegetation inventory in Taiwan. Taiwan Journal of Forest Science, 24, 295-302.

Li, C.-F., Chytry, M., Zeleny, D., Chen, M. -Y., Chen, T.-Y., Chiou, C.-R., ... Hsieh, C.-F. (2013). Classification of Taiwan forest vegetation. Applied Vegetation Science, 16, 698-719.
https://doi.org/10.1111/avsc.12025

Examples

  data(Data_woody_plant)

Main function for STEP 4: Assessment of evenness

Description

Evenness computes standardized and observed evenness of order q = 0 to q = 2 in increments of 0.2 (by default) and depicts evenness profiles based on five classes of evenness measures developed in Chao and Ricotta (2019). Note that for q = 0 species abundances are disregarded, so it is not meaningful to evaluate evenness among abundances specifically for q = 0. As q tends to 0, all evenness values tend to 1 as a limiting value.

Usage

Evenness(
  data,
  q = seq(0, 2, 0.2),
  datatype = "abundance",
  method = "Estimated",
  nboot = 30,
  conf = 0.95,
  nT = NULL,
  E.class = 1:5,
  SC = NULL
)

Arguments

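data

(a) For datatype = "abundance", data can be input as a vector of species abundances (for a single assemblage), matrix/data.frame (species by assemblages), or a list of species abundance vectors.
(b) For datatype = "incidence_raw", data can be input as a list of matrix/data.frame (species by sampling units); data can also be input as a matrix/data.frame by merging all sampling units across assemblages based on species identity; in this case, the number of sampling units (nT, see below for this argument) must be input.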

q

a numerical vector specifying the orders of evenness. Default is seq(0, 2, by = 0.2).

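datatype

data type of input data: individual-based abundance data (datatype = "abundance") or species by sampling-units incidence matrix (datatype = "incidence_raw") with all entries being 0 (non-detection) or 1 (detection).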

method

a binary selection of method with "Estimated" (evenness is computed under a standardized coverage value) or "Observed" (evenness is computed for the observed data).

nboot

a positive integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Enter 0 to skip the bootstrap procedures. Default is 30.

conf

a positive number < 1 specifying the level of confidence interval. Default is 0.95.

nT

(required only when datatype = "incidence_raw" and input data is a single matrix/data.frame) a vector of positive integers specifying the number of sampling units in each assemblage. If assemblage names are not specified, then assemblages are automatically named as "Assemblage1", "Assemblage2",..., etc.

E.class

an integer vector with values from 1 to 5 specifying which class(es) of evenness measures are selected; default is 1:5 (select all five classes).

SC

(required only when method = "Estimated") a standardized coverage value for calculating estimated evenness. If SC = NULL, then this function computes the diversity estimates for the minimum sample coverage among all samples extrapolated to double reference sizes (Cmax).

Value

A list of several tables containing estimated (or observed) evenness with order q.
Each table represents a class of evenness.

Order.q

the order of evenness.

Evenness

the computed evenness value of order q.

s.e.

standard error of evenness value.

Even.LCL, Even.UCL

the bootstrap lower and upper confidence limits for the evenness of order q at the specified level (with a default value of 0.95).

Assemblage

the assemblage name.

Method

"Estimated" or "Observed".

SC

the standardized coverage value under which evenness values are computed (only for method = "Estimated").

References

Chao, A. and Ricotta, C. (2019). Quantifying evenness and linking it to diversity, beta diversity, and similarity. Ecology, 100(12), e02852.

Examples

## Evenness for abundance data
# The observed evenness values for abundance data
data(Data_spider)
Even_out1_obs <- Evenness(data = Data_spider, datatype = "abundance", 
                          method = "Observed", E.class = 1:5)
Even_out1_obs


# Estimated evenness for abundance data with default sample coverage value
data(Data_spider)
Even_out1_est <- Evenness(data = Data_spider, datatype = "abundance", 
                          method = "Estimated", SC = NULL, E.class = 1:5)
Even_out1_est


## Evenness for incidence raw data
# The observed evenness values for incidence raw data
data(Data_woody_plant)
Even_out2_obs <- Evenness(data = Data_woody_plant, datatype = "incidence_raw", 
                          method = "Observed", E.class = 1:5)
Even_out2_obs


# Estimated evenness for incidence data with a user-specified coverage value of 0.98
data(Data_woody_plant)
Even_out2_est <- Evenness(data = Data_woody_plant, datatype = "incidence_raw", 
                          method = "Estimated", SC = 0.98, E.class = 1:5)
Even_out2_est

ggplot for depicting sample completeness profiles

Description

ggCompleteness is a ggplot2 extension for a Completeness object that plots sample completeness with order q between 0 and 2.

Usage

ggCompleteness(output)

Arguments

output

output obtained from the function Completeness.

Value

a figure depicting the estimated sample completeness with respect to the order q.

Examples


## Sample completeness profile for abundance data
data(Data_spider)
SC_out1 <- Completeness(data = Data_spider, datatype = "abundance")
ggCompleteness(SC_out1)


## Sample completeness profile for incidence raw data
data(Data_woody_plant)
SC_out2 <- Completeness(data = Data_woody_plant, datatype = "incidence_raw")
ggCompleteness(SC_out2)

ggplot for depicting evenness profiles

Description

ggEvenness is a ggplot2 extension for an Evenness object that plots evenness with order q.

Usage

ggEvenness(output)

Arguments

output

output obtained from the function Evenness.

Value

a figure depicting the estimated (or observed) evenness with respect to order q.

Examples

## Evenness profiles for abundance data
# The observed evenness profile for abundance data
data(Data_spider)
Even_out1_obs <- Evenness(data = Data_spider, datatype = "abundance", 
                    method = "Observed", E.class = 1:5)
ggEvenness(Even_out1_obs)


# The estimated evenness profile for abundance data with default sample coverage value
data(Data_spider)
Even_out1_est <- Evenness(data = Data_spider, datatype = "abundance", 
                          method = "Estimated", SC = NULL, E.class = 1:5)
ggEvenness(Even_out1_est)


## Evenness profiles for incidence raw data
# The observed evenness profile for incidence data
data(Data_woody_plant)
Even_out2_obs <- Evenness(data = Data_woody_plant, datatype = "incidence_raw", 
                    method = "Observed", E.class = 1:5)
ggEvenness(Even_out2_obs)


# The estimated evenness profile for incidence data with a user-specified coverage value of 0.98
data(Data_woody_plant)
Even_out2_est <- Evenness(data = Data_woody_plant, datatype = "incidence_raw", 
                          method = "Estimated", SC = 0.98, E.class = 1:5)
ggEvenness(Even_out2_est)

Main function for complete 4-step analysis

Description

iNEXT4steps computes all statistics in the complete 4-step analysis and visualizes the output. It computes sample completeness, observed and asymptotic diversity, size-based and coverage-based standardized diversity, and evenness.

Usage

iNEXT4steps(
  data,
  q = seq(0, 2, 0.2),
  datatype = "abundance",
  nboot = 30,
  conf = 0.95,
  nT = NULL,
  details = FALSE
)

Arguments

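data

(a) For datatype = "abundance", data can be input as a vector of species abundances (for a single assemblage), matrix/data.frame (species by assemblages), or a list of species abundance vectors.
(b) For datatype = "incidence_raw", data can be input as a list of matrix/data.frame (species by sampling units); data can also be input as a matrix/data.frame by merging all sampling units across assemblages based on species identity; in this case, the number of sampling units (nT, see below for this argument) must be input.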

q

a numerical vector specifying the orders of q that will be used to compute sample completeness and evenness as well as plot the relevant profiles. Default is seq(0, 2, by = 0.2).

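datatype

data type of input data: individual-based abundance data (datatype = "abundance") or species by sampling-units incidence matrix (datatype = "incidence_raw") with all entries being 0 (non-detection) or 1 (detection).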

nboot

a positive integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Enter 0 to skip the bootstrap procedures. Default is 30.

conf

a positive number < 1 specifying the level of confidence interval. Default is 0.95.

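nT

(required only when datatype = "incidence_raw" and input data in a single matrix/data.frame) a vector of positive integers specifying the number of sampling units in each assemblage. If assemblage names are not specified (i.e., names(nT) = NULL), then assemblages are automatically named as "Assemblage1", "Assemblage2",..., etc.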

details

a logical variable to indicate whether the detailed numerical values for each step are displayed. Default is FALSE.

Value

a list of three objects:

$summary Numerical table for each individual step.

Assemblage

the assemblage names.

qTD

'Species richness' represents the taxonomic diversity of order q = 0; 'Shannon diversity' represents the taxonomic diversity of order q = 1; 'Simpson diversity' represents the taxonomic diversity of order q = 2.

TD_obs

the observed taxonomic diversity value of order q.

TD_asy

the estimated asymptotic diversity value of order q.

s.e.

the bootstrap standard error of the estimated asymptotic diversity of order q.

qTD.LCL, qTD.UCL

the bootstrap lower and upper confidence limits for the estimated asymptotic diversity of order q at the specified level in the setting (with a default value of 0.95).

Pielou J'

a widely used evenness measure based on Shannon entropy.

$figure six figures including five individual figures (for STEPS 1, 2a, 2b, 3 and 4 respectively) and a complete set of five plots.

$details (only when details = TRUE). The numerical output for plotting all figures.

Examples


## Complete 4-step analysis for abundance data
data(Data_spider)
Four_Steps_out1 <- iNEXT4steps(data = Data_spider, datatype = "abundance")
Four_Steps_out1


## Complete 4-step analysis for incidence data
data(Data_woody_plant)
Four_Steps_out2 <- iNEXT4steps(data = Data_woody_plant, datatype = "incidence_raw")
Four_Steps_out2