Type: | Package |
Title: | Four-Step Biodiversity Analysis Based on 'iNEXT' |
Version: | 1.0.1 |
Author: | Anne Chao [aut, cre], KaiHsiang Hu [ctb] |
Maintainer: | Anne Chao <chao@stat.nthu.edu.tw> |
URL: | https://sites.google.com/view/chao-lab-website/software/inext-4steps/ |
Description: | Expands 'iNEXT' to include the estimation of sample completeness and evenness. The package provides simple functions to perform the following four-step biodiversity analysis: STEP 1: Assessment of sample completeness profiles. STEP 2a: Analysis of size-based rarefaction and extrapolation sampling curves to determine whether the asymptotic diversity can be accurately estimated. STEP 2b: Comparison of the observed and the estimated asymptotic diversity profiles. STEP 3: Analysis of non-asymptotic coverage-based rarefaction and extrapolation sampling curves. STEP 4: Assessment of evenness profiles. The analyses in STEPs 2a, 2b and STEP 3 are mainly based on the previous 'iNEXT' package. Refer to the 'iNEXT' package for details. This package is mainly focusing on the computation for STEPs 1 and 4. See Chao et al. (2020) <doi:10.1111/1440-1703.12102> for statistical background. |
License: | GPL (≥ 3) |
Depends: | R (≥ 4.0) |
Imports: | ggplot2, reshape2, dplyr, stats, ggpubr, purrr, iNEXT.3D |
Suggests: | testthat, knitr, rmarkdown |
Encoding: | UTF-8 |
BugReports: | https://github.com/KaiHsiangHu/iNEXT.4steps/issues |
LazyData: | true |
RoxygenNote: | 7.2.3 |
VignetteBuilder: | knitr |
ByteCompile: | true |
NeedsCompilation: | no |
Packaged: | 2024-06-17 15:07:38 UTC; stat-pc |
Repository: | CRAN |
Date/Publication: | 2024-06-18 09:10:02 UTC |
Four-step biodiversity analysis based on iNEXT
Description
This package expands iNEXT (Chao et al. 2014) to include the estimation of sample completeness and evenness under a unified framework of Hill numbers.
iNEXT.4steps links sample completeness, diversity estimation, interpolation and extrapolation (iNEXT), and evenness in a fully integrated approach.
An online version of iNEXT.4steps is also available for users without an R background:
https://chao.shinyapps.io/iNEXT_4steps/.
The pertinent background for the four-step methodology is provided in Chao et al. (2020). The four-step procedures are described in the following:
- STEP 1. Assessment of sample completeness profile
Before performing biodiversity analysis, it is important to first quantify the sample completeness of a biological survey. Chao et al. (2020) generalized the conventional sample completeness to a class of measures parametrized by an order q ≥ 0. When q = 0, sample completeness reduces to the conventional measure of completeness, i.e., the ratio of the observed species richness to the true richness (observed plus undetected). When q = 1, the measure reduces to the sample coverage (the proportion of the total number of individuals in the entire assemblage that belong to detected species), a concept originally developed by Alan Turing in his cryptographic analysis during WWII. When q = 2, it represents a generalized sample coverage with each species being proportionally weighted by its squared species abundance (i.e., each individual being proportionally weighted by its species abundance); this measure is thus disproportionately sensitive to highly abundant species. For a general order q ≥ 0 (not necessarily an integer), the sample completeness of order q quantifies the proportion of the assemblage's individuals belonging to detected species, with each individual being proportionally weighted by the (q-1)th power of its abundance. The sample completeness profile depicts the estimate with respect to order q ≥ 0; this profile fully characterizes the sample completeness of a biological survey.
iNEXT.4steps features the estimated sample completeness profile for all orders q ≥ 0 based on the methodology developed in Chao et al. (2020). All estimates are theoretically between 0 and 1. If the estimated sample completeness profile is a horizontal line at the level of unity for all orders q ≥ 0, then the survey is complete, implying there is no undetected diversity. In most applications, the estimated profile increases with order q, revealing the existence of undetected diversity. The sample completeness estimate for q = 0 provides an upper bound for the proportion of observed species; its complement represents a lower bound for the proportion of undetected species. This interpretation arises mainly because data typically do not contain sufficient information to accurately estimate species richness, and only a lower bound of species richness can be well estimated. By contrast, when data are not sparse, the sample completeness for q ≥ 1 can be estimated very accurately. The values for q ≥ 2 are typically very close to unity, signifying that almost all highly abundant species (for abundance data) or highly frequent species (for incidence data) have been detected in the reference sample.
- STEP 2. Analysis of the size-based rarefaction and extrapolation sampling curves, and the asymptotic diversity profile for q between 0 and 2
  - (STEP 2a). For each dataset, first examine the pattern of the size-based rarefaction and extrapolation sampling curve up to double the reference sample size for q = 0, 1 and 2. If the curve stays at a fixed level (this often occurs for the measures of q = 1 and 2), then our asymptotic estimate presented in Step 2b can be used to accurately infer the true diversity of the entire assemblage. Otherwise, our asymptotic diversity estimate represents only a lower bound (this often occurs for the measures of q = 0).
  - (STEP 2b). When the true diversity can be accurately inferred, the extent of undetected diversity within each dataset is obtained by comparing the estimated asymptotic diversity profile and empirical profile; the difference in diversity between any two assemblages can be evaluated and tested for significance.
- STEP 3. Non-asymptotic coverage-based rarefaction and extrapolation analysis for orders q = 0, 1 and 2
When sampling data do not contain sufficient information to accurately infer true diversity, fair comparisons of diversity across multiple assemblages should be made by standardizing the sample coverage (i.e., comparing diversity for a standardized fraction of an assemblage's individuals). This comparison can be done based on seamless integration of coverage-based rarefaction and extrapolation sampling curves up to a maximum coverage (Cmax = the minimum sample coverage among all samples extrapolated to double reference sizes).
- STEP 4. Assessment of evenness profiles
Chao and Ricotta (2019) developed five classes of evenness measures parameterized by an order q ≥ 0, the same order that is used to index sample completeness. All classes of evenness measures are functions of diversity and species richness, and all are standardized to the range [0, 1] to adjust for the effect of differing species richness. The evenness profile depicts the evenness estimate with respect to order q ≥ 0. Because true species richness typically cannot be accurately estimated, the evenness profile can typically be accurately measured only when both diversity and richness are computed at a fixed level of sample coverage up to the maximum coverage Cmax defined in Step 3. iNEXT.4steps shows, by default, the relevant statistics and plot for only one class of evenness measure (based on the normalized slope of a diversity profile), but all five classes are optionally featured.
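The maximum coverage Cmax defined in STEP 3 can be sketched in base R as follows. This is only an illustration under stated assumptions: the two abundance vectors are hypothetical, the helper name coverage_at is invented here, and the coverage formula is the singleton/doubleton-based estimator of Chao and Jost (2012) for abundance data, not code taken from iNEXT.4steps. The sketch does not apply any special adjustment for samples that contain singletons but no doubletons.

```r
# Estimated sample coverage for a sample enlarged from n to n + m individuals,
# based on the counts of singletons (f1) and doubletons (f2).
# m = 0 gives the estimated coverage of the reference sample itself.
coverage_at <- function(x, m = 0) {
  x  <- x[x > 0]
  n  <- sum(x)
  f1 <- sum(x == 1)
  f2 <- sum(x == 2)
  if (f1 == 0) return(1)  # no singletons: estimated coverage is 1
  ratio <- ((n - 1) * f1) / ((n - 1) * f1 + 2 * f2)
  1 - (f1 / n) * ratio^(m + 1)
}

# Hypothetical abundance vectors for two assemblages (not package data)
samples <- list(
  A = c(10, 6, 3, 2, 2, 1, 1, 1),
  B = c(25, 12, 7, 4, 2, 2, 2, 1, 1)
)

# Coverage of each reference sample and of each sample extrapolated to double size
C_ref    <- sapply(samples, coverage_at)
C_double <- sapply(samples, function(x) coverage_at(x, m = sum(x)))

# Cmax: the minimum coverage among all samples extrapolated to double size
Cmax <- min(C_double)

C_ref
C_double
Cmax
```

Standardizing every assemblage to a coverage no larger than Cmax keeps each comparison within the range over which extrapolation to double the reference sample size is considered reliable.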
NOTE 1: Sufficient data are required to perform the 4-step analysis. If there are only a few species in users' data, it is likely that data are too sparse to use iNEXT.4steps.
NOTE 2: The analyses in STEP 2 and STEP 3 are mainly based on package iNEXT available from CRAN. Thus, iNEXT.4steps expands iNEXT to include the estimation of sample completeness and evenness.
NOTE 3: As with iNEXT, iNEXT.4steps only deals with taxonomic/species diversity. Researchers who are interested in phylogenetic diversity and functional diversity should use package iNEXT.3D available from CRAN and see the relevant paper (Chao et al. 2021) for methodology.
NOTE 4: iNEXT.4steps aims to compare within-assemblage diversity. If the goal is to assess the extent of differentiation among assemblages or to infer species compositional shift and abundance changes, users should use iNEXT.beta3D available from CRAN and see the relevant paper (Chao et al. 2023) for methodology.
There are five main functions in iNEXT.4steps:
1. iNEXT4steps computes all statistics in the complete 4-step analysis and visualizes the output. It computes sample completeness, observed and asymptotic diversity, size-based and coverage-based standardized diversity, and evenness.
2. Completeness computes sample completeness estimates of order q = 0 to q = 2 in increments of 0.2 (by default). This function is specifically for users who only require sample completeness estimates.
3. ggCompleteness visualizes the output obtained from the function Completeness.
4. Evenness computes standardized (or observed) evenness of order q = 0 to q = 2 in increments of 0.2 (by default) based on five classes of evenness measures. This function is specifically for users who only require evenness estimates.
5. ggEvenness visualizes the output obtained from the function Evenness.
Author(s)
Anne Chao, Kai-Hsiang Hu
Maintainer: Anne Chao <chao@stat.nthu.edu.tw>
References
Chao, A., Gotelli, N. J., Hsieh, T. C., Sander, E. L., Ma, K. H., Colwell, R. K. and Ellison, A. M. (2014). Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species biodiversity studies. Ecological Monographs, 84, 45-67.
Chao, A., Henderson, P. A., Chiu, C.-H., Moyes, F., Hu, K.-H., Dornelas, M. and Magurran, A. E. (2021). Measuring temporal change in alpha diversity: a framework integrating taxonomic, phylogenetic and functional diversity and the iNEXT.3D standardization. Methods in Ecology and Evolution, 12, 1926-1940.
Chao, A., Kubota, Y., Zeleny, D., Chiu, C.-H., Li, C.-F., Kusumoto, B., Yasuhara, M., Thorn, S., Wei, C.-L., Costello, M. J. and Colwell, R. K. (2020). Quantifying sample completeness and comparing diversities among assemblages. Ecological Research, 35, 292-314.
Chao, A. and Ricotta, C. (2019). Quantifying evenness and linking it to diversity, beta diversity, and similarity. Ecology, 100(12), e02852.
Chao, A., Thorn, S., Chiu, C.-H., Moyes, F., Hu, K.-H., Chazdon, R. L., Wu, J., Magnago, L. F. S., Dornelas, M., Zeleny, D., Colwell, R. K., and Magurran, A. E. (2023). Rarefaction and extrapolation with beta diversity under a framework of Hill numbers: the iNEXT.beta3D standardization. Ecological Monographs, e1588.
Main function for STEP 1: Assessment of sample completeness
Description
Completeness computes sample completeness estimates of orders q = 0 to 2 in increments of 0.2 (by default).
Usage
Completeness(
data,
q = seq(0, 2, 0.2),
datatype = "abundance",
nboot = 30,
conf = 0.95,
nT = NULL
)
Arguments
data |
(a) For |
q |
a numerical vector specifying the orders of sample completeness. Default is |
datatype |
data type of input data: individual-based abundance data ( |
nboot |
a positive integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Enter 0 to skip the bootstrap procedures. Default is 30. |
conf |
a positive number < 1 specifying the level of confidence interval. Default is 0.95. |
nT |
(required only when |
Value
a matrix of estimated sample completeness of order q:
Order.q |
the order of sample completeness. |
Estimate.SC |
the estimated sample completeness of order q. |
s.e. |
standard error of sample completeness estimate. |
SC.LCL , SC.UCL |
the bootstrap lower and upper confidence limits for the sample completeness of order q at the specified level (with a default value of |
Assemblage |
the assemblage name. |
Examples
## Sample completeness for abundance data
data(Data_spider)
SC_out1 <- Completeness(data = Data_spider, datatype = "abundance")
SC_out1
## Sample completeness for incidence raw data
data(Data_woody_plant)
SC_out2 <- Completeness(data = Data_woody_plant, datatype = "incidence_raw")
SC_out2
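For intuition about the quantity that Completeness estimates, the order-q sample completeness can be evaluated directly when the true relative abundances are known. The base R sketch below is not the estimator used by the package: the relative abundances, the detection pattern, and the helper name completeness_q are all hypothetical, and in practice the undetected species are unknown and must be estimated.

```r
# Hypothetical true relative abundances of a 6-species assemblage
p <- c(0.40, 0.30, 0.15, 0.10, 0.04, 0.01)
# Hypothetical detection status: the two rarest species were missed by the sample
detected <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)

# Sample completeness of order q: the share of sum(p^q) contributed by detected
# species, i.e. each individual is weighted by the (q-1)th power of its
# species' relative abundance
completeness_q <- function(p, detected, q) {
  sum(p[detected]^q) / sum(p^q)
}

q_grid  <- seq(0, 2, 0.2)
profile <- sapply(q_grid, function(q) completeness_q(p, detected, q))

# q = 0 gives the fraction of species detected (4 of 6);
# q = 1 gives the sample coverage (0.95 for these hypothetical values)
round(data.frame(Order.q = q_grid, SC = profile), 4)
```

Because the undetected species in this toy assemblage are the rarest ones, the profile rises with q, which matches the pattern described for most applications.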
Spider abundance data
Description
These data were sampled in a mountain forest ecosystem in the Bavarian Forest National Park, Germany (Thorn et al. 2016, 2017).
A total of 12 experimental plots were established in "closed forest" stands (6 plots) and "open forest" stands with naturally occurring gaps and edges (6 plots) to assess the effects of microclimate on communities of epigeal (ground-dwelling) spiders.
Epigeal spiders were sampled over three years with four pitfall traps in each plot, yielding a total of 3171 individuals belonging to 85 species recorded in the pooled habitat. In the open forest, there were 1760 individuals representing 74 species, whereas in the closed forest, there were 1411 individuals representing 44 species.
Usage
data("Data_spider")
Format
Data_spider is a species-by-assemblage data.frame with 85 species in two sites.
$ Open : int 350 325 237 102 91 72 68 61 53 50 ...
$ Closed: int 10 55 502 1 3 4 171 140 180 24 ...
Source
Thorn, S., Bässler, C., Svoboda, M., & Müller, J. (2017). Effects of natural disturbances and salvage logging on biodiversity - lessons from the Bohemian Forest. Forest Ecology and Management, 388, 113-119. https://doi.org/10.1016/j.foreco.2016.06.006
Thorn, S., Bußler, H., Fritze, M.-A., Goeder, P., Müller, J., Weiß, I., & Seibold, S. (2016). Canopy closure determines arthropod assemblages in microhabitats created by windstorms and salvage logging. Forest Ecology and Management, 381, 188-195. https://doi.org/10.1016/j.foreco.2016.09.029
Examples
data(Data_spider)
Incidence raw data
Description
This dataset was taken from the National Vegetation Database of Taiwan, sampled between 2003 and 2007 (Chiou et al. 2009). Only data in plots (each 20 × 20 m in area) belonging to two vegetation types (monsoon forest and upper cloud forest) were used (Li et al. 2013). All woody plant individuals taller than 2 meters were recorded in each plot. In the monsoon forest, 329 species and 6814 incidences were recorded in 191 plots. In the upper cloud forest, 239 species and 3371 incidences were recorded in 153 plots (each plot is regarded as a sampling unit).
Usage
data(Data_woody_plant)
Format
Data_woody_plant is a list of two species-by-sampling-unit data frames. Each element in a data frame is 1 for a detection and 0 for a non-detection.
A list of 2:
$ Monsoon : num [1:329, 1:191] 0 0 0 0 0 0 0 0 0 0 ...
$ Upper_cloud: num [1:239, 1:153] 0 0 0 0 0 0 0 0 0 0 ...
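A dataset with the same layout can be assembled by hand. The base R sketch below builds a hypothetical list of two species-by-sampling-unit 0/1 matrices; the assemblage names, species and plot labels, detection probabilities, and the helper make_incidence are invented for illustration and are not part of the package.

```r
set.seed(1)  # reproducible hypothetical detections

# Build one species-by-sampling-unit 0/1 matrix (rows = species, columns = plots)
make_incidence <- function(n_species, n_units, prob) {
  m <- matrix(rbinom(n_species * n_units, size = 1, prob = prob),
              nrow = n_species, ncol = n_units)
  rownames(m) <- paste0("sp", seq_len(n_species))
  colnames(m) <- paste0("plot", seq_len(n_units))
  m
}

# A list of two assemblages, mirroring the structure of Data_woody_plant
toy_incidence <- list(
  Forest_A = make_incidence(n_species = 8, n_units = 5, prob = 0.4),
  Forest_B = make_incidence(n_species = 6, n_units = 4, prob = 0.6)
)

str(toy_incidence)

# Incidence frequency of each species: the number of sampling units
# in which it was detected
lapply(toy_incidence, rowSums)
```

Each list element is one assemblage, and assemblages may differ in both the number of species and the number of sampling units, as the two forest types in Data_woody_plant do.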
Source
The National Vegetation Database of Taiwan (AS-TW-001) by Chiou et al. (2009).
References
Chiou, C.-R., Hsieh, C.-F., Wang, J.-C., Chen, M.-Y., Liu, H.-Y., Yeh, C.-L., ... Song, M. G.-Z. (2009). The first national vegetation inventory in Taiwan. Taiwan Journal of Forest Science, 24, 295-302.
Li, C.-F., Chytry, M., Zeleny, D., Chen, M.-Y., Chen, T.-Y., Chiou, C.-R., ... Hsieh, C.-F. (2013). Classification of Taiwan forest vegetation. Applied Vegetation Science, 16, 698-719.
https://doi.org/10.1111/avsc.12025
Examples
data(Data_woody_plant)
Main function for STEP 4: Assessment of evenness
Description
Evenness computes standardized and observed evenness of order q = 0 to q = 2 in increments of 0.2 (by default) and depicts evenness profiles based on five classes of evenness measures developed in Chao and Ricotta (2019). Note that for q = 0 species abundances are disregarded, so it is not meaningful to evaluate evenness among abundances specifically for q = 0. As q tends to 0, all evenness values tend to 1 as a limiting value.
Usage
Evenness(
data,
q = seq(0, 2, 0.2),
datatype = "abundance",
method = "Estimated",
nboot = 30,
conf = 0.95,
nT = NULL,
E.class = 1:5,
SC = NULL
)
Arguments
data |
(a) For |
q |
a numerical vector specifying the orders of evenness. Default is |
datatype |
data type of input data: individual-based abundance data ( |
method |
a binary selection of method with |
nboot |
a positive integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Enter 0 to skip the bootstrap procedures. Default is |
conf |
a positive number < |
nT |
(required only when |
E.class |
an integer vector with values between 1 and 5 specifying which class(es) of evenness measures are selected; default is 1:5 (select all five classes). |
SC |
(required only when |
Value
A list of several tables containing estimated (or observed) evenness with order q.
Each table represents a class of evenness.
Order.q |
the order of evenness |
Evenness |
the computed evenness value of order q. |
s.e. |
standard error of evenness value. |
Even.LCL , Even.UCL |
the bootstrap lower and upper confidence limits for the evenness of order q at the specified level (with a default value of |
Assemblage |
the assemblage name. |
Method |
|
SC |
the standardized coverage value under which evenness values are computed (only for |
References
Chao, A. and Ricotta, C. (2019). Quantifying evenness and linking it to diversity, beta diversity, and similarity. Ecology, 100(12), e02852.
Examples
## Evenness for abundance data
# The observed evenness values for abundance data
data(Data_spider)
Even_out1_obs <- Evenness(data = Data_spider, datatype = "abundance",
method = "Observed", E.class = 1:5)
Even_out1_obs
# Estimated evenness for abundance data with default sample coverage value
data(Data_spider)
Even_out1_est <- Evenness(data = Data_spider, datatype = "abundance",
method = "Estimated", SC = NULL, E.class = 1:5)
Even_out1_est
## Evenness for incidence raw data
# The observed evenness values for incidence raw data
data(Data_woody_plant)
Even_out2_obs <- Evenness(data = Data_woody_plant, datatype = "incidence_raw",
method = "Observed", E.class = 1:5)
Even_out2_obs
# Estimated evenness for incidence data with user's specified coverage value of 0.98
data(Data_woody_plant)
Even_out2_est <- Evenness(data = Data_woody_plant, datatype = "incidence_raw",
method = "Estimated", SC = 0.98, E.class = 1:5)
Even_out2_est
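To make the link between evenness, diversity and richness concrete, the base R sketch below evaluates one evenness form, (D - 1)/(S - 1), where D is the Hill number of order q and S is the species richness. It is a minimal illustration under stated assumptions: the abundance vectors and the helper names hill_number and evenness_q are hypothetical, the values are computed from observed proportions without coverage standardization or bootstrapping, and the sketch is not the package's implementation of any particular class.

```r
# Hill number (diversity) of order q for a vector of abundances
hill_number <- function(x, q) {
  p <- x[x > 0] / sum(x)
  if (abs(q - 1) < 1e-8) {
    exp(-sum(p * log(p)))        # limit as q -> 1: exponential of Shannon entropy
  } else {
    sum(p^q)^(1 / (1 - q))
  }
}

# One evenness form: (D_q - 1) / (S - 1), which lies in [0, 1]
evenness_q <- function(x, q) {
  S <- sum(x > 0)
  (hill_number(x, q) - 1) / (S - 1)
}

q_grid <- seq(0, 2, 0.2)
uneven <- c(50, 20, 10, 5, 3, 1, 1)  # hypothetical abundances
even   <- rep(10, 7)                 # hypothetical perfectly even assemblage

data.frame(
  Order.q = q_grid,
  Uneven  = sapply(q_grid, function(q) evenness_q(uneven, q)),
  Even    = sapply(q_grid, function(q) evenness_q(even, q))
)
```

At q = 0 the Hill number equals the richness, so this evenness is 1 for any assemblage, which is why evenness is not informative at q = 0; a perfectly even assemblage stays at 1 for every order.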
ggplot for depicting sample completeness profiles
Description
ggCompleteness is a ggplot2 extension for Completeness objects that plots sample completeness with order q between 0 and 2.
Usage
ggCompleteness(output)
Arguments
output |
output obtained from the function |
Value
a figure depicting the estimated sample completeness with respect to the order q.
Examples
## Sample completeness profile for abundance data
data(Data_spider)
SC_out1 <- Completeness(data = Data_spider, datatype = "abundance")
ggCompleteness(SC_out1)
## Sample completeness profile for incidence raw data
data(Data_woody_plant)
SC_out2 <- Completeness(data = Data_woody_plant, datatype = "incidence_raw")
ggCompleteness(SC_out2)
ggplot for depicting evenness profiles
Description
ggEvenness is a ggplot2 extension for Evenness objects that plots evenness with order q.
Usage
ggEvenness(output)
Arguments
output |
output obtained from the function |
Value
a figure depicting the estimated (or observed) evenness with respect to order q.
Examples
## Evenness profiles for abundance data
# The observed evenness profile for abundance data
data(Data_spider)
Even_out1_obs <- Evenness(data = Data_spider, datatype = "abundance",
method = "Observed", E.class = 1:5)
ggEvenness(Even_out1_obs)
# The estimated evenness profile for abundance data with default sample coverage value
data(Data_spider)
Even_out1_est <- Evenness(data = Data_spider, datatype = "abundance",
method = "Estimated", SC = NULL, E.class = 1:5)
ggEvenness(Even_out1_est)
## Evenness profiles for incidence raw data
# The observed evenness profile for incidence data
data(Data_woody_plant)
Even_out2_obs <- Evenness(data = Data_woody_plant, datatype = "incidence_raw",
method = "Observed", E.class = 1:5)
ggEvenness(Even_out2_obs)
# The estimated evenness profile for incidence data with user's specified coverage value of 0.98
data(Data_woody_plant)
Even_out2_est <- Evenness(data = Data_woody_plant, datatype = "incidence_raw",
method = "Estimated", SC = 0.98, E.class = 1:5)
ggEvenness(Even_out2_est)
Main function for complete 4-step analysis
Description
iNEXT4steps computes all statistics in the complete 4-step analysis and visualizes the output. It computes sample completeness, observed and asymptotic diversity, size-based and coverage-based standardized diversity, and evenness.
Usage
iNEXT4steps(
data,
q = seq(0, 2, 0.2),
datatype = "abundance",
nboot = 30,
conf = 0.95,
nT = NULL,
details = FALSE
)
Arguments
data |
(a) For |
q |
a numerical vector specifying the orders of q that will be used to compute sample completeness and evenness as well as plot the relevant profiles. Default is |
datatype |
data type of input data: individual-based abundance data ( |
nboot |
a positive integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Enter 0 to skip the bootstrap procedures. Default is 30. |
conf |
a positive number < 1 specifying the level of confidence interval. Default is 0.95. |
nT |
(required only when |
details |
a logical variable to indicate whether the detailed numerical values for each step are displayed. Default is |
Value
a list of three objects:
$summary
Numerical table for each individual step.
Assemblage |
the assemblage names. |
qTD |
'Species richness' represents the taxonomic diversity of order q=0; 'Shannon diversity' represents the taxonomic diversity of order q=1, 'Simpson diversity' represents the taxonomic diversity of order q=2. |
TD_obs |
the observed taxonomic diversity value of order q. |
TD_asy |
the estimated asymptotic diversity value of order q. |
s.e. |
the bootstrap standard error of the estimated asymptotic diversity of order q. |
qTD.LCL , qTD.UCL |
the bootstrap lower and upper confidence limits for the estimated asymptotic diversity of order q at the specified level in the setting (with a default value of 0.95). |
Pielou J' |
a widely used evenness measure based on Shannon entropy. |
$figure
six figures including five individual figures (for STEPS 1, 2a, 2b, 3 and 4 respectively) and a complete set of five plots.
$details
(only when details = TRUE). The numerical output for plotting all figures.
Examples
## Complete 4-step analysis for abundance data
data(Data_spider)
Four_Steps_out1 <- iNEXT4steps(data = Data_spider, datatype = "abundance")
Four_Steps_out1
## Complete 4-step analysis for incidence data
data(Data_woody_plant)
Four_Steps_out2 <- iNEXT4steps(data = Data_woody_plant, datatype = "incidence_raw")
Four_Steps_out2
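The observed diversities and Pielou's J' reported in $summary can be illustrated by hand for a single assemblage. The base R sketch below uses a hypothetical abundance vector and computes only observed (empirical) quantities; it does not reproduce the asymptotic estimates or bootstrap intervals, and it assumes Pielou's J' in its usual form, Shannon entropy divided by the logarithm of the observed richness, which may differ in detail from how the package evaluates it.

```r
x <- c(50, 20, 10, 5, 3, 1, 1)  # hypothetical abundances for one assemblage
p <- x / sum(x)                 # observed relative abundances
S_obs <- length(x)              # observed species richness

# Observed taxonomic diversity (Hill numbers) of orders q = 0, 1, 2
shannon_entropy <- -sum(p * log(p))
TD_obs <- c(
  "Species richness"  = S_obs,
  "Shannon diversity" = exp(shannon_entropy),
  "Simpson diversity" = 1 / sum(p^2)
)

# Pielou's J': Shannon entropy divided by its maximum possible value, log(S)
pielou_J <- shannon_entropy / log(S_obs)

TD_obs
pielou_J
```

The three diversities never increase with q, since higher orders give more weight to abundant species.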