usearch manual
Alpha diversity metrics

See also
  alpha_div command
  beta_div command

Alpha diversity metrics are calculated using the alpha_div command. It is more accurate to say that alpha_div calculates single-sample metrics because several of the metrics are not diversity metrics.

Some metrics consider only presence / absence of an OTU, e.g. richness, but most are based on OTU frequencies. Interpreting frequencies is difficult because amplification bias causes the number of reads to correlate poorly with the number of cells. For example, the OTU with the highest frequency in the reads is often not the most abundant species. Because of cross-talk, even presence / absence of a given OTU in a given sample cannot be reliably established when the OTU has low abundance. Because of these issues, it is difficult to interpret diversity metrics from traditional numerical ecology when they are applied to next-generation marker gene sequencing.

Chao-1 attempts to estimate the total number of OTUs in the community, including those that were not observed. In my opinion, estimators have little value in amplicon sequencing experiments because low-abundance OTUs are often spurious, which makes reliable extrapolation impossible.

Confusingly, some metrics use different units and so cannot be compared with each other. For example, the popular Shannon index is a measure of entropy where the unit is bits of information if the logarithms are base 2, but people sometimes use natural logarithms (base e) or base 10. None of these variants of the Shannon index has an obvious connection to the number of OTUs, and people often do not say which variant they used, so the numerical values are difficult to interpret. Metrics using unfamiliar units can be interpreted by converting to an effective number of OTUs. The effective number of OTUs for the Shannon index is the Jost index of order 1.
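The conversion from a Shannon index to an effective number of OTUs can be sketched in a few lines of Python. This is an illustration only, not the usearch implementation: the function names and the example counts are hypothetical, and it assumes the usual definition in which the effective number is the logarithm base raised to the power of the entropy.

```python
import math

def shannon(counts, base=math.e):
    """Shannon entropy of OTU read counts, using logarithms of the given base."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    return -sum(f * math.log(f, base) for f in freqs)

def effective_otus(entropy, base=math.e):
    """Convert a Shannon index to an effective number of OTUs.

    Assumption: effective number = base ** entropy, where base is the
    logarithm base that was used to calculate the entropy.
    """
    return base ** entropy

# Hypothetical sample: four OTUs with equal read counts.
counts = [25, 25, 25, 25]

for name, base in [("bits", 2), ("nats", math.e), ("dits", 10)]:
    h = shannon(counts, base)
    print(f"{name}: H = {h:.3f}, effective OTUs = {effective_otus(h, base):.3f}")
```

The three entropy values differ because the units differ, but each converts to the same effective number of OTUs, which is why the effective number is easier to compare between studies.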

Diversity metrics
Name Units Description
richness OTUs Number of OTUs with at least one read for the sample.
 
chao1 OTUs Chao-1 estimator, calculated as N + S^2 / (2 D) where N is the number of OTUs, S is the number of singleton OTUs, i.e. OTUs with abundance 1, and D is the number of doublet OTUs, i.e. OTUs with abundance 2.
 
shannon_2 bits Shannon index (logs to base 2).
 
shannon_e nats Shannon index (logs to base e).
 
shannon_10 dits Shannon index (logs to base 10).
 
jost OTUs Jost index of order q where q is specified by the -jostq command-line option, default 1.5.
 
jost1 OTUs Jost index of order 1, the effective number of species given by the Shannon index.
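
As a rough illustration of the metrics in this table, the sketch below calculates richness, Chao-1 and the Jost index from a list of read counts for one sample. It is not the usearch code. It assumes the standard Chao-1 form N + S^2 / (2 D), the standard definition of the Jost index of order q as (sum of f^q)^(1 / (1 - q)) with the Shannon-based limit at q = 1, and a fallback to observed richness when there are no doublets; the function names and example counts are hypothetical.

```python
import math

def richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)

def chao1(counts):
    """Chao-1 estimator, N + S^2 / (2 D)."""
    n = richness(counts)
    s = sum(1 for c in counts if c == 1)  # singleton OTUs
    d = sum(1 for c in counts if c == 2)  # doublet OTUs
    if d == 0:
        # Assumption: with no doublets the formula is undefined,
        # so this sketch falls back to the observed richness.
        return float(n)
    return n + s * s / (2 * d)

def jost(counts, q=1.5):
    """Jost index of order q (effective number of OTUs)."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit as q approaches 1: exponential of the Shannon entropy in nats.
        return math.exp(-sum(f * math.log(f) for f in freqs))
    return sum(f ** q for f in freqs) ** (1 / (1 - q))

# Hypothetical sample: 7 observed OTUs, 3 singletons, 2 doublets.
counts = [10, 5, 2, 2, 1, 1, 1, 0]
print("richness:", richness(counts))
print("chao1:", chao1(counts))
print("jost (q=1.5):", jost(counts))
print("jost1:", jost(counts, q=1))
```

For a sample where all OTUs have equal abundance, the Jost index equals the richness for every order q; skewed abundances give a smaller effective number.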
 

Evenness metrics
Name Units Description
simpson Probability Simpson index, calculated as the sum over OTUs of f^2 where f is the frequency of the OTU. It is the probability that two randomly selected reads will belong to the same OTU. A value close to 1 indicates that a single large OTU dominates the sample; small values indicate that the reads are distributed over many OTUs.
 
dominance Probability Probability that two randomly selected reads will belong to different OTUs. Calculated as 1 – simpson.
 
equitability ? Entropy (Shannon index) divided by the logarithm of the number of OTUs. A value of 1 indicates a perfectly even sample (equal abundances); small values indicate a highly skewed abundance distribution.
 
robbins Frequency Robbins index, calculated as S / (N + 1) where S is the number of singleton OTUs and N is the total number of OTUs.
 
berger_parker Frequency Berger-Parker index. Frequency of the most abundant OTU. A value close to 1 indicates that a single large OTU dominates the sample; small values indicate that the reads are distributed over many OTUs.
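
The evenness metrics above can be sketched in the same way, following the definitions given in the table. Again, this is an illustration under stated assumptions rather than the usearch code: the function names and example counts are hypothetical, "number of OTUs" is taken to mean OTUs with at least one read, and equitability is calculated with the same logarithm base for the entropy and for the number of OTUs.

```python
import math

def _freqs(counts):
    """Frequencies of the OTUs that have at least one read."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]

def simpson(counts):
    """Sum over OTUs of f^2: chance that two random reads share an OTU."""
    return sum(f * f for f in _freqs(counts))

def dominance(counts):
    """1 - simpson, as defined in the table above."""
    return 1.0 - simpson(counts)

def equitability(counts):
    """Shannon entropy divided by the logarithm of the number of OTUs."""
    freqs = _freqs(counts)
    if len(freqs) < 2:
        # Assumption: log(1) = 0 makes the ratio undefined for a single OTU.
        return float("nan")
    entropy = -sum(f * math.log(f) for f in freqs)
    return entropy / math.log(len(freqs))

def robbins(counts):
    """S / (N + 1): S singleton OTUs, N total OTUs."""
    n = sum(1 for c in counts if c > 0)
    s = sum(1 for c in counts if c == 1)
    return s / (n + 1)

def berger_parker(counts):
    """Frequency of the most abundant OTU."""
    return max(_freqs(counts))

# Hypothetical sample: one large OTU and three small ones.
counts = [6, 2, 1, 1]
for metric in (simpson, dominance, equitability, robbins, berger_parker):
    print(f"{metric.__name__}: {metric(counts):.3f}")
```

When the same base is used for both logarithms, the base cancels in the equitability ratio, so the value does not depend on whether bits, nats or dits are used.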
 

Other
Name Units Description
reads Reads Total number of reads for the sample.