The fasta_rarify command computes a
rarefaction curve from the size annotations in
a FASTA file. This is a fast, approximate method for generating an abundance
curve for OTUs which is especially useful when
singletons are discarded, as recommended in the
UPARSE pipeline. See
abundance rarefaction for the correct, but more computationally expensive, method.
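
As a rough illustration of what "size annotations" means here, the sketch below pulls the size=N field out of FASTA labels such as ">Otu1;size=120;". The function name read_sizes and the choice to treat a label without an annotation as abundance 1 are assumptions made for this example, not documented behavior of the command.

```python
import re

# Matches a size annotation such as "size=42" delimited by semicolons in a FASTA label.
SIZE_RE = re.compile(r"(?:^|;)size=(\d+)(?:;|$)")

def read_sizes(lines):
    """Return the size= value of each FASTA record, in file order.

    Assumption for this sketch: a label with no size annotation counts as 1.
    """
    sizes = []
    for line in lines:
        if line.startswith(">"):
            match = SIZE_RE.search(line[1:].strip())
            sizes.append(int(match.group(1)) if match else 1)
    return sizes

example = [">Otu1;size=120;", "ACGT", ">Otu2;size=3;", "GGCC", ">Otu3", "TTAA"]
print(read_sizes(example))  # [120, 3, 1]
```
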
The -mingroupsize option specifies the minimum abundance
of a unique sequence to be counted as an observation. The default of 1 means
that all sequences are counted, so the results should be close to those
predicted by the "standard" rarefaction formula. Set
-mingroupsize to 2 to discard singletons before counting the number of uniques.
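
For comparison, here is a minimal sketch of the classical analytic rarefaction formula, which gives the expected number of distinct sequences in a random subsample of n reads. It is an assumption that this is the formula the text calls "standard"; the function name expected_uniques is made up for the example. The formula counts every sequence seen at least once, so it corresponds to a minimum group size of 1.

```python
from math import comb

def expected_uniques(sizes, n):
    """Expected number of distinct sequences among n reads drawn without replacement.

    sizes holds the abundance of each unique sequence.
    """
    total = sum(sizes)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total number of reads")
    denom = comb(total, n)
    # A sequence with abundance s is missed entirely with probability
    # C(total - s, n) / C(total, n), so it is observed with 1 minus that.
    return sum(1 - comb(total - s, n) / denom for s in sizes)

# Three uniques with abundances 2, 1 and 1; draw 2 of the 4 reads.
print(round(expected_uniques([2, 1, 1], 2), 4))  # 1.8333
```
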
The -iters option specifies the number of iterations to run for each subset size
(0%, 1%, 2%, ..., 100% of the unique reads in the input file). The default is 32.
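
The procedure can be pictured with the following sketch, which is not the usearch implementation. It assumes that each subset is a random sample of reads drawn without replacement, that subset sizes are percentages of the total read count, and that a sequence is counted when it appears at least mingroupsize times in the subset. The function name rarefy and the seed parameter are hypothetical.

```python
import random
from collections import Counter

def rarefy(sizes, mingroupsize=1, iters=32, seed=0):
    """Return (percent, subset_size, mean_uniques) rows for 0%..100% subsets."""
    rng = random.Random(seed)
    # Expand abundances into one entry per read, labelled by sequence index.
    reads = [i for i, s in enumerate(sizes) for _ in range(s)]
    rows = []
    for pct in range(101):
        n = len(reads) * pct // 100
        total = 0
        for _ in range(iters):
            # Count how often each sequence appears in a random subset of n reads.
            counts = Counter(rng.sample(reads, n))
            total += sum(1 for c in counts.values() if c >= mingroupsize)
        rows.append((pct, n, total / iters))
    return rows

rows = rarefy([5, 3, 1, 1], mingroupsize=2, iters=100)
print(rows[100])  # (100, 10, 2.0): only the two non-singletons are counted
```

Averaging over more iterations smooths the curve at the cost of run time, which is the trade-off the -iters option controls.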
Output is written as tab-separated text to the file named by the
-output option. Fields are:
1. Percentage of sequences for subset.
2. Size of subset (total number of sequences).
3. Average number of unique sequences.
The output file can readily be imported into a spreadsheet
or other software that can generate graphs.
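
If a spreadsheet is not convenient, the three fields can also be read with a few lines of Python. This sketch assumes the file has no header line and that the subset size is written as a plain integer; neither detail is stated above, and the function name read_rarefaction is made up for the example.

```python
import csv

def read_rarefaction(lines):
    """Parse tab-separated rows of (percent, subset size, mean uniques)."""
    rows = []
    for fields in csv.reader(lines, delimiter="\t"):
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        rows.append((float(fields[0]), int(fields[1]), float(fields[2])))
    return rows

sample = ["0\t0\t0.0", "50\t5\t3.2", "100\t10\t4.0"]
for pct, size, uniques in read_rarefaction(sample):
    print(pct, size, uniques)
```

An open file handle for the output file can be passed in place of the sample list, since csv.reader accepts any iterable of lines.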
usearch -fasta_rarify otus.fa -mingroupsize 2 -iters 100 -output rare.txt