See also: Defining and interpreting OTUs
The otutab command generates an OTU table by mapping reads to OTUs.
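As a rough illustration of what "generating an OTU table" means, the sketch below tabulates read counts per OTU and sample once each read has been assigned. It is not the otutab implementation; the function name `build_otu_table` and the example data are hypothetical.

```python
from collections import Counter

def build_otu_table(assignments):
    """Count reads per (sample, OTU) pair.

    assignments: iterable of (sample_id, otu_id) tuples, one per mapped read.
    Returns (otu_ids, sample_ids, rows), where rows[i][j] is the number of
    reads from sample_ids[j] that were assigned to otu_ids[i].
    """
    counts = Counter(assignments)
    sample_ids = sorted({sample for sample, _ in counts})
    otu_ids = sorted({otu for _, otu in counts})
    # Counter returns 0 for (sample, OTU) pairs that never occurred.
    rows = [[counts[(sample, otu)] for sample in sample_ids] for otu in otu_ids]
    return otu_ids, sample_ids, rows

# Hypothetical read assignments, for illustration only.
reads = [("S1", "Otu1"), ("S1", "Otu1"), ("S2", "Otu1"), ("S2", "Otu2")]
otus, samples, table = build_otu_table(reads)
```

Each row corresponds to an OTU and each column to a sample, which is the usual layout of a tabbed OTU table.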
OTU table output
See OTU table output options.
Normalizing the table
After generating the table, you
should use the otutab_norm command to
normalize all samples to the same number of reads.
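The idea behind normalization can be sketched as scaling every sample (column) to a common total. This is a minimal illustration under assumed rounding behaviour, not the algorithm used by otutab_norm; `normalize_table` and its `target` parameter are hypothetical names.

```python
def normalize_table(table, target):
    """Scale each sample (column) so its counts sum to roughly `target`.

    table: list of rows (one per OTU), each a list of per-sample counts.
    Rounding means column totals may differ from `target` by a few reads.
    """
    n_samples = len(table[0])
    totals = [sum(row[j] for row in table) for j in range(n_samples)]
    return [
        [round(row[j] * target / totals[j]) if totals[j] else 0
         for j in range(n_samples)]
        for row in table
    ]

# Two samples with 40 and 20 reads, both scaled to 100 reads.
raw = [[30, 5], [10, 15]]
normalized = normalize_table(raw, target=100)
```

After scaling, counts in different samples are comparable even though the samples started with different read depths.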
The query file can be in FASTQ or FASTA format. Every query sequence must
be labeled with a sample identifier. A separate command can be used to
check that your sample names are formatted correctly.
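A quick way to sanity-check labels before running the command is to extract the sample identifier from each one. The sketch below assumes, purely for illustration, that the identifier is carried in a `sample=NAME;` annotation in the read label; this page does not state the required format, so consult the sample identifier documentation before relying on it. `sample_from_label` is a hypothetical helper.

```python
import re

# ASSUMPTION: sample identifiers appear as a "sample=NAME;" annotation.
SAMPLE_RE = re.compile(r"(?:^|;)sample=([^;]+);")

def sample_from_label(label):
    """Return the sample name found in a read label, or None if absent."""
    match = SAMPLE_RE.search(label)
    return match.group(1) if match else None

labels = ["read1;sample=Gut7;", "read2"]
unlabeled = [label for label in labels if sample_from_label(label) is None]
```

Any label that ends up in `unlabeled` would need fixing before the reads can be attributed to a sample.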
Query sequences are
typically raw reads, i.e. reads after paired-read merging (if applicable)
but before quality filtering. Low-quality reads and singletons can often be
mapped successfully to an OTU, so including them accounts for a larger
fraction of the reads. The
fastx_uniques_persample command can be used to find the unique sequences
and their abundances for all samples. This compresses the input data and
makes the otutab command somewhat faster, though probably by less than you
might expect (the compression is typically only ~2x).
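To see why per-sample dereplication compresses the input, the sketch below collapses identical sequences within each sample and reports the ratio of reads to unique entries. It only illustrates the concept and is not how fastx_uniques_persample is implemented; the function names and data are hypothetical.

```python
from collections import Counter

def uniques_per_sample(reads):
    """Map each (sample_id, sequence) pair to its abundance.

    reads: list of (sample_id, sequence) tuples, one per read.
    """
    return Counter(reads)

def compression_ratio(reads):
    """Number of reads divided by number of unique (sample, sequence) pairs."""
    return len(reads) / len(uniques_per_sample(reads))

# Six hypothetical reads that collapse to three unique entries.
reads = [("S1", "ACGT")] * 3 + [("S1", "ACGA")] + [("S2", "ACGT")] * 2
uniques = uniques_per_sample(reads)
ratio = compression_ratio(reads)
```

The same sequence in two different samples stays as two entries, which is one reason the compression is modest.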
The search database is either a set of OTU sequences or a set of ZOTUs,
i.e. denoised sequences. Each query sequence is
mapped to the closest database sequence. Ties are broken systematically by
picking the first in database file order. A udb
database can be used. Database sequences must be labeled with
OTU identifiers. The database file is
specified by the -otus or -zotus option. Use -zotus if the OTUs are
denoised, -otus otherwise.
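The tie-breaking rule can be sketched as follows, given identities that have already been computed for one query against each database sequence. This is an illustration of "first in database file order wins", not the search algorithm itself; `closest_otu` is a hypothetical name.

```python
def closest_otu(identities):
    """Pick the database sequence with the highest identity.

    identities: list of (otu_id, identity) pairs in database file order.
    Ties are resolved in favour of the entry that appears first.
    """
    best_id, best_identity = None, -1.0
    for otu_id, identity in identities:
        # Strict '>' means a later entry with an equal score never replaces
        # an earlier one, so the first of any tie is kept.
        if identity > best_identity:
            best_id, best_identity = otu_id, identity
    return best_id

hits = [("Otu1", 0.98), ("Otu2", 0.99), ("Otu3", 0.99)]
winner = closest_otu(hits)
```

Because the result depends on file order, reordering the database can change which OTU a tied read is assigned to.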
Identity threshold for mapping
The -id option sets the minimum
fractional identity. Default is 0.97,
corresponding to 97% identity. Denoised OTUs also use a 97% identity
default to allow for sequencing and PCR error. See
defining and interpreting OTUs for discussion.
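As a worked example of the threshold arithmetic, the sketch below treats fractional identity as matching columns divided by alignment length. That definition is an assumption made for illustration and may differ in detail from how the program counts gaps; the helper names are hypothetical.

```python
def fractional_identity(matches, alignment_length):
    """Fraction of alignment columns in which query and target agree."""
    return matches / alignment_length

def passes_threshold(matches, alignment_length, min_id=0.97):
    """True if the alignment meets the minimum fractional identity."""
    return fractional_identity(matches, alignment_length) >= min_id

# For a 250-column alignment, 7 differences still pass at 0.97 but 8 do not.
seven_diffs = passes_threshold(243, 250)   # 243 / 250 = 0.972
eight_diffs = passes_threshold(242, 250)   # 242 / 250 = 0.968
```

In other words, at the default threshold a read of this length can differ from its OTU at up to seven positions and still be counted.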
By default, reads are assumed to be on the same strand as the OTU sequences.
You can use -strand both to search both strands.
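Searching both strands amounts to also comparing the reverse complement of each read. The following sketch shows that idea only; it says nothing about how the search is implemented internally, and `candidate_strands` is a hypothetical helper.

```python
# Translation table for the four unambiguous nucleotides (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def candidate_strands(seq, both=False):
    """Sequences to compare against the database for one read."""
    return [seq, reverse_complement(seq)] if both else [seq]

plus_only = candidate_strands("AACG")
both_strands = candidate_strands("AACG", both=True)
```

If reads come from a protocol with mixed orientations, searching only the plus strand would leave roughly the reversed reads unmatched.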
The -notmatched option specifies a
FASTA filename for sequences which are not assigned to an OTU.
The -notmatchedfq option specifies a FASTQ file for
unassigned sequences (input must be FASTQ).
Annotations are stripped from the OTU sequence labels unless the -keep_annots
option is specified.
Multithreading and standard output files
usearch -otutab reads.fq -otus otus.udb -otutabout otutab.txt -biomout otutab.json \