usearch manual
cluster_otus command

See also
  UPARSE pipeline
  UPARSE commands
  OTU benchmark results
  Mapping reads to OTUs

The cluster_otus command performs OTU clustering using the UPARSE-OTU algorithm.

Input is a FASTA file containing quality-filtered and globally trimmed reads from a marker gene amplicon sequencing experiment, e.g. 16S or ITS. Paired reads must be merged before clustering. Discarding singleton reads before clustering is generally recommended to minimize spurious OTUs.

Input sequences must be globally alignable with no terminal gaps. This is critically important as cluster_otus considers terminal gaps to be differences, unlike other commands. See global trimming for further discussion.
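To see why terminal gaps matter, the following minimal Python sketch counts differences between two rows of a pairwise alignment, with and without the terminal gap columns. It is an illustration only, not the code used by cluster_otus; the function name count_differences and the two reads are hypothetical.

```python
def count_differences(row_a: str, row_b: str, count_terminal_gaps: bool = True) -> int:
    """Count differing columns between two rows of a pairwise alignment.

    Both rows must have the same length and use '-' for gaps.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    columns = list(zip(row_a, row_b))
    start, end = 0, len(columns)
    if not count_terminal_gaps:
        # Skip leading and trailing columns where either row has a gap.
        while start < end and "-" in columns[start]:
            start += 1
        while end > start and "-" in columns[end - 1]:
            end -= 1
    return sum(1 for a, b in columns[start:end] if a != b)


# Hypothetical reads: the second one starts two bases later than the first.
read_a = "ACGTACGTAC"
read_b = "--GTACGTAC"

with_terminal = count_differences(read_a, read_b, count_terminal_gaps=True)
without_terminal = count_differences(read_a, read_b, count_terminal_gaps=False)
```

In this example the two reads are identical over their overlap, yet counting terminal gaps gives two differences. On short reads, a few such columns can be enough to push an otherwise matching read outside the OTU radius, which is why reads should be trimmed to the same start and end positions.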

Input sequence labels must have size annotations. Note that FASTA labels are preserved in the output, so the size annotations in the -otus output file are the input sizes, not the number of reads assigned to a given OTU. To get the number of reads in each OTU, you must map reads back to OTUs.
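As a rough illustration, the Python sketch below reads the size annotation from a label and drops singletons. It assumes the annotation has the form ;size=N; inside the label, which this page does not spell out; the helper names read_size and drop_singletons and the example labels are hypothetical.

```python
import re

# Assumed annotation format: "size=N" delimited by semicolons within the label.
SIZE_RE = re.compile(r"(?:^|;)size=(\d+)(?:;|$)")


def read_size(label: str) -> int:
    """Return the abundance stored in a label's size annotation."""
    match = SIZE_RE.search(label)
    if match is None:
        raise ValueError(f"no size annotation in label: {label!r}")
    return int(match.group(1))


def drop_singletons(labels):
    """Keep only labels whose size annotation is greater than 1."""
    return [label for label in labels if read_size(label) > 1]


# Hypothetical labels of dereplicated input sequences.
labels = ["Uniq1;size=120;", "Uniq2;size=7;", "Uniq3;size=1;"]
kept = drop_singletons(labels)
```

The sizes read this way are input abundances. If Uniq1 is reported as an OTU representative, its label still carries size=120 whatever the number of reads later assigned to that OTU.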

Versions 7.0.1002 and earlier: The -otuid option specifies the minimum identity between an OTU member sequence and the representative sequence as a fractional identity from 0.0 to 1.0. Default is 0.97, corresponding to a minimum identity of 97%. It is usually not recommended to use an otuid value less than 0.97; see UPARSE OTU radius.

Versions 7.0.1003 and later: The -otu_radius_pct option specifies the OTU "radius" as a percentage, i.e. the maximum difference between an OTU member sequence and the representative sequence of that OTU. Default is 3.0, corresponding to a minimum identity of 97%. It is usually not recommended to use an otu_radius_pct value greater than 3; see UPARSE OTU radius.
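The two options express the same threshold on different scales: radius (%) = 100 x (1 - identity). The Python sketch below picks the option name from a version string and converts the value accordingly. The helper radius_option is hypothetical and only builds the arguments; it assumes version strings of the form major.minor.build, as used on this page.

```python
from typing import List


def radius_option(version: str, radius_pct: float = 3.0) -> List[str]:
    """Return the OTU radius option and value for a given usearch version."""
    if not 0.0 <= radius_pct <= 100.0:
        raise ValueError("radius_pct must be between 0 and 100")
    # Compare versions numerically, e.g. "7.0.1003" -> (7, 0, 1003).
    parsed = tuple(int(part) for part in version.split("."))
    if parsed >= (7, 0, 1003):
        return ["-otu_radius_pct", str(radius_pct)]
    # Older versions take a fractional identity instead of a percent radius.
    otuid = round(1.0 - radius_pct / 100.0, 6)
    return ["-otuid", str(otuid)]


new_style = radius_option("7.0.1003")
old_style = radius_option("7.0.1002")
```

With the default radius of 3.0, the first call yields the -otu_radius_pct form and the second the equivalent -otuid 0.97 form.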

The -otus option specifies a FASTA output file for the OTU representative sequences.

The -fastaout option specifies a FASTA output file containing all input sequences with labels annotated according to their UPARSE-REF models.

Parsimony score options are supported.

Alignment parameters and heuristics are supported.

Example

usearch -cluster_otus filtered_reads.fasta -otus otus.fasta