Adding sizes to OTU labels

See also
  Generating OTUs and ZOTUs
  cluster_otus command
  unoise3 command

It is sometimes convenient to put size annotations in the OTU or ZOTU sequence labels. For example, a quick glance tells you whether an OTU is highly abundant (e.g. size=256786) and therefore probably correct, or has very low abundance (e.g. size=2) and is therefore more likely to be spurious.
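For a quick look at the sizes, the annotations can be pulled out of the labels with standard shell tools. The following is a minimal sketch and not part of usearch; it assumes labels of the form >Otu1;size=256786; and the file name otus_size_demo.fa and the function name list_otu_sizes are made up for illustration.

```shell
# Hypothetical two-OTU file, for illustration only.
cat > otus_size_demo.fa <<'EOF'
>Otu1;size=2;
ACGTACGTAC
>Otu2;size=256786;
TTGGCCAATT
EOF

# Print "size<TAB>label" for each label carrying a ;size=N annotation,
# most abundant first. Labels without a size annotation are skipped.
list_otu_sizes() {
  awk '/^>/ {
    label = substr($0, 2)
    if (match(label, /;size=[0-9]+/)) {
      # Skip the 6 characters of ";size=" to get the number.
      size = substr(label, RSTART + 6, RLENGTH - 6)
      printf "%s\t%s\n", size, label
    }
  }' "$1" | sort -k1,1nr
}

list_otu_sizes otus_size_demo.fa
```

Sorting by size puts the highly abundant OTUs at the top and the possibly spurious low-abundance ones at the bottom.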

You can add the annotations using standard features of the usearch_global command by searching the reads against the OTUs. The -dbmatched output file contains the database sequences that have at least one hit, and the -sizeout option annotates each of them with its number of hits. If the input sequences have size annotations, you should use the -sizein option so that each input sequence contributes its size; otherwise each input sequence adds 1 to the output size. To increase sensitivity of the search heuristics, use -maxaccepts 4 -maxrejects 128 -top_hit_only.
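The difference between counting with and without input sizes can be illustrated with plain shell arithmetic. This is only a sketch of the counting rule described above, not of how usearch implements it; the file uniques_demo.fa and both function names are hypothetical.

```shell
# Hypothetical dereplicated input: two unique sequences with size annotations.
cat > uniques_demo.fa <<'EOF'
>Uniq1;size=10;
ACGTACGTAC
>Uniq2;size=3;
ACGTACGTAA
EOF

# Contribution when sizes are ignored: every sequence counts as 1.
count_ignoring_sizes() {
  grep -c '^>' "$1"
}

# Contribution when sizes are honoured: every sequence counts as its size=N
# value (a label without a size annotation counts as 1 in this sketch).
count_using_sizes() {
  awk '/^>/ {
    n = 1
    if (match($0, /;size=[0-9]+/)) n = substr($0, RSTART + 6, RLENGTH - 6)
    total += n
  } END { print total + 0 }' "$1"
}

count_ignoring_sizes uniques_demo.fa   # prints 2
count_using_sizes uniques_demo.fa      # prints 13
```

If both sequences hit the same OTU, that OTU would be annotated with size=2 when the input sizes are ignored and size=13 when they are used.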

As the query reads, you can use the quality-filtered file that was the input to cluster_otus or unoise3, but for best sensitivity you should use the reads as they were before quality filtering and before low-abundance uniques were discarded.

To delete annotations, use the fastx_strip_annots command.
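A sketch of removing the annotations again, wrapped in a shell function. The function name and file names are placeholders, and the -fastaout output option is an assumption here; check the fastx_strip_annots documentation for the options it accepts.

```shell
# Hypothetical wrapper: $1 = annotated FASTA, $2 = FASTA without annotations.
# Assumes fastx_strip_annots writes FASTA output via -fastaout.
strip_otu_annots() {
  usearch -fastx_strip_annots "$1" -fastaout "$2"
}

# Example call (requires usearch on the PATH):
# strip_otu_annots otus_size.fa otus_plain.fa
```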

Example

usearch -usearch_global reads.fq -db otus.fa -strand plus -id 0.97 -dbmatched otus_size.fa \
  -maxaccepts 4 -maxrejects 128 -top_hit_only -sizeout
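
When the query file holds sequences that already carry size annotations, the same search needs -sizein, as noted above. A sketch of that variant, wrapped in a shell function: the function name and the file names are placeholders, while all options are the ones named in the text.

```shell
# Hypothetical wrapper: $1 = query FASTA with size annotations,
# $2 = OTU database, $3 = output FASTA of size-annotated OTUs.
search_with_sizes() {
  usearch -usearch_global "$1" -db "$2" -strand plus -id 0.97 \
    -dbmatched "$3" -sizein -sizeout \
    -maxaccepts 4 -maxrejects 128 -top_hit_only
}

# Example call (requires usearch on the PATH):
# search_with_sizes uniques.fa otus.fa otus_size.fa
```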