USEARCH v11

cluster_smallmem command

See also
  cluster_fast
  cluster_otus
  cluster_agg
  cluster_aggd

Clusters sequences in a FASTA or FASTQ file using a variant of the UCLUST algorithm designed to minimize memory use.

It is the user's responsibility to sort the input sequences in an appropriate order before running cluster_smallmem; see UCLUST sort order for discussion. By default, input sequences are expected to be sorted by decreasing length. If some other sort order is used, the -sortedby option should be specified. Valid values are length (default), size and other. If -sortedby other is specified, then USEARCH does not assume or check for any particular order. See also sortbysize and sortbylength.
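As a pre-flight check, the ordering requirement can be verified before clustering. The following is a minimal Python sketch, not part of USEARCH: the function names are hypothetical, it assumes FASTA input, and for the size order it assumes abundance is stored as a ;size=N; annotation in the label, with a missing annotation counted as 1. USEARCH's own check may differ in detail.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Assumed label convention for abundance, e.g. ">Seq1;size=42;"
_SIZE_RE = re.compile(r"(?:^|;)size=(\d+)(?:;|$)")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (label, sequence) pairs; sequences may span several lines."""
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        elif label is not None:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def sort_key(label: str, seq: str, sortedby: str) -> int:
    """Return the value the records are expected to be sorted on."""
    if sortedby == "length":
        return len(seq)
    if sortedby == "size":
        match = _SIZE_RE.search(label)
        # Assumption: a record without a size annotation counts as 1.
        return int(match.group(1)) if match else 1
    raise ValueError("sortedby must be 'length' or 'size'")


def first_out_of_order(lines: Iterable[str], sortedby: str = "length") -> Optional[str]:
    """Return the label of the first record that breaks decreasing order, else None."""
    previous = None
    for label, seq in read_fasta(lines):
        key = sort_key(label, seq, sortedby)
        if previous is not None and key > previous:
            return label
        previous = key
    return None


if __name__ == "__main__":
    records = [">a", "ACGTACGT", ">b", "ACGT", "AC", ">c", "ACG"]
    print(first_out_of_order(records))  # lengths 8, 6, 3 are in decreasing order
```

To check a file, pass an open file handle instead of a list, for example first_out_of_order(open("query.fasta")). A non-None result means the input should be re-sorted, or -sortedby other should be used.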

An identity threshold must be specified using the -id option.

Multithreading is not supported as this would require significant memory overhead.

By default, nucleotide matching is done on the forward strand only. For matching on both strands, use -strand both.
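To illustrate why the strand setting matters, here is a small Python sketch (hypothetical helper, not USEARCH code) showing a sequence that differs from a centroid on the forward strand but is identical to it once reverse-complemented. It only handles the unambiguous letters A, C, G and T.

```python
# Translation table for the four unambiguous nucleotides (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


query = "AAACCG"
centroid = "CGGTTT"

print(query == centroid)                      # False: no match on the forward strand
print(reverse_complement(query) == centroid)  # True: identical on the reverse strand
```

With forward-strand matching only, a pair like this would be compared as two unrelated sequences; with -strand both, the reverse-complemented orientation is considered as well.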

See also
  Standard output file options
  Accept options
  Indexing options
  Termination options
  Masking options
  Alignment parameters
  Alignment heuristics

  Cluster sizes
  Memory requirements

Example

usearch -cluster_smallmem query.fasta -id 0.9 -centroids nr.fasta -uc clusters.uc