usearch manual
cluster_agg command

See also
  cluster_otus
  cluster_fast
  cluster_smallmem
  cluster_aggd

The cluster_agg command performs agglomerative clustering of sequences in a FASTA or FASTQ file.

Cluster linkage is specified using the -linkage option, which may be set to max (the default), min or avg.
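
To illustrate what the three linkage settings mean, here is a minimal Python sketch. It is not usearch code. It assumes that max, min and avg correspond to the usual complete, single and average linkage definitions, and the function name cluster_distance and the small distance matrix are hypothetical.

```python
from itertools import product


def cluster_distance(dist, a, b, linkage="max"):
    """Distance between clusters a and b, given as lists of row indices.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    The linkage names mirror the -linkage values; mapping them to
    complete/single/average linkage is an assumption of this sketch.
    """
    pair_dists = [dist[i][j] for i, j in product(a, b)]
    if linkage == "max":   # farthest pair of members (complete linkage)
        return max(pair_dists)
    if linkage == "min":   # closest pair of members (single linkage)
        return min(pair_dists)
    if linkage == "avg":   # mean over all cross-cluster pairs
        return sum(pair_dists) / len(pair_dists)
    raise ValueError("linkage must be 'max', 'min' or 'avg'")


# Hypothetical distances between three sequences.
dist = [
    [0.0, 0.1, 0.4],
    [0.1, 0.0, 0.3],
    [0.4, 0.3, 0.0],
]

for name in ("max", "min", "avg"):
    print(name, cluster_distance(dist, [0, 1], [2], name))
```

With max linkage two clusters are only as close as their most distant members, so clusters tend to be compact; with min linkage a single close pair is enough to join two clusters.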

Output can be written as a tree in Newick format to the file given by the -treeout option and/or as a clusters file given by the -clusterout option. If a clusters file is requested, the -id option must also be given to set the identity threshold.
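
The role of the identity threshold can be sketched in Python as follows. This is an illustration of generic agglomerative clustering, not the usearch implementation. It assumes that distance is 1 minus fractional identity and that merging stops once the closest pair of clusters is farther apart than that cutoff; the function name agglomerate and the example matrix are hypothetical.

```python
def agglomerate(dist, max_dist, linkage=max):
    """Merge the closest clusters until none are within max_dist.

    dist is a symmetric matrix of pairwise distances. linkage is a
    function that reduces the cross-cluster distances to one number
    (max for complete linkage, min for single linkage).
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(
                    dist[i][j] for i in clusters[x] for j in clusters[y]
                )
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > max_dist:
            break  # remaining clusters are too far apart to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical distances between four sequences: two tight pairs.
dist = [
    [0.00, 0.10, 0.50, 0.60],
    [0.10, 0.00, 0.55, 0.65],
    [0.50, 0.55, 0.00, 0.15],
    [0.60, 0.65, 0.15, 0.00],
]

# A cutoff of 0.2 plays the role of -id 0.80 under the assumption above.
print(agglomerate(dist, 0.2))
```

Under these assumptions a higher identity threshold gives a smaller distance cutoff and therefore more, smaller clusters.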

The first step in the algorithm is to compute a distance matrix, which can be saved to a file by specifying the -distmxout option. See calc_distmx for options related to the distance matrix calculation and output file.

Example

usearch -cluster_agg seqs.fasta -treeout seqs.tree -clusterout clusters.txt -id 0.80 -linkage min