The cluster_agg command performs
agglomerative clustering of sequences in a FASTA
or FASTQ file.
Cluster linkage is specified
using the -linkage option, which may be set to
max (the default), min or avg.
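As an illustration of what the three linkage settings mean, the following minimal sketch computes the distance between two clusters from pairwise sequence distances. It assumes that max, min and avg correspond to the usual complete, single and average linkage definitions. The labels, the distance values and the function names are hypothetical and are not part of usearch.

```python
from itertools import product
from statistics import mean

# Hypothetical pairwise distances between four sequences (illustrative values only).
DIST = {
    ("A", "B"): 0.10,
    ("A", "C"): 0.40,
    ("A", "D"): 0.50,
    ("B", "C"): 0.30,
    ("B", "D"): 0.45,
    ("C", "D"): 0.05,
}


def dist(x, y):
    """Look up the distance between two sequence labels in either order."""
    if x == y:
        return 0.0
    return DIST[(x, y)] if (x, y) in DIST else DIST[(y, x)]


def cluster_distance(c1, c2, linkage="max"):
    """Distance between clusters c1 and c2 under the given linkage.

    max: largest pairwise distance (complete linkage)
    min: smallest pairwise distance (single linkage)
    avg: mean of all pairwise distances (average linkage)
    """
    pairs = [dist(x, y) for x, y in product(c1, c2)]
    if linkage == "max":
        return max(pairs)
    if linkage == "min":
        return min(pairs)
    if linkage == "avg":
        return mean(pairs)
    raise ValueError(f"unknown linkage: {linkage}")
```

With these values, the clusters ["A", "B"] and ["C", "D"] are 0.50 apart under max, 0.30 under min and 0.4125 under avg, so the choice of linkage changes which clusters are merged first.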
Output is written as a tree in Newick format to the
file specified by the -treeout option and/or as a
clusters file specified by the -clusterout option.
If a clusters file is requested, the -id option must
also be given to set the identity threshold.
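The sketch below shows one way an identity threshold can turn agglomerative clustering into a flat set of clusters. It is a naive illustration, not the usearch implementation. It assumes that distance is 1 minus fractional identity and that merging stops once the closest pair of clusters is further apart than 1 minus the threshold. The function name agglomerate, its parameters and the distance values are hypothetical.

```python
from itertools import combinations

# Hypothetical pairwise distances, keyed by unordered label pairs (illustrative values only).
EXAMPLE_DIST = {
    frozenset(("A", "B")): 0.10,
    frozenset(("A", "C")): 0.40,
    frozenset(("A", "D")): 0.50,
    frozenset(("B", "C")): 0.30,
    frozenset(("B", "D")): 0.45,
    frozenset(("C", "D")): 0.05,
}


def agglomerate(labels, dist, max_dist, link=max):
    """Naive agglomerative clustering with a distance cutoff.

    labels   : sequence labels, each starting in its own cluster
    dist     : dict mapping frozenset({x, y}) to the distance between x and y
    max_dist : stop merging when the closest pair of clusters is further apart
    link     : callable reducing a list of pairwise distances to one value,
               e.g. max (complete linkage) or min (single linkage)
    """
    clusters = [[label] for label in labels]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for i, j in combinations(range(len(clusters)), 2):
            d = link([dist[frozenset((x, y))]
                      for x in clusters[i] for y in clusters[j]])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > max_dist:
            break  # Remaining clusters are all too far apart to merge.
        # Merge cluster j into cluster i; i < j, so index i stays valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Assumption: an identity threshold of 0.80 corresponds to a distance cutoff of 1 - 0.80.
flat = agglomerate(["A", "B", "C", "D"], EXAMPLE_DIST, max_dist=1 - 0.80)
```

Under max linkage and this cutoff, A joins B and C joins D, and the two resulting clusters stay separate because their largest pairwise distance is 0.50. Passing link=min with a cutoff of 0.35 would merge all four sequences, since the smallest distance between the two clusters is 0.30.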
The first step in the algorithm is to compute a
distance matrix, which can be saved to a file by
specifying the -distmxout option. See
calc_distmx for options related to the
distance matrix calculation and output file.
usearch -cluster_agg seqs.fasta -treeout seqs.tree -clusterout clusters.txt -id 0.80 -linkage min