UTAX reference data downloads
Taxonomy benchmark results
The cluster_otus_utax command generates OTUs based on predicted taxonomies for a set of query sequences in FASTA or FASTQ format.
A reference database in UDB format is required. The makeudb_utax command is used to create the database.
See the UTAX downloads page for available reference files.
A taxonomic level, e.g. genus or family, must be specified. Each taxon at that level defines an OTU. If the confidence value falls below the threshold, or there is no prediction, then the query is assigned to an "unclassified" OTU.
The taxonomic level is specified by the -utax_level option, e.g. -utax_level g for genus. See taxonomy annotations for supported levels.
The confidence threshold is specified by the -utax_cutoff option (default 0.9).
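The rule described above (a query joins the OTU of its predicted taxon unless the prediction is missing or its confidence is below the cutoff) can be sketched in a few lines of shell. This is only an illustration of the logic, not how usearch implements it. The three-column input (query label, predicted taxon at the chosen level, confidence), the "*" marker for a missing prediction, and the function name assign_otus are all assumptions made for this sketch; the input is not the utax output format.

```shell
# Illustrative sketch only: assign each query to an OTU from a hypothetical
# tab-separated table of "label<TAB>taxon<TAB>confidence" records.
# "*" in the taxon column means "no prediction" (a convention invented here).
assign_otus() {
  # $1 = confidence cutoff (e.g. 0.9); records are read from stdin
  awk -F'\t' -v cutoff="$1" '{
    # Below the cutoff or no prediction -> "unclassified"; otherwise the taxon names the OTU
    otu = ($2 == "*" || $3 + 0 < cutoff + 0) ? "unclassified" : $2
    print $1 "\t" otu
  }'
}

# Three made-up queries: one confident, one below the cutoff, one without a prediction
printf 'q1\tEscherichia\t0.97\nq2\tBacillus\t0.62\nq3\t*\t0\n' | assign_otus 0.9
```

With a cutoff of 0.9, only q1 keeps its predicted genus; q2 and q3 both fall into the "unclassified" OTU. A confidence exactly equal to the cutoff is treated as passing in this sketch, following the "falls below" wording above.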
The -utaxotusout option specifies a tabbed text output file with one record per OTU.
The -tabbedout option specifies an output file in utax output format.
The -otus option specifies a FASTA file name. One representative sequence for each OTU is written to this file. Fields are appended to the sequence labels giving otu= and tax= annotations corresponding to fields 1 and 6 in the utaxout file. The representative sequence is the first found in the query set so the input should be sorted appropriately, typically in order of decreasing abundance of unique sequences obtained by derep_fulllength.
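To see why input order matters, the "first sequence found becomes the representative" behaviour can be mimicked with a short shell sketch. The input format (query label and assigned OTU, tab-separated, in query order), the ;size=N; labels and the function name pick_representatives are assumptions for illustration; this is not the format of any usearch output file.

```shell
# Illustrative sketch only: keep the first query seen for each OTU.
# Input: hypothetical "label<TAB>OTU" records in the order the queries were read.
pick_representatives() {
  # seen[] counts how often each OTU has appeared; print only on first sight
  awk -F'\t' '!seen[$2]++ { print $2 "\t" $1 }'
}

# Queries already sorted by decreasing abundance (size= values are made up)
printf 'uniq1;size=50;\tBacillus\nuniq2;size=9;\tBacillus\nuniq3;size=4;\tVibrio\n' | pick_representatives
```

Here uniq1 represents Bacillus because it comes first. If the same records were supplied in a different order, a less abundant sequence could be chosen instead, which is why sorting by decreasing abundance is recommended.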
Taxonomy predictions for all query sequences are written to a utaxout file if the -utaxout option is specified.
The -strand option must be specified, e.g. -strand plus.
Multithreading is not supported.
usearch -cluster_otus_utax reads.fq -db 16s_ref.udb -utax_level f -otus otus.fa \
  -strand plus -utaxotusout otus.txt -utaxout out.utax