OTUs with radius >3%
The cluster_otus command has an otu_radius_pct option for specifying a radius different from the default of 3%. However, please note that larger values are not recommended. This is because chimera detection is an integral part of the clustering algorithm. Each input sequence is run through UPARSE-REF using the current set of OTUs as a reference database. If the optimal model is chimeric, the sequence is discarded. If an OTU radius > 3% is used, then chimera detection becomes more difficult because more true biological sequences will also be discarded when they don't create new OTUs. The set of OTU sequences becomes sparser, and the correct parents of a chimera will more often be missing from the OTU database. Chimeras can still be detected when there are OTUs that are sufficiently close to their parents, but the false negative rate will tend to increase. I therefore recommend a different procedure rather than using the otu_radius_pct option.

Recommended procedure for larger OTU radius
I have not tested OTU pipelines with an OTU radius different from 3%, so these ideas are preliminary. If this is important to you, then I would welcome a discussion and would be glad to help you analyze your particular data -- you're welcome to email me.

The basic idea is to make a set of OTUs using cluster_otus at a small radius, then use UCLUST to re-cluster at a higher radius. For example, if you want OTUs at a radius of 4%, then you could use cluster_otus at 2%, then cluster_smallmem with -id 0.96 (corresponding to a radius of 4%):

usearch -cluster_otus sorted.fa -otu_radius_pct 2 -otus otus98.fa

usearch -cluster_smallmem otus98.fa -id 0.96 -centroids otus.fa
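
The -id value in the second command is simply 1 minus the radius expressed as a fraction, so a 4% radius gives 0.96. The following is a minimal sketch of that arithmetic for other radii. The helper name radius_to_id and the variable names are made up for illustration and are not part of usearch; the file names are the ones from the example above. The script only prints the two commands (a dry run) rather than running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of usearch): convert an OTU radius in percent
# to the fractional identity expected by -id, i.e. id = 1 - radius/100.
radius_to_id() {
    awk -v r="$1" 'BEGIN { printf "%g\n", 1 - r / 100 }'
}

small_radius=2   # radius passed to cluster_otus via -otu_radius_pct
final_radius=4   # desired final OTU radius

final_id=$(radius_to_id "$final_radius")

# Dry run: print the two commands instead of executing them.
echo "usearch -cluster_otus sorted.fa -otu_radius_pct $small_radius -otus otus98.fa"
echo "usearch -cluster_smallmem otus98.fa -id $final_id -centroids otus.fa"
```

With the values shown, the printed commands match the two given above. Changing final_radius changes only the -id value in the second command.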