USEARCH manual

FAQ: Should you use UPARSE or UNOISE?

There are two different ways to make OTUs: 97% clustering and denoising.

The UPARSE algorithm makes 97% OTUs.

The UNOISE algorithm does denoising, i.e. error-correction.

If UPARSE works perfectly, it will give you a subset of the correct biological sequences in your reads such that no two sequences are >97% identical to each other. It is implemented in the cluster_otus command.

If UNOISE works perfectly, it will give you all the correct biological sequences in the reads. It is implemented in the unoise3 command.

(Of course, no algorithm is perfect, so you should expect some mistakes.)
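The two ideal outputs can be contrasted on toy data. The sketch below is not UPARSE or UNOISE. It assumes equal-length, pre-aligned, error-free sequences, measures identity as the fraction of matching positions, and uses a simple greedy rule (most abundant first) to pick a subset in which no two sequences are more than 97% identical. The function names `identity` and `pick_97` and the sequences themselves are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_97(seqs_by_abundance, threshold=0.97):
    """Greedily keep sequences that are not more than `threshold` identical
    to any sequence already kept.

    seqs_by_abundance: sequences sorted from most to least abundant.
    """
    kept = []
    for s in seqs_by_abundance:
        if all(identity(s, k) <= threshold for k in kept):
            kept.append(s)
    return kept


# Three hypothetical correct biological sequences, 100 bases each,
# listed from most to least abundant.
base = "ACGT" * 25
strain = "T" + base[1:]        # 1 difference from base  (99% identical)
other = "GTACG" + base[5:]     # 5 differences from base (95% identical)

all_correct = [base, strain, other]   # what ideal denoising would report
otus_97 = pick_97(all_correct)        # what ideal 97% clustering would report

print(len(all_correct), "sequences after ideal denoising")
print(len(otus_97), "sequences after ideal 97% clustering")
```

Here `strain` is dropped by the 97% rule because it is 99% identical to `base`, while ideal denoising would keep all three sequences.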

The UPARSE pipeline and the UNOISE pipeline are very similar; the main difference is whether you run cluster_otus or unoise3 as the step that makes the OTUs.
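As a rough illustration of how little differs between the two pipelines, the sketch below assembles the command lines for the OTU-making step and the OTU table step without running anything. The file names (`uniques.fa`, `reads.fq`, `otus.fa`, `zotus.fa` and the table names) are placeholders, and the helper `otu_commands` is hypothetical. The option names are written as I understand them from the usearch command reference, so check them against your version before use.

```python
def otu_commands(method, uniques="uniques.fa", reads="reads.fq"):
    """Return (make_otus, make_table) command lines for one pipeline.

    Nothing is executed; the lists are only meant to be inspected or printed.
    All file names are placeholders.
    """
    if method == "uparse":
        # 97% clustering with cluster_otus
        make = ["usearch", "-cluster_otus", uniques,
                "-otus", "otus.fa", "-relabel", "Otu"]
        table = ["usearch", "-otutab", reads,
                 "-otus", "otus.fa", "-otutabout", "otutab.txt"]
    elif method == "unoise":
        # denoising with unoise3
        make = ["usearch", "-unoise3", uniques, "-zotus", "zotus.fa"]
        table = ["usearch", "-otutab", reads,
                 "-zotus", "zotus.fa", "-otutabout", "zotutab.txt"]
    else:
        raise ValueError("method must be 'uparse' or 'unoise'")
    return make, table


for method in ("uparse", "unoise"):
    for cmd in otu_commands(method):
        print(" ".join(cmd))
```

Both branches take the same input files, which reflects the point above: the steps before and after this one are shared.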

Once you have made an OTU table, you can proceed with diversity analysis etc. in the same way, regardless of whether you used UPARSE or UNOISE.

Which should you choose? I suggest you try both.

Pros and cons

Almost all published papers use 97% clustering, so this will be easier to explain to your PI and to referees. The main disadvantage of 97% clustering is that you discard some correct biological sequences that are present in your reads. If these represent strains or species with a different phenotype, then you lose relevant information, because the corresponding reads are lumped together into one OTU that contains multiple phenotypes.

The main advantage of denoising is that you get better resolution by keeping all the biological sequences. The main disadvantage of denoising is that a single species often contains sequence variation, both between individuals and between paralogs that are not 100% identical. If you have intra-species variation in the region that you sequenced, then you will get two or more OTUs for one species. For most purposes, this really doesn't matter -- it might even be better if this enables you to detect strains with different phenotypes -- so if I have to recommend one method, then I would recommend denoising.
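The trade-off shows up directly in the OTU table. The toy example below uses invented read counts for one sample and assumes, purely for illustration, that `strainA` and `strainB` belong to one species and are more than 97% identical, while `speciesC` is not. The helper `lump_counts` and all names and numbers are hypothetical.

```python
def lump_counts(counts, members):
    """Sum per-sequence read counts into OTUs.

    counts:  {sequence_name: number of reads in one sample}
    members: {otu_name: [sequence_name, ...]}
    """
    return {otu: sum(counts[name] for name in names)
            for otu, names in members.items()}


# Invented read counts for three correct biological sequences.
counts = {"strainA": 60, "strainB": 40, "speciesC": 25}

# Denoising: every sequence gets its own OTU, so the species that
# contains strainA and strainB is split into two OTUs.
denoised = lump_counts(counts, {
    "Zotu1": ["strainA"],
    "Zotu2": ["strainB"],
    "Zotu3": ["speciesC"],
})

# 97% clustering: strainA and strainB are lumped into a single OTU,
# so any difference between the two strains is no longer visible.
clustered = lump_counts(counts, {
    "Otu1": ["strainA", "strainB"],
    "Otu2": ["speciesC"],
})

print("denoised: ", denoised)
print("clustered:", clustered)
```

The total number of reads is the same in both tables; only the way they are divided among OTUs changes.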