Sample pooling

I usually recommend pooling samples for OTU clustering, for the following reasons.

Comparing samples
Creating a single set of OTUs is the most natural and intuitive basis for sample comparison, e.g. using a beta diversity metric. If you create separate OTUs for each sample, they are not directly comparable.

Improved amplicon abundance estimation and singleton detection
If samples are pooled, then a sequence that appears as a singleton in one sample may also appear in another sample, so it is no longer a singleton in the pooled data. If singletons are discarded after pooling (as usually recommended in order to reduce spurious OTUs), then more low-abundance species will be retained than if singletons were discarded from each sample separately.
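The effect on singletons can be illustrated with a minimal Python sketch. The sample names and sequences are hypothetical toy data, and `non_singletons` is a helper invented for this example, not a usearch command.

```python
from collections import Counter

def non_singletons(reads):
    """Return the set of unique sequences seen in at least two reads."""
    counts = Counter(reads)
    return {seq for seq, n in counts.items() if n >= 2}

# Hypothetical toy data: sample name -> list of read sequences.
samples = {
    "A": ["ACGT", "ACGT", "GGCC"],
    "B": ["ACGT", "GGCC", "TTAA"],
}

# Option 1: discard singletons within each sample, then combine.
per_sample = set()
for reads in samples.values():
    per_sample |= non_singletons(reads)

# Option 2: pool all reads first, then discard singletons.
pooled_reads = [seq for reads in samples.values() for seq in reads]
pooled = non_singletons(pooled_reads)

print(sorted(per_sample))  # sequences kept without pooling
print(sorted(pooled))      # sequences kept with pooling
```

In this toy case GGCC is a singleton in each sample but occurs twice in the pool, so it survives only when samples are pooled before singletons are discarded.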

Chimera detection
The UPARSE-OTU and UCHIME de novo algorithms both require that a chimera has lower read abundance than its parents. Chimeras are not detected if a parent has the same number or fewer reads. This most often happens with low-abundance parents, e.g. when a chimera and one of its parents are both present in exactly two reads. If samples are pooled, parent abundances usually increase because they are found in multiple samples, while chimeras are only rarely reproduced and so will usually be found in only a single sample. Even if chimeras are reproduced, pooling will tend to increase both chimera and parent abundances, leading to a more accurate reflection of amplicon abundance so that parent abundances become greater than those of their chimeras. Conversely, pooling is highly unlikely to increase the abundance of a chimera relative to its parents. Pooling is therefore effective in reducing the number of spurious OTUs due to chimeras.
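The abundance argument can be sketched in Python. This models only the condition stated above (every parent must have more reads than the chimera). It is not an implementation of UPARSE-OTU or UCHIME, and the read counts and the function name `abundance_condition_met` are assumptions made for illustration.

```python
from collections import Counter

def abundance_condition_met(chimera_reads, parent_reads):
    """True if every parent has strictly more reads than the chimera."""
    return all(p > chimera_reads for p in parent_reads)

# Hypothetical read counts per sample. The chimera forms only in sample A,
# where it ties with parent1 at two reads.
per_sample_counts = {
    "A": {"parent1": 2, "parent2": 5, "chimera": 2},
    "B": {"parent1": 3, "parent2": 4},
}

# Sample A alone: parent1 does not exceed the chimera, so the
# condition fails and the chimera would not be detected.
a = per_sample_counts["A"]
in_sample_a = abundance_condition_met(
    a["chimera"], [a["parent1"], a["parent2"]]
)

# Pooled: parent counts add up across samples, the chimera count does not.
pooled = Counter()
for counts in per_sample_counts.values():
    pooled.update(counts)
after_pooling = abundance_condition_met(
    pooled["chimera"], [pooled["parent1"], pooled["parent2"]]
)

print(in_sample_a, after_pooling)
```

Here the condition fails within sample A but holds after pooling, because parent1 rises to five reads while the chimera stays at two.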

Samples should be combined after non-biological sequences such as barcodes have been stripped from the reads, and before dereplication. This is required so that dereplication reflects the abundances of unique biological sequences across all samples.
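The following Python sketch suggests why the order matters. The barcodes, the reads and the `strip_barcode` helper are hypothetical, and dereplication is modeled simply as counting identical sequences rather than by calling usearch.

```python
from collections import Counter

def strip_barcode(read, barcode):
    """Remove a leading barcode if present (hypothetical helper)."""
    return read[len(barcode):] if read.startswith(barcode) else read

# Hypothetical raw reads keyed by sample barcode. Both samples contain
# the same biological sequence, ACGTACGT.
raw = {
    "AAAA": ["AAAAACGTACGT", "AAAAACGTACGT"],
    "CCCC": ["CCCCACGTACGT"],
}

# Pooling and dereplicating with barcodes still attached: the same
# biological sequence is counted as two different unique sequences.
unstripped = Counter(read for reads in raw.values() for read in reads)

# Stripping barcodes, then pooling, then dereplicating: one unique
# sequence whose abundance covers all samples.
stripped = Counter(
    strip_barcode(read, barcode)
    for barcode, reads in raw.items()
    for read in reads
)

print(len(unstripped), dict(stripped))
```

With barcodes left on, the pool contains two unique sequences with abundances 2 and 1. Once they are stripped, the same reads collapse to a single unique sequence with abundance 3.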