Suppose you observe *K* different species, finding *N<sub>k</sub>* individuals in the

My colleague Henrik Flyvbjerg and I have developed the modified formula for *S*(*n*) when singletons are discarded; let me know if you are interested and I will send you the details.

**Rarefaction for OTUs** If species with exactly one observation are ignored, then the above formula does not apply. Thus, if singleton reads are discarded, as recommended in the UPARSE pipeline, then you cannot use standard rarefaction software and the above formula does not give the correct result. If singletons are retained, then the formula is a reasonable approximation, but is not exactly correct because

With *de novo* OTUs, including those made by UPARSE, rarefaction curves must, strictly speaking, be generated by running the pipeline from scratch on each random subsample of the reads and noting the number of OTUs obtained.
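The rerun-per-subsample procedure can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of UPARSE or any rarefaction package: `rarefaction_by_rerun` and `toy_count_otus` are hypothetical names, and `toy_count_otus` is a stand-in that treats each distinct read as one "OTU" and drops those seen fewer than `min_size` times. In real use, `count_otus` would run the full OTU pipeline on the subsampled reads and return the number of OTUs.

```python
import random
from collections import Counter
from typing import Callable, Dict, Sequence


def rarefaction_by_rerun(
    reads: Sequence[str],
    sizes: Sequence[int],
    count_otus: Callable[[Sequence[str]], int],
    replicates: int = 10,
    seed: int = 0,
) -> Dict[int, float]:
    """Mean OTU count at each subsample size, calling count_otus on every subsample."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        if not 0 < n <= len(reads):
            raise ValueError(f"subsample size {n} is outside 1..{len(reads)}")
        # Draw n reads without replacement and rebuild the OTUs from scratch.
        counts = [count_otus(rng.sample(reads, n)) for _ in range(replicates)]
        curve[n] = sum(counts) / replicates
    return curve


def toy_count_otus(reads: Sequence[str], min_size: int = 2) -> int:
    """Hypothetical stand-in for a pipeline: one 'OTU' per distinct read,
    kept only if it occurs at least min_size times (singletons discarded)."""
    abundances = Counter(reads)
    return sum(1 for c in abundances.values() if c >= min_size)


# Toy data: three distinct reads, one of which is a singleton.
reads = ["A"] * 5 + ["B"] * 3 + ["C"]
curve = rarefaction_by_rerun(reads, sizes=[1, 3, 6, 9], count_otus=toy_count_otus)
print(curve)
```

Because singletons are discarded inside each run, a subsample of one read always yields zero OTUs, and the full sample yields two rather than three. The cost is one pipeline run per subsample size per replicate, so the number of sizes and replicates usually has to be kept small.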