Comments on Westcott and Schloss 2017 OptiClust paper
Westcott SL, Schloss PD. 2017. OptiClust, an improved method for assigning amplicon-based sequence data to operational taxonomic units. mSphere 2:e00073-17.
OptiClust greatly inflates richness on mock community reads
Defining and interpreting OTUs
Matthews Correlation Coefficient (MCC)
Does MCC consider unique sequence abundance?
Common case where MCC fails
Westcott and Schloss fail to consider several popular methods, including UPARSE and denoisers. It seems clear to me that denoising (error-correction) should be the preferred pre-processing step for OTU clustering, because otherwise the input to the clustering algorithm contains noise due to PCR and sequencing in addition to biological variation. However, there is no mention of denoising in the paper.
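As a toy illustration of the point (my own simulation, with arbitrary templates, error rate and abundance cutoff, not the authors' protocol), substitution errors greatly inflate the number of unique sequences presented to a clustering algorithm:

```python
# Toy simulation (not from the paper): substitution errors inflate the
# number of unique sequences seen by a clustering algorithm.
import random
from collections import Counter

random.seed(1)
BASES = "ACGT"

def noisy_read(template, error_rate=0.005):
    """Copy a template, substituting each base with probability error_rate."""
    return "".join(
        random.choice(BASES.replace(c, "")) if random.random() < error_rate else c
        for c in template
    )

# Two "biological" templates standing in for a 2-strain mock community.
templates = ["".join(random.choice(BASES) for _ in range(250)) for _ in range(2)]

# 10,000 reads drawn from the templates with sequencer-like errors.
reads = [noisy_read(random.choice(templates)) for _ in range(10000)]
print("true templates:       ", len(templates))
print("unique raw sequences: ", len(set(reads)))  # far more than 2

# Crude stand-in for denoising: discard rare unique sequences.
counts = Counter(reads)
kept = [seq for seq, n in counts.items() if n >= 32]
print("after abundance cutoff:", len(kept))  # recovers the 2 templates
```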
I attempted to reproduce the authors' read pre-processing protocol, and found that it failed to filter many bad reads and chimeras on mock community data. Clustering to define OTUs is a very different problem from clustering to account for noise, and in my opinion these should be considered separately. The authors do not explain why it is informative to use noisy data to test clustering algorithms which assume or require error-free input.
With these considerations in mind, I found that
OptiClust generated >5,000 OTUs on reads of a mock community with 22 strains,
after making my best effort to pre-process the reads according to the
procedures described in the paper.
The authors implicitly propose rules for defining OTUs (these should have been made explicit and discussed), but their rules are impossible to satisfy on real data, which calls their conceptual approach into question. By contrast, the UPARSE clustering rules, which also construct 97% OTUs, can always be satisfied on real data.
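For comparison, here is a minimal sketch of a greedy centroid rule in the spirit of UPARSE (my own toy code, not the UPARSE implementation; abundance sorting and chimera filtering are omitted, and the identity function is a naive stand-in for a real aligner). Every read either joins an OTU whose representative is >=97% identical to it or founds a new OTU, so the rules can always be satisfied:

```python
def identity(a: str, b: str) -> float:
    """Toy identity: fraction of matching positions (no real alignment)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))

def greedy_otus(reads, threshold=0.97):
    """Assign reads (ideally sorted by decreasing abundance) to OTUs."""
    centroids = []    # one representative sequence per OTU
    assignments = []  # OTU index for each read
    for read in reads:
        for i, c in enumerate(centroids):
            if identity(read, c) >= threshold:
                assignments.append(i)  # joins an existing OTU
                break
        else:
            centroids.append(read)     # read founds a new OTU
            assignments.append(len(centroids) - 1)
    return centroids, assignments
```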
The OptiClust algorithm constructs OTUs by seeking to maximize the Matthews Correlation Coefficient (MCC). Their benchmark tests also use MCC as an accuracy metric, so the tests are strongly biased towards OptiClust. MCC is not universally accepted -- on the contrary, I have not been able to find any papers from outside the Schloss lab which use MCC to define or assess OTUs. I do not agree that MCC is a good definition, and I believe that the UPARSE clustering criteria are better. Also, MCC fails in some common cases. Therefore, in my opinion, MCC is not justified as a gold standard.
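For reference, MCC is computed from pairwise confusion counts; as described by W&S, a pair of sequences counts as a true positive if it is at least 97% identical and placed in the same OTU, and so on for the other three cases. A minimal sketch:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from pair confusion counts.

    In OTU benchmarking each pair of sequences is classified by whether
    its identity is >= the threshold (e.g. 97%) and whether the pair is
    in the same OTU:
      TP: similar, same OTU        FN: similar, different OTUs
      FP: dissimilar, same OTU     TN: dissimilar, different OTUs
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(mcc(50, 900, 10, 40))  # example confusion counts
```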
It is not clear to me from the paper whether W&S consider unique sequence abundance in calculating MCC in OptiClust or for benchmarking, but both choices have problems.
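To illustrate why the choice matters, here is a toy example of my own (the abundances, OTU assignments and similarity groups are hypothetical): counting each unique sequence once versus weighting pairs by read abundance gives substantially different MCC values for the same clustering.

```python
import math
from itertools import combinations

def mcc(tp, tn, fp, fn):
    d = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / d if d else 0.0

# Hypothetical uniques: (abundance, OTU assignment, true >=97% group).
uniques = [
    (900, 0, "A"),  # abundant correct sequence for strain A
    (  2, 0, "A"),  # error variant of A, clustered with it
    (  1, 1, "A"),  # rare error variant of A, split into its own OTU
    (100, 2, "B"),  # correct sequence for strain B
]

def pair_counts(weighted):
    # Within-unique pairs are ignored here for simplicity.
    tp = tn = fp = fn = 0
    for (na, oa, ga), (nb, ob, gb) in combinations(uniques, 2):
        w = na * nb if weighted else 1   # weight pair by read abundance?
        if ga == gb and oa == ob:
            tp += w
        elif ga == gb:
            fn += w
        elif oa == ob:
            fp += w
        else:
            tn += w
    return tp, tn, fp, fn

for weighted in (False, True):
    print("weighted" if weighted else "unweighted",
          round(mcc(*pair_counts(weighted)), 3))
# unweighted ~0.447, abundance-weighted ~0.81 for the same clustering
```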
Different programs report different identities for a given pair of sequences, so using one program's measurements of identity (here, mothur) as a gold standard for benchmarking other programs would cause bias in favor of mothur even if the programs were all designed to maximize MCC, which is not the case. Pairwise sequence identities from mothur are especially dubious because it uses the NAST algorithm, which intentionally introduces alignment errors to preserve the number of columns in a fixed-width template multiple alignment.
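As a small illustration (my own toy example, not mothur code): even a single fixed alignment yields different identities under different counting conventions, before considering that programs also build different alignments in the first place.

```python
# One fixed pairwise alignment, three common "identity" conventions.
a = "ACGT--ACGTACGTAA"
b = "ACGTGGACGTACGT--"

cols = list(zip(a, b))
matches = sum(x == y and x != "-" for x, y in cols)

# 1. Count every alignment column, gaps included.
id_all = matches / len(cols)

# 2. Ignore terminal gap columns only.
start = next(i for i, (x, y) in enumerate(cols) if x != "-" and y != "-")
end = next(i for i, (x, y) in reversed(list(enumerate(cols)))
           if x != "-" and y != "-")
id_no_term = matches / (end - start + 1)

# 3. Ignore all gap columns.
nongap = [c for c in cols if "-" not in c]
id_no_gaps = matches / len(nongap)

print(round(id_all, 3), round(id_no_term, 3), round(id_no_gaps, 3))
# -> 0.75 0.857 1.0 for the same alignment
```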
W&S say: "Several metrics have emerged for assessing the quality of OTU assignment algorithms... Unfortunately, these methods fail to directly quantify the quality of the OTU assignments." I agree with this critique of the papers they cite. However, they fail to mention other approaches (e.g. Ye 2010, Edgar 2013, Callahan et al. 2016) which directly quantify OTU quality on mock communities by measuring the number of errors in a representative sequence for each OTU and by identifying OTUs which contain sequences due to chimeras, contaminants and other artifacts.