Example OTU tables and reports
The uncross command detects and filters cross-talk
(sample mis-assignment) in an OTU table using
the UNCROSS algorithm. In a
typical run, about 2% of reads are assigned to the wrong sample. If some
samples contain large numbers of reads for a given OTU, these often "bleed"
into other samples which may not in fact contain that OTU. This can cause
many spurious counts which should be zero, giving inflated estimates of
richness, alpha diversity and beta diversity.
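To make the effect on richness concrete, here is a toy sketch. The OTU names, read counts and sample layout are invented for illustration; only the 2% rate is taken from the text above, and real cross-talk is not spread this simply.

```python
# Toy illustration only: OTU names and counts are made up.

def richness(counts):
    """Number of OTUs with a non-zero count in one sample."""
    return sum(1 for c in counts.values() if c > 0)

# True composition: sample B does not contain Otu1 at all.
true_a = {"Otu1": 10000, "Otu2": 0}
true_b = {"Otu1": 0, "Otu2": 500}

# Assume 2% of sample A's Otu1 reads are mis-assigned to sample B.
bleed = true_a["Otu1"] * 2 // 100

observed_b = dict(true_b)
observed_b["Otu1"] += bleed

print("true richness of B:    ", richness(true_b))
print("observed richness of B:", richness(observed_b))
```

Sample B appears to contain two OTUs although only one is really present, which is the kind of spurious non-zero count that the uncross command tries to remove.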
You can clearly see cross-talk in this
GAIIx example and this MiSeq example.
You can use this example data to
try the uncross command.
Please note that I do not consider the UNCROSS algorithm to be a robust
solution for cross-talk. The mechanism(s) causing cross-talk are not well
understood. Many different indexing schemes are used. Cross-talk rates in
your data may be quite different from the datasets on which UNCROSS was
designed and tested, in which case the accuracy of UNCROSS on your data may
be lower. Also, cross-talk may be hard or impossible to detect when the
number of multiplexed samples is large, say around 100 or more. It is much
better to use multiplexing strategies that are designed to reduce
cross-talk. UNCROSS is best understood as a simplistic hack that is the best
we can do with existing data.
Input is an OTU table in
QIIME classic format generated from all of
the reads in a single run. Runs should NOT be combined for this analysis. It
is important to include ALL samples that were sequenced in the same run,
even if they belong to different experiments.
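Before running uncross, it can be useful to check that the table really lists every sample from the run. The sketch below reads a tab-separated table and returns the sample names. It assumes the usual QIIME classic layout (a header row whose first field starts with "#OTU", then one row per OTU), and the function name and example data are hypothetical.

```python
import csv
import io

def read_otu_table(handle):
    """Read a tab-separated OTU table into (samples, {otu_id: {sample: count}}).

    Assumes the header row starts with "#OTU"; any other line starting
    with "#" is treated as a comment and skipped.
    """
    samples = []
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0].startswith("#OTU"):
            samples = row[1:]
        elif row[0].startswith("#"):
            continue
        else:
            counts = [int(float(v)) for v in row[1:]]
            table[row[0]] = dict(zip(samples, counts))
    return samples, table

# Made-up example table with three samples from one run.
example = (
    "#OTU ID\tMock1\tSoilA\tSoilB\n"
    "Otu1\t9800\t12\t0\n"
    "Otu2\t0\t431\t377\n"
)
samples, table = read_otu_table(io.StringIO(example))
print(samples)
```

The returned sample list can then be compared against the sample sheet for the run to confirm that nothing was left out.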
If the run has mock community samples, mock sample names should start with
"mock" (case-insensitive), e.g. Mock1, mock or mock_13. OTU identifiers for
sequences that are in the designed mock community should contain one of the
following strings (case-sensitive): ";mock=yes;", ";annot=perfect;" or
";annot=noisy;". The annot command can be used
to generate these annotations in the sequence labels before
generating the OTU table.
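The naming rules above can be checked before running uncross. This is a minimal sketch of those rules as stated here, not code taken from usearch. The function names are hypothetical.

```python
# Strings that mark an OTU as belonging to the designed mock community
# (case-sensitive, as described in the text).
MOCK_OTU_TAGS = (";mock=yes;", ";annot=perfect;", ";annot=noisy;")

def is_mock_sample(sample_name):
    """True if the sample name starts with "mock", ignoring case."""
    return sample_name.lower().startswith("mock")

def is_mock_otu(otu_id):
    """True if the OTU identifier contains one of the mock annotations."""
    return any(tag in otu_id for tag in MOCK_OTU_TAGS)

for name in ["Mock1", "mock", "mock_13", "Soil_mock"]:
    print(name, is_mock_sample(name))

for otu in ["Otu1;annot=perfect;", "Otu2;Annot=Perfect;", "Otu3"]:
    print(otu, is_mock_otu(otu))
```

Note that the sample-name test is a prefix match, so a name such as Soil_mock is not treated as a mock sample, and that the OTU annotations must match in case exactly.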
The -tabbedout option specifies an output file in tabbed text format.
The -report option specifies a text file name for a summary report.
The -otutabout option specifies a filename to store the filtered OTU table.
By default, entries predicted to be spurious due to cross-talk are set to
zero and undetermined entries are kept. Specify -uncross_undet_zero to set
undetermined entries to zero.
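The filtering policy just described can be sketched as follows. The status labels "crosstalk", "undetermined" and "valid" are hypothetical names chosen for this example and are not the labels usearch writes to its output files. The undet_zero argument stands in for the -uncross_undet_zero option.

```python
def filter_entry(count, status, undet_zero=False):
    """Return the count to keep for one OTU table entry.

    status is a hypothetical label: "crosstalk", "undetermined" or "valid".
    """
    if status == "crosstalk":
        return 0  # predicted spurious: always set to zero
    if status == "undetermined" and undet_zero:
        return 0  # only zeroed when the stricter option is requested
    return count  # valid, or undetermined under the default policy

print(filter_entry(17, "crosstalk"))
print(filter_entry(17, "undetermined"))
print(filter_entry(17, "undetermined", undet_zero=True))
print(filter_entry(17, "valid"))
```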
The following options specify user-settable parameters:
-uncross_maxxt (default 0.5). Maximum cross-talk frequency as a percentage.
-uncross_minvalid (default 2.0). Minimum valid frequency as a percentage.
-uncross_minvalidtotal (default 75.0). Minimum fraction of valid reads in an
OTU as a percentage.
usearch -uncross otutab.txt -tabbedout out.txt -report rep.txt -otutabout