Identifies a possible "core microbiome", i.e. the OTUs which are present in more samples than other OTUs.
The presence of an OTU in some or many samples can be spurious because of cross-talk or because the OTU itself is spurious. To enable manual review, the otutab_core command generates a report indicating cases where the presence of an OTU may be spurious due to cross-talk, and where an OTU may be spurious due to sequence errors.
If a sintax tabbed file is provided using the -sintaxin option, then the taxonomy of the core OTUs is included in the report.
If a distance matrix is provided using the -distmxin option, it is used to identify possible dominant OTUs, i.e. high-abundance OTUs which are similar to a low-abundance OTU in the report. If there is a dominant OTU, this may indicate that the low-abundance OTU is spurious.
The -tabbedout option specifies the output file. OTUs are sorted in order of decreasing number of samples where they are present. Fields are:
#1. OTU = name of the OTU.
#2. Samples = number of samples where the OTU has a non-zero count.
#3. Size = total number of reads assigned to this OTU.
#4. DomOTU = high-abundance "dominant" OTU which is very similar to this OTU, if any.
#5. DomSize = total number of reads assigned to the dominant OTU.
#6. DomId = identity of the dominant OTU with this OTU.
#7. Min = minimum count for this OTU.
#8. LoQ = low quartile count for this OTU.
#9. Med = median count for this OTU.
#10. HiQ = high quartile count for this OTU.
#11. Max = maximum count for this OTU.
#12. Taxonomy = condensed taxonomy prediction.
If the minimum or LoQ count is much smaller than the maximum count, this suggests that the smaller counts may be due to cross-talk.
If the size of an OTU is much smaller than a neighboring "dominant" OTU, then the OTU itself may be spurious due to sequence error.
usearch -calc_distmx otus.fa -tabbedout distmx.txt \
  -sparsemx_minid 0.9 -termid 0.8

usearch -sintax otus.fa -strand both -db ref16s.txt \
  -tabbedout sintax.txt

usearch -otutab_core otutab.txt -distmxin distmx.txt \
  -sintaxin sintax.txt -tabbedout core.txt
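
The two rules of thumb described earlier (a Min or LoQ count far below Max, and a Size far below DomSize) can be applied to the report as a first pass before manual review. The following is a minimal sketch, not part of usearch. It assumes the report is tab-separated with the twelve fields in the order listed above, that header or comment lines can be recognized by a non-numeric Samples field, and that DomSize is non-numeric when there is no dominant OTU. The function name flag_core_report and the default ratio cutoffs of 100 are hypothetical and chosen only for illustration.

```shell
# Sketch: flag rows of an otutab_core report for manual review.
# Assumptions (not confirmed by the usearch documentation): tab-separated
# fields in the documented order; cutoffs are arbitrary illustrative values.
flag_core_report() {
    # $1 = report file
    # $2 = Max/LoQ ratio above which cross-talk is suspected (default 100)
    # $3 = DomSize/Size ratio above which the OTU is suspected spurious (default 100)
    awk -F'\t' -v xt="${2:-100}" -v dom="${3:-100}" '
        # Skip comment lines and any line whose Samples field is not an integer.
        /^#/ || $2 !~ /^[0-9]+$/ { next }
        {
            reason = ""
            # Field 8 = LoQ, field 11 = Max. LoQ is the stricter of the two
            # checks mentioned in the text (Min or LoQ much smaller than Max).
            if ($8 * xt < $11)
                reason = "crosstalk"
            # Field 3 = Size, field 5 = DomSize. Only tested when DomSize is numeric.
            if ($5 ~ /^[0-9]+$/ && $3 * dom < $5)
                reason = (reason == "" ? "" : reason ",") "dominant"
            if (reason != "")
                print $1 "\t" reason
        }
    ' "$1"
}
```

For example, flag_core_report core.txt 100 100 would print the name of each flagged OTU followed by the reason (crosstalk, dominant, or both). The flags are only a prompt for inspection; suitable cutoffs depend on sequencing depth and on the data set.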