unbias command

See also
  UNBIAS algorithm
  UNBIAS paper
  Download unbias reference databases

The unbias command attempts to correct for abundance bias in amplicon sequencing by adjusting the counts in an OTU table.

Unfortunately, the UNBIAS algorithm does not work very well as a practical tool, achieving only a small improvement in OTU frequency estimates. Even after running UNBIAS, the correlation between cell frequency and OTU frequency is very low, and I believe that the problem of abundance bias is not solvable at the present time. So, why provide a command that doesn't work? Mostly, to draw attention to the problem and emphasize the limitations of 16S amplicon sequencing as a quantitative method. Plus, doing something is better than nothing -- there is some improvement, even if it's not a full solution.

The unbias command requires three input files: an OTU table in QIIME classic format, and two tabbed text files giving the predicted SSU operon copy number and the predicted number of primer differences for each OTU. The tabbed text files are specified by the -copynrin and -diffsin options, respectively. Each must have at least two fields: the first is the OTU identifier, and the second is an integer giving the predicted copy number or number of primer differences for that OTU. These files can be generated by the sinaps command using the unbias reference databases (see the download link under See also). Predictions are needed for every OTU regardless of confidence, so the -noboot option of sinaps can be used for faster execution.
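Since a prediction is needed for every OTU, it can be worth checking the two tabbed files against the OTU table before running unbias. The following is a minimal sketch, not part of usearch: the function name missing_predictions and the *_demo.txt files are hypothetical, the toy data is made up, and the check assumes only the layout described above (OTU identifier in field 1, an integer in field 2, and header lines in the OTU table starting with #).

```shell
# Toy inputs, made up for illustration only (not real usearch output).
printf '#OTU ID\tS1\tS2\nOtu1\t10\t0\nOtu2\t5\t7\nOtu3\t1\t2\n' > otutab_demo.txt
printf 'Otu1\t2\nOtu2\t5\nOtu3\t1\n' > copynr_demo.txt
printf 'Otu1\t0\nOtu2\t1\n' > primerdiffs_demo.txt

# Hypothetical helper: print OTUs that appear in the OTU table ($2)
# but have no integer prediction in the tabbed file ($1).
missing_predictions() {
  awk -F'\t' '
    # First file: remember OTUs whose second field is an integer.
    FILENAME == ARGV[1] { if ($2 ~ /^[0-9]+$/) seen[$1] = 1; next }
    # Second file: skip header/comment lines of the OTU table.
    /^#/ { next }
    # Report OTUs that have no usable prediction.
    !($1 in seen) { print $1 }
  ' "$1" "$2"
}

missing_copynr=$(missing_predictions copynr_demo.txt otutab_demo.txt)
missing_diffs=$(missing_predictions primerdiffs_demo.txt otutab_demo.txt)

echo "missing copy number: ${missing_copynr:-none}"
echo "missing primer diffs: ${missing_diffs:-none}"
```

With the toy files, Otu3 is reported as lacking a primer-difference prediction. On real data the same function could be called as missing_predictions copynr.txt otutab.txt and missing_predictions primerdiffs.txt otutab.txt.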

The -tabbedout option specifies a tabbed text file with one line for each OTU. The fields are: OTU identifier, copy number, primer diffs, copy number correction factor, primer diffs correction factor, and final correction factor.
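The -tabbedout report is easier to inspect once the six fields are labelled. The sketch below adds a header row with awk. It assumes the field order listed above; the column names and the file unbias_demo.txt are hypothetical. The numbers are arbitrary placeholders, not real unbias output, and no relationship between the columns is implied.

```shell
# Placeholder report, made up for illustration only.
printf 'Otu1\t2\t0\t0.5\t1.0\t0.5\nOtu2\t5\t1\t0.2\t1.5\t0.3\n' > unbias_demo.txt

# Prepend a header naming the six fields, in the order described above.
labelled=$(awk -F'\t' '
  BEGIN {
    OFS = "\t"
    print "otu", "copynr", "primer_diffs", "copynr_factor", "diffs_factor", "final_factor"
  }
  # Pass through only lines that have all six fields.
  NF >= 6 { print $1, $2, $3, $4, $5, $6 }
' unbias_demo.txt)

printf '%s\n' "$labelled"
```

The labelled text can then be opened in a spreadsheet or sorted on the last column, for example with sort -t "$(printf '\t')" -k6,6g, to see which OTUs receive the largest final correction.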

Output is written in QIIME classic format to the file specified by the -output option.


usearch -sinaps otus.fa -db primerdiffs.udb -strand plus -tabbedout primerdiffs.txt \
  -noboot -attr v4diffs

usearch -sinaps otus.fa -db copynr.udb -strand plus -tabbedout copynr.txt \
  -noboot -attr copynr

usearch -unbias otutab.txt -diffsin primerdiffs.txt -copynrin copynr.txt \
  -output otutab_unbias.txt -tabbedout unbias.txt
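One way to see the effect of the correction is to compare per-sample totals in the input and output tables. The sketch below is not part of usearch: the helper sample_totals and the toy table are hypothetical, and it assumes a QIIME classic table whose header line starts with "#OTU ID". Whether unbias preserves sample totals is not stated here, so the comparison is exploratory only.

```shell
# Toy table, made up for illustration only.
printf '#OTU ID\tS1\tS2\nOtu1\t10\t0\nOtu2\t5\t7\n' > otutab_sums_demo.txt

# Hypothetical helper: print "sample<TAB>total" for each sample column.
sample_totals() {
  awk -F'\t' '
    # Header line: remember the sample names.
    /^#OTU ID/ { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
    # Skip any other comment lines.
    /^#/ { next }
    # Data line: accumulate the count in each sample column.
    { for (i = 2; i <= NF; i++) total[i] += $i }
    END { for (i = 2; i <= ncol; i++) printf "%s\t%s\n", name[i], total[i] + 0 }
  ' "$1"
}

totals=$(sample_totals otutab_sums_demo.txt)
echo "$totals"
```

On real data, running sample_totals otutab.txt and sample_totals otutab_unbias.txt side by side shows how much each sample's total changed.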