Pipeline example: MiSeq 2x250 16S V4

This example shows a typical analysis pipeline for MiSeq paired reads. There are four samples: Human, Mouse, Soil and Mock with ~4k reads each. Human and Mouse are fecal samples. Data is from Kozich et al. 2013. The mock reads have spurious OTUs due to cross-talk.

You need to set the environment variable $usearch to the name (or path) of your usearch binary file. The bash script assumes that the FASTQ files and the sintax reference database are stored in a directory called ../data. Output is written to a directory called ../out, which is deleted and rebuilt. I suggest making a project directory, say ex_miseq, with sub-directories named data and scripts. Put the ex_miseq.bash script in the scripts sub-directory and execute it from there.
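The setup described above might be sketched as follows. The directory names ex_miseq, data and scripts come from the text; the fallback binary name usearch is only an assumption, so substitute the actual name or path of your binary.

```bash
#!/bin/bash
# Minimal sketch of the suggested project layout (created in the current directory).
proj="ex_miseq"
mkdir -p "$proj/data" "$proj/scripts"

# The pipeline script reads the binary name from $usearch.
# "usearch" is a placeholder default (assumption); keep any value already set.
export usearch="${usearch:-usearch}"

# Optional sanity check: warn if the binary cannot be found.
if ! command -v "$usearch" >/dev/null 2>&1; then
    echo "warning: '$usearch' not found; set \$usearch to your binary's path" >&2
fi

echo "project directory: $proj (sub-directories: data, scripts)"
```

With this layout, the reads and the reference database go in ex_miseq/data, and the script is run from ex_miseq/scripts so that ../data and ../out resolve inside the project directory.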

Download commands and data
Reads: right-click on ex_miseq_reads.tar.gz and click Save As. Use tar -zxvf *.tar.gz to extract.
Sintax reference database: right-click on rdp_16s_v16.fa.gz and click Save As. Use gunzip *.gz to extract.
Bash script: select all text below (Ctrl+A) and copy/paste into a text editor, or right-click on ex_miseq.bash and click Save As.
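Once the two archives have been saved, the extraction steps could be wrapped as in the sketch below. The helper name extract_if_present and the data_dir variable are hypothetical, and the sketch assumes the downloads were saved into ../data relative to the scripts directory; it skips any file that is not there instead of failing.

```bash
#!/bin/bash
# Hypothetical helper: extract one downloaded archive if it is present.
extract_if_present() {
    local f="$1"
    if [ ! -f "$f" ]; then
        echo "not found, skipping: $f" >&2
        return 1
    fi
    case "$f" in
        # Reads archive: unpack next to the archive itself.
        *.tar.gz) tar -zxvf "$f" -C "$(dirname "$f")" ;;
        # Reference database: gunzip replaces file.gz with file.
        *.gz)     gunzip "$f" ;;
        *)        echo "unsupported file type: $f" >&2; return 1 ;;
    esac
}

# Assumed location of the downloads (see the project layout suggested earlier).
data_dir="${data_dir:-../data}"

extract_if_present "$data_dir/ex_miseq_reads.tar.gz" || true
extract_if_present "$data_dir/rdp_16s_v16.fa.gz" || true
```

Whether the reads archive unpacks directly into the data directory or into a sub-directory depends on how it was packed, so check where the FASTQ files land before running the pipeline script.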