USEARCH manual

Pipeline example: MiSeq 2x300 fungal ITS

Description
This tutorial uses data from study PRJEB7970, deposited by CEH at the European Nucleotide Archive. The reads are derived from samples of Scots Pine (Pinus sylvestris) needles collected from forests and plantations in Scotland. Libraries were constructed using the fITS7 (forward) and ITS4 (reverse) primers described in Ihrmark et al. (2012), which target the 5.8S and LSU rRNA genes flanking the ITS2 region. Sequencing was done on MiSeq with 2x300 paired-end reads. To reduce the dataset size for this tutorial, 5,000 reads were taken at random from the first ten pairs of FASTQ files.

Configuration
You need to set the environment variable $usearch to the name (or path) of your usearch binary file. The bash script assumes that the FASTQ files and the sintax reference database are stored in a directory called ../data/. Output is written to a directory called ../out/, which is deleted and re-built each time the script is run, so do not keep anything there that you want to preserve. I suggest making a project directory, say ex_miseq_its, with sub-directories named data/ and scripts/. Put the ex_miseq_its.bash script in the scripts/ sub-directory and execute it from there.
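
The setup described above can be sketched in bash as follows. This is a minimal sketch, not part of the tutorial script: the PROJECT_DIR variable is hypothetical, and the fallback value "usearch" assumes the binary is on your PATH under that name. Substitute the real name or full path of your binary.

```bash
#!/bin/bash
# Hypothetical variable PROJECT_DIR lets you override the project location;
# the default name ex_miseq_its is the one suggested in the text.
project="${PROJECT_DIR:-ex_miseq_its}"

# Create the two sub-directories the tutorial layout expects.
# (../out/ is created by the pipeline script itself, so it is not made here.)
mkdir -p "$project/data" "$project/scripts"

# Point $usearch at the binary. The fallback "usearch" is an assumption;
# replace it with the actual file name or full path on your system.
export usearch="${usearch:-usearch}"

# Warn early if the binary cannot be found, rather than failing mid-pipeline.
if ! command -v "$usearch" >/dev/null 2>&1; then
  echo "warning: '$usearch' not found; set \$usearch to the full path of the binary" >&2
fi
```

Adding the export line to your shell start-up file would make the setting persistent across sessions.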

Download commands and data
Reads: right-click on ex_miseq_its_reads.tar.gz and click Save As. Use tar -zxvf *.tar.gz to extract.
Sintax reference database: right-click on rdp_its_v2.fa.gz and click Save As. Use gunzip *.gz to extract.
Bash script: select all text below (ctrl+A) and copy/paste into a text editor, or right-click on ex_miseq_its.bash and click Save As.
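
The extraction steps for the reads and the reference database can be sketched as a small helper, separate from ex_miseq_its.bash. The function name extract_downloads and the DATA_DIR variable are hypothetical, and the sketch assumes both downloads were saved into the data directory. Whether the reads archive unpacks into a sub-directory is not stated here, so check that the FASTQ files end up directly in ../data/.

```bash
#!/bin/bash
# Hypothetical helper: unpack the two downloads inside the given directory.
extract_downloads() {
  local data_dir="$1"

  # Reads: extract the tarball into the data directory, if it is present.
  if [ -f "$data_dir/ex_miseq_its_reads.tar.gz" ]; then
    tar -zxvf "$data_dir/ex_miseq_its_reads.tar.gz" -C "$data_dir"
  fi

  # Sintax reference database: gunzip replaces the .gz file with rdp_its_v2.fa.
  if [ -f "$data_dir/rdp_its_v2.fa.gz" ]; then
    gunzip "$data_dir/rdp_its_v2.fa.gz"
  fi
}

# Default assumes the helper is run from the scripts/ sub-directory.
extract_downloads "${DATA_DIR:-../data}"
```

Files that are missing are skipped silently, so the helper can be re-run after downloading whichever archive was absent.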