 unoise command
See also: UNOISE paper
Uses the UNOISE algorithm to perform denoising (error-correction) of amplicon reads.
Input is a set of quality-filtered unique read sequences with size=nnn; abundance annotations. See UNOISE pipeline for details of how reads should be pre-processed. The input should be a complete set of reads without any clustering (except for finding uniques), so for example you should not use 97% OTUs as input. In other words, unoise cannot error-correct the output from cluster_otus or a subset of the FASTQ reads. It is ok to run unoise on the FASTQs for a single sample, though I generally recommend pooling samples before denoising.
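Before running unoise it can be worth confirming that every label in the uniques file carries a size=nnn; annotation. The following is a minimal sketch using standard awk, not part of usearch; the file name uniques.fa, the labels, and the helper name size_summary are made up for illustration.

```shell
# Hypothetical input: a tiny uniques file written here so the sketch is self-contained.
cat > uniques.fa <<'EOF'
>Uniq1;size=120;
ACGTACGTAC
>Uniq2;size=7;
ACGTACGTAA
>Uniq3;size=1;
ACGTTCGTAC
EOF

# size_summary FILE: report the number of labels, the number of labels
# lacking a size=nnn; annotation, and the total read count implied by the sizes.
size_summary() {
    awk '
    /^>/ {
        labels++
        if (match($0, /size=[0-9]+;/)) {
            # RSTART points at "size="; skip those 5 characters and drop the trailing ";".
            total += substr($0, RSTART + 5, RLENGTH - 6)
        } else {
            missing++
        }
    }
    END {
        printf "labels=%d missing_size=%d total_reads=%d\n", labels, missing, total
    }' "$1"
}

size_summary uniques.fa
```

A non-zero missing_size count suggests the file was not produced by a dereplication step that writes abundance annotations.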
See Tutorials for example scripts & data.
Errors are corrected as follows:
  - Reads with sequencing error are identified and removed.
  - Abundances are corrected (when the OTU table is generated).
  - Chimeras are removed.
  - PhiX sequences are removed.
  - Low-complexity sequences due to Illumina artifacts are removed.
The algorithm is designed for Illumina reads, not other technologies such as 454 pyrosequencing.
Corrected amplicon sequences are written to the -fastaout file.
The -relabel prefix option specifies a prefix for sequence labels in the output file. An integer 1, 2, 3... is appended to the prefix (requires v9.0.2140 or later).
The -minampsize option specifies the minimum abundance (size= annotation) for an error-corrected amplicon. Default is 4 in v9.0.2159 and later (it was 8 in previous versions).
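To get a rough feel for what a given -minampsize value means for your data, you can count how many unique sequences reach that abundance before running unoise. This is only a preview sketch in standard awk, not a reimplementation of what unoise does internally, and the abundances unoise works with may differ from the raw unique counts. The file name uniques_demo.fa, its contents, and the helper name count_min_size are hypothetical.

```shell
# Hypothetical input with a range of abundances.
cat > uniques_demo.fa <<'EOF'
>Uniq1;size=120;
ACGT
>Uniq2;size=8;
ACGA
>Uniq3;size=4;
ACGC
>Uniq4;size=3;
ACGG
EOF

# count_min_size FILE MIN: print how many labels have size >= MIN.
count_min_size() {
    awk -v min="$2" '
    /^>/ && match($0, /size=[0-9]+;/) {
        # Extract the digits between "size=" and ";" and force a numeric value.
        size = substr($0, RSTART + 5, RLENGTH - 6) + 0
        if (size >= min) n++
    }
    END { print n + 0 }' "$1"
}

count_min_size uniques_demo.fa 4   # default threshold in v9.0.2159 and later
count_min_size uniques_demo.fa 8   # default threshold in earlier versions
```

With this toy input, three sequences reach the threshold of 4 and two reach the threshold of 8.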
An OTU table can be generated using the usearch_global command. Reads must have sample identifiers in the labels. I suggest using 97% identity for matching reads to denoised amplicon sequences (this is not a clustering identity; rather, using 97% allows for up to 3% read errors).
Example
usearch -unoise uniques.fa -tabbedout out.txt -fastaout denoised.fa