USEARCH v11

Alignment parameters

See also
  Alignment heuristics
 
These parameters determine the score of an alignment. They include substitution scores and gap penalties. These are distinct from heuristic parameters, which control fast but approximate methods for finding the alignment with the highest score. Ideally, changing heuristic parameters would not change the reported alignment (because the best alignment would always be found). By contrast, changing alignment scoring parameters will tend to change the alignment, e.g. increasing gap penalties will reduce the number of gaps. All scoring parameters are floating-point values and may be specified as integers or real numbers.

If local alignment parameters are changed, then the Karlin-Altschul K and Lambda parameters must also be changed in order to get correct E-values.
 

Option             Local/Global   Protein/Nucleotide   Default                      Description
-lopen             L              PN                   10.0                         Local gap open
-lext              L              PN                   1.0                          Local gap extend
-match             LG             N                    +1.0                         Match score
-mismatch          LG             N                    -2.0                         Mismatch score
-matrix filename   LG             PN                   BLOSUM62 (aa), +1/-2 (nt)    Substitution matrix in NCBI BLAST format. See BLOSUM62 for an example.
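
To make the roles of these parameters concrete, the following Python sketch scores a gapped nucleotide alignment using the default values from the table above. It is an illustration only, not USEARCH code: the function name score_alignment is hypothetical, and it assumes that a gap of length n costs open + (n - 1) * extend. That reading is consistent with the remark further down that forbidding extension forbids gaps longer than one, but it is an assumption.

```python
def score_alignment(query_row, target_row, match=1.0, mismatch=-2.0,
                    gap_open=10.0, gap_ext=1.0):
    """Score two equal-length aligned rows; '-' marks a gap position."""
    if len(query_row) != len(target_row):
        raise ValueError("aligned rows must have equal length")
    score = 0.0
    in_gap = None  # row that currently has an open gap run: 'q', 't' or None
    for q, t in zip(query_row, target_row):
        if q == "-" and t == "-":
            raise ValueError("a column cannot contain two gaps")
        if q == "-" or t == "-":
            row = "q" if q == "-" else "t"
            # Assumption: the first gap position pays the open penalty,
            # every further position in the same run pays the extend penalty.
            score -= gap_ext if in_gap == row else gap_open
            in_gap = row
        else:
            score += match if q == t else mismatch
            in_gap = None
    return score


if __name__ == "__main__":
    print(score_alignment("ACGTACGT", "AC--ACGT"))  # six matches, one gap of length 2
```

Raising gap_open or gap_ext lowers the score of every gapped alignment relative to ungapped ones, which is why higher gap penalties tend to produce alignments with fewer gaps.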

Gap penalties for global alignments
With global alignments, gap penalties are specified using the -gapopen and -gapext options. Up to 12 separate penalties can be specified: every combination of query / target, left end / interior / right end, and open / extend can be assigned a different penalty.


Default penalties are shown in the following table.

Penalty   Default
Interior gap open   10.0 nucleotides, 17.0 proteins
End gap open   1.0
Interior gap extend   1.0
End gap extend   0.5
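
As a rough illustration of how end gaps and interior gaps are charged differently, the sketch below totals the gap penalties of a global nucleotide alignment using the defaults in the table. The helper names gap_runs and global_gap_penalty are hypothetical, and the cost of a gap of length n is assumed to be open + (n - 1) * extend; neither is taken from USEARCH itself.

```python
def gap_runs(row):
    """Yield (start, length) for each run of '-' in an aligned row."""
    start = None
    for i, ch in enumerate(row + "x"):  # sentinel character closes a trailing run
        if ch == "-":
            if start is None:
                start = i
        elif start is not None:
            yield start, i - start
            start = None


def global_gap_penalty(query_row, target_row, interior_open=10.0,
                       end_open=1.0, interior_ext=1.0, end_ext=0.5):
    """Total gap penalty of a global alignment (nucleotide defaults)."""
    if len(query_row) != len(target_row):
        raise ValueError("aligned rows must have equal length")
    n = len(query_row)
    total = 0.0
    for row in (query_row, target_row):
        for start, length in gap_runs(row):
            # A run touching either end of the alignment is an end gap.
            is_end = start == 0 or start + length == n
            open_pen = end_open if is_end else interior_open
            ext_pen = end_ext if is_end else interior_ext
            # Assumption: first position pays open, the rest pay extend.
            total += open_pen + (length - 1) * ext_pen
    return total


if __name__ == "__main__":
    # Left end gap of length 2 in the query, interior gap of length 1 in the target.
    print(global_gap_penalty("--ACGTACGT", "TTACG-ACGT"))
```

With these defaults an end gap is far cheaper than an interior gap, so a shorter sequence can be aligned inside a longer one without a large penalty for the overhanging ends.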

The nucleotide defaults would be set using these options:

  -gapopen 10.0I/1.0E -gapext 1.0I/0.5E

A numerical value for a penalty is optionally followed by one or more letters that specify particular types of gap. Here, "10.0I" means "Interior gap=10.0", and "1.0E" means "End gap=1.0". If no letters are given after the numerical value, then the penalty applies to all gaps. More than one letter can be specified, so for example "0.5IE" means "Interior and End gap=0.5", which is the same as all gaps.

The valid letters are: I=Interior, E=End, L=Left, R=Right, Q=Query and T=Target.

If more than one numerical value is specified, then the values must be separated by a slash character '/'. White space is not allowed. A sign (plus or minus) is not allowed in the numerical value, which can be integer or floating-point (in which case a period '.' must be used for the decimal point).

If a star (*) is used as the numerical value, then the gap is forbidden. Using * in an open penalty means that the gap will never be allowed; using * in an extension penalty means that gaps longer than one will be forbidden. So, for example, *LQ in -gapopen means "left end-gaps in the query are not allowed".

The -gapopen and -gapext options are interpreted first by setting the defaults, then by scanning the string left-to-right. Later values override previous values.
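
The rules above can be sketched as a small Python parser. This is a minimal illustration under stated assumptions, not the USEARCH implementation: parse_gap_option is a hypothetical name, a forbidden gap is represented as None, and when a field names only position letters (or only Q/T) the penalty is assumed to apply to both sequences (or to all positions).

```python
import re

POSITIONS = ("L", "I", "R")   # left end, interior, right end
SEQUENCES = ("Q", "T")        # query, target

# A field is '*' or an unsigned number, followed by zero or more letters.
FIELD_RE = re.compile(r"(\*|\d+(?:\.\d+)?)([IELRQT]*)")


def parse_gap_option(spec, interior_default, end_default):
    """Return {(sequence, position): penalty}; None means the gap is forbidden."""
    # Step 1: start from the defaults.
    table = {(s, p): (interior_default if p == "I" else end_default)
             for s in SEQUENCES for p in POSITIONS}
    # Step 2: scan left to right; later fields override earlier ones.
    for field in spec.split("/"):
        m = FIELD_RE.fullmatch(field)  # rejects signs and white space
        if m is None:
            raise ValueError(f"bad gap penalty field: {field!r}")
        value = None if m.group(1) == "*" else float(m.group(1))
        letters = m.group(2)
        positions = set()
        if "I" in letters:
            positions.add("I")
        if "E" in letters:
            positions.update(("L", "R"))  # E covers both ends
        positions.update(c for c in letters if c in "LR")
        sequences = {c for c in letters if c in "QT"}
        # Assumption: an omitted letter group means "all of that group".
        positions = positions or set(POSITIONS)
        sequences = sequences or set(SEQUENCES)
        for s in sequences:
            for p in positions:
                table[(s, p)] = value
    return table


if __name__ == "__main__":
    for key, value in sorted(parse_gap_option("10.0I/1.0E/*LQ", 10.0, 1.0).items()):
        print(key, value)
```

Each of -gapopen and -gapext would yield one such table of six entries, which together give the 12 penalties described above.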