search_pcr command

See also
PCR amplicon prediction
search_pcr2 command

Search for predicted amplicons. The database contains two or more primers (oligonucleotide sequences). Each query sequence is searched for matches to a pair of primers that would generate a PCR amplicon. To search for matches to single primers or probes, use search_oligodb.

Wildcard letters indicating degenerate positions in the primer are supported. See IUPAC codes for details.
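
As an illustration of what a degenerate position means, the following minimal Python sketch expands the standard IUPAC nucleotide codes and tests a primer against an equal-length query segment. The function names (base_matches, primer_matches) are hypothetical, the primer string is only an example, and the sketch assumes wildcards occur in the primer only. It is not taken from the usearch implementation.

```python
# Standard IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def base_matches(primer_code: str, query_base: str) -> bool:
    """Return True if a (possibly degenerate) primer letter covers the query base."""
    return query_base.upper() in IUPAC.get(primer_code.upper(), "")

def primer_matches(primer: str, segment: str) -> bool:
    """Compare a primer to an equal-length query segment, letter by letter."""
    return len(primer) == len(segment) and all(
        base_matches(p, q) for p, q in zip(primer, segment)
    )

# Y covers C or T, M covers A or C (assumption: wildcards only in the primer).
print(primer_matches("GTGYCAGCMGCCGCGGTAA", "GTGCCAGCAGCCGCGGTAA"))
```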

An amplicon is predicted when two primer sequences (which can be the same) align to a query sequence within a given distance range. The primers must match opposite strands of the query. The distance from the first base in the first hit to the last base in the second hit is the length of the predicted amplicon. The range of amplicon lengths is given by -minamp (minimum length, default 50) and -maxamp (maximum length, default 1000).
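
The length rule above can be sketched in a few lines of Python. This is a simplified illustration, not the usearch algorithm: it uses exact string matching only (no diffs, no wildcards), ignores overlapping primer hits, and the names revcomp and predict_amplicon are hypothetical. Positions are 0-based.

```python
def revcomp(seq: str) -> str:
    """Reverse-complement a plain ACGT sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def predict_amplicon(query, fwd, rev, minamp=50, maxamp=1000):
    """Return (start, end, length) of the first amplicon in range, or None.

    The first primer is searched on the plus strand; the second primer must
    match the opposite strand, so its reverse complement is searched on plus.
    """
    rev_rc = revcomp(rev)
    start = query.find(fwd)
    while start != -1:
        # Assumption of this sketch: the second hit lies after the first one.
        hit = query.find(rev_rc, start + len(fwd))
        while hit != -1:
            end = hit + len(rev_rc) - 1      # last base of the second hit
            length = end - start + 1         # first base of hit 1 to last base of hit 2
            if minamp <= length <= maxamp:
                return start, end, length
            hit = query.find(rev_rc, hit + 1)
        start = query.find(fwd, start + 1)
    return None

query = "GG" + "ACGTAC" + "T" * 40 + "CTGCAA" + "GG"
print(predict_amplicon(query, "ACGTAC", "TTGCAG"))
```

With the defaults of 50 and 1000, the 52-base span in this toy query is accepted; narrowing the range to exclude 52 makes the function return None.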

Pairs of primers satisfying the criteria for amplicons are reported in the -pcrout tabbed text output file. See pcrout file for details.

The -ampout option specifies a FASTA output file for predicted amplicon sequences. The predicted amplicon extends from the first base matching the first primer to the last base matching the second primer.

The -ampoutq option specifies a FASTQ output file for predicted amplicon sequences.

The algorithm uses a fast and exact method; there are no heuristics, so all matches meeting the accept criteria are guaranteed to be found. Alignments are global; all letters of the database sequence must be aligned to a letter in the query sequence. Gaps are not permitted, except for terminal gaps in the query sequence.
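
To make "exact, no heuristics" concrete, here is a hedged Python sketch of an exhaustive gapless scan: every window of the query is compared to the primer, so no qualifying position can be missed. It is a naive illustration under stated assumptions (plain letters, no wildcards, no terminal gaps), and it says nothing about how usearch achieves its speed. The names count_diffs and scan are hypothetical.

```python
def count_diffs(primer: str, window: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(1 for p, q in zip(primer, window) if p != q)

def scan(query: str, primer: str, maxdiffs: int = 2):
    """Return (position, diffs) for every gapless placement within maxdiffs."""
    n = len(primer)
    hits = []
    for pos in range(len(query) - n + 1):   # every possible start is tried
        d = count_diffs(primer, query[pos:pos + n])
        if d <= maxdiffs:
            hits.append((pos, d))
    return hits

print(scan("TTACGTTTACCTTT", "ACGT", maxdiffs=1))
```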

Termination options are supported. By default, termination is disabled, equivalent to -maxaccepts 0 -maxrejects 0. In other words, by default the entire database (all primers) is searched.

Accept options are supported. By default, -maxdiffs 2 is assumed and other accept criteria are not used. Note that accept criteria are applied to the alignment of each primer to a query sequence separately, not to the pair of alignments that constitutes a hit. With the default criteria, a hit can therefore have up to 2 diffs to the first primer and up to 2 diffs to the second primer, for a total of up to 4 diffs.
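
The per-primer nature of the accept criteria can be shown with a small Python sketch. The helper names (diffs, pair_accepted) and the toy sequences are hypothetical. The only behavior modeled is the one described above: each primer alignment is checked against the limit on its own, and the combined total is never checked.

```python
def diffs(primer: str, segment: str) -> int:
    """Count mismatches between a primer and the query segment it aligns to."""
    return sum(p != q for p, q in zip(primer, segment))

def pair_accepted(fwd, fwd_seg, rev, rev_seg, maxdiffs=2):
    """Apply the diff limit to each primer alignment separately."""
    return diffs(fwd, fwd_seg) <= maxdiffs and diffs(rev, rev_seg) <= maxdiffs

# 2 diffs to each primer: accepted, although the pair has 4 diffs in total.
print(pair_accepted("ACGTACGT", "ACGTACAA", "TTGGCCAA", "AAGGCCAA"))
# 3 diffs to one primer and 0 to the other: rejected, although the total is only 3.
print(pair_accepted("ACGTACGT", "ACGTAAAA", "TTGGCCAA", "TTGGCCAA"))
```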

If the -pcr_strip_primers option is specified, the query sequence segments matching the primers are removed from the predicted amplicon before writing to the -pcrout and -ampout files. This option does not change the amplicon length range set by the -minamp and -maxamp options, which always include the primers.
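
A short Python sketch of the relationship between primer stripping and the length range follows. The function names are hypothetical and the sequence is a toy example. The point illustrated is only the one stated above: the range check uses the full amplicon length including primers, and stripping affects what is written out.

```python
def in_range(amplicon: str, minamp: int = 50, maxamp: int = 1000) -> bool:
    """Length check on the full amplicon, primers included."""
    return minamp <= len(amplicon) <= maxamp

def strip_primers(amplicon: str, fwd_len: int, rev_len: int) -> str:
    """Drop the primer-matching segments from both ends of the amplicon."""
    return amplicon[fwd_len:len(amplicon) - rev_len]

amplicon = "ACGTAC" + "T" * 40 + "CTGCAA"   # 52 bases including two 6-base primers
if in_range(amplicon):                       # passes: 52 is within 50..1000
    stripped = strip_primers(amplicon, 6, 6)
    print(len(amplicon), len(stripped))      # the stripped sequence is shorter than minamp
```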

The query file may be in FASTA or FASTQ format.

A database file must be specified using the -db option and must be in FASTA format.

The -strand option is required.

Multithreading is supported.

Example

usearch -search_pcr greengenes.fa -db 16s_primers.fa -strand both \
-maxdiffs 3 -minamp 30 -maxamp 2000 -pcrout hits.txt -ampout amplicons.fasta