translated search

If a nucleotide query is used with a protein database, a translated search is performed. Each ORF in the nucleotide sequence is translated, and the resulting amino acid sequences are used to search the database. Translated searches are supported by most search commands. Metagenomic gene search with ublast is the most common use case.

A single nucleotide sequence may span more than one gene, so each ORF is searched independently and termination conditions are applied to each ORF separately. (Note that termination conditions are disabled by default in ublast.) The -orfstyle and -mincodons options control ORF identification.
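To make the idea concrete, here is a minimal Python sketch of six-frame ORF extraction. It is an illustration only, not USEARCH's implementation: it assumes the standard genetic code, treats an ORF as any stop-free stretch of codons, and uses a hypothetical min_codons parameter that only loosely mirrors the role of -mincodons. The start and stop rules selected by -orfstyle are not reproduced.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(nt, min_codons=16):
    """Return (strand, frame, protein) for each stop-free stretch.

    Assumption for illustration: an ORF is any run of at least
    min_codons codons between stop codons (or sequence ends).
    """
    nt = nt.upper()
    orfs = []
    for strand, seq in (("+", nt), ("-", reverse_complement(nt))):
        for offset in range(3):
            protein = translate(seq[offset:])
            for segment in protein.split("*"):
                if len(segment) >= min_codons:
                    orfs.append((strand, offset + 1, segment))
    return orfs


if __name__ == "__main__":
    for strand, frame, protein in find_orfs("ATGGCCAAATAAGGG", min_codons=3):
        print(strand, frame, protein)
```

In a translated search, each protein segment returned by a routine like this would be used as a separate query against the protein database.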

Currently, USEARCH does not extend alignments through frame-shifts, though two adjacent hits in different frames may be found, allowing frame-shifts to be inferred. Please let me know if you need better support for frame-shifts.
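One way to infer frame-shifts from the hits is to look for pairs of hits from the same query to the same target, on the same strand, in different frames and close together on the query. The Python sketch below shows this under stated assumptions: the Hit record, its field names, and the max_gap threshold are all hypothetical and do not correspond to any USEARCH output format, so the hits would need to be parsed into this shape first.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one translated hit on a single query."""
    target: str   # label of the database protein
    strand: str   # "+" or "-"
    frame: int    # 1, 2 or 3
    q_start: int  # nucleotide start on the query, 0-based
    q_end: int    # nucleotide end on the query, exclusive


def candidate_frameshifts(hits, max_gap=30):
    """Return pairs of nearby hits to the same target in different frames.

    max_gap is an arbitrary illustrative threshold (in nucleotides) on
    the gap or overlap between the two hits on the query.
    """
    pairs = []
    ordered = sorted(hits, key=lambda h: h.q_start)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            same_gene = a.target == b.target and a.strand == b.strand
            gap = b.q_start - a.q_end
            if same_gene and a.frame != b.frame and -max_gap <= gap <= max_gap:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    hits = [
        Hit("P1", "+", 1, 0, 90),
        Hit("P1", "+", 2, 94, 200),
        Hit("P2", "+", 3, 300, 400),
    ]
    for a, b in candidate_frameshifts(hits):
        print(a.target, "frame", a.frame, "->", b.frame)
```

A pair reported this way is only a candidate. Two adjacent hits in different frames can also arise from distinct but similar genes, so the alignments should be inspected before concluding that a frame-shift is present.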

Translated searches with a nucleotide query and nucleotide database (like TBLASTX), or with a protein query and nucleotide database (like TBLASTN), are not supported directly. These cases can be handled by translating the nucleotide sequences with findorfs and -xlat, then running a protein search on the resulting amino acid sequences. See quick start for BLAST users.