sam_filter command

The sam_filter command processes an input file in SAM format. Alignments are validated and converted to human-readable and/or BLAST 6 format.

Records for unmapped query sequences are discarded unless the -output_no_hits option is specified.

If the SAM records contain MD tags, or if the original search database is specified using the -db option, then the alignments in the records are validated for consistency with the target sequences. This can catch bugs in the CIGAR string or MD tag, which are surprisingly common in popular software that supports SAM.
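As an illustration of the kind of inconsistency such validation can catch, the sketch below compares the lengths implied by a CIGAR string, an MD tag and the query sequence of one SAM record. This is not the sam_filter implementation, which is not described here. The function names are hypothetical, and the check covers lengths only (it does not compare bases against the target sequence).

```python
import re

# One CIGAR operation: a count followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# One MD token: a match run, a deletion (^BASES), or a single mismatched base.
MD_TOKEN = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def cigar_lengths(cigar):
    """Return (query_len, md_covered_ref_len) implied by a CIGAR string."""
    ops = CIGAR_OP.findall(cigar)
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    # M, I, S, = and X consume query letters.
    query_len = sum(int(n) for n, op in ops if op in "MIS=X")
    # M, D, = and X consume reference letters that the MD tag describes
    # (N also consumes reference, but skipped regions are not listed in MD).
    ref_len = sum(int(n) for n, op in ops if op in "MD=X")
    return query_len, ref_len


def md_ref_length(md):
    """Return the number of reference letters described by an MD tag value."""
    total = 0
    pos = 0
    for m in MD_TOKEN.finditer(md):
        if m.start() != pos:
            raise ValueError(f"malformed MD tag: {md!r}")
        pos = m.end()
        if m.group(1) is not None:
            total += int(m.group(1))   # run of matching letters
        elif m.group(2) is not None:
            total += len(m.group(2))   # deleted reference letters
        else:
            total += 1                 # one mismatched reference letter
    if pos != len(md):
        raise ValueError(f"malformed MD tag: {md!r}")
    return total


def check_record(cigar, md, seq):
    """Return a list of length inconsistencies found in one record."""
    problems = []
    query_len, ref_len = cigar_lengths(cigar)
    if query_len != len(seq):
        problems.append(f"CIGAR implies {query_len} query letters, SEQ has {len(seq)}")
    md_len = md_ref_length(md)
    if md_len != ref_len:
        problems.append(f"CIGAR implies {ref_len} reference letters, MD implies {md_len}")
    return problems
```

For example, `check_record("16M2D6M", "10A5^AC6", seq)` with a 22-letter `seq` reports no problems, while changing the MD value to `10A5^AC5` reports a reference-length mismatch.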

Human-readable alignments can be written to a file specified by the -alnout option. This requires that MD tags are present in the SAM records or that the -db option is specified.

Alignments in BLAST 6 format can be written to a file specified by the -blast6out option.
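For downstream scripts, a BLAST 6 file can be read as tab-separated text with the twelve standard BLAST tabular fields. The sketch below is a minimal parser under that assumption. The type and function names are hypothetical, and it assumes every numeric field holds a number (a placeholder such as * in the E-value or bit score column would need extra handling).

```python
from typing import NamedTuple


class Blast6Hit(NamedTuple):
    """One hit, with the twelve standard BLAST tabular fields in order."""
    query: str
    target: str
    pct_id: float
    aln_len: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    t_start: int
    t_end: int
    evalue: float
    bit_score: float


def parse_blast6_line(line):
    """Parse one tab-separated BLAST 6 line into a Blast6Hit."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 fields, got {len(fields)}")
    return Blast6Hit(
        fields[0],
        fields[1],
        float(fields[2]),
        *(int(f) for f in fields[3:10]),  # lengths, counts and coordinates
        float(fields[10]),
        float(fields[11]),
    )


def read_blast6(path):
    """Yield a Blast6Hit for each non-empty line of a BLAST 6 file."""
    with open(path) as handle:
        for line in handle:
            if line.strip():
                yield parse_blast6_line(line)
```

A file written with -blast6out could then be filtered with, for example, `[h for h in read_blast6("hits.b6") if h.pct_id >= 97.0]`, where the file name and threshold are placeholders.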

E-values are calculated for the output files, which requires that the database size is known. If a database file is given with the -db option, its size is used to calculate E-values. Otherwise, the -ka_dbsize option can be used to specify the database size in letters.
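To show why the database size is needed, the sketch below evaluates the standard Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m is the query length, n is the database size in letters and S is the raw alignment score. The exact formula and parameters used by sam_filter are not stated here, so the function name and the values of lambda and K in the usage note are illustrative assumptions only.

```python
import math


def karlin_altschul_evalue(raw_score, query_len, db_letters, lam, k):
    """Expected number of chance hits scoring at least raw_score.

    lam and k are the Karlin-Altschul parameters for the scoring scheme;
    they must be supplied by the caller (values here are not from USEARCH).
    """
    return k * query_len * db_letters * math.exp(-lam * raw_score)
```

Because the E-value is proportional to the database size, `karlin_altschul_evalue(50, 100, 2_000_000, lam=1.0, k=0.5)` is twice `karlin_altschul_evalue(50, 100, 1_000_000, lam=1.0, k=0.5)`: the same alignment is less significant when found in a larger database.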


usearch -sam_filter hits.sam -db genome.fa -alnout hits.aln