sintax command

USEARCH v11


See also
  • SINTAX reference data downloads
  • Which taxonomy database should I use?
  • SINTAX algorithm
  • makeudb_sintax command
  • Can SINTAX predict species?
  • sintax_summary command
  • Cross-validation by identity
  • Taxonomy confidence measures

The sintax command uses the SINTAX algorithm to predict taxonomy for query sequences in FASTA or FASTQ format.

You can use the sintax_summary command to get a tabbed text file for making figures.

The search database must have taxonomy annotations. It can be given as a FASTA file, or the makeudb_sintax command can be used to create a UDB database, which is faster to load. See the SINTAX downloads page for available reference files in FASTA format. See also: Which taxonomy database should I use?
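Before building a database, it can help to confirm that every reference sequence carries a taxonomy annotation. The sketch below is a minimal check, not part of USEARCH. It assumes annotations appear in the FASTA label in the form ;tax=d:Bacteria,p:Firmicutes,...; (this label layout is an assumption not stated on this page), and the function name labels_missing_tax is hypothetical.

```python
import re
from typing import Iterable, List

# Assumed annotation shape inside a FASTA label: ";tax=d:Bacteria,p:Firmicutes,...;"
# This layout is an assumption made for illustration.
TAX_PATTERN = re.compile(r";tax=([^;]+);?")


def labels_missing_tax(fasta_lines: Iterable[str]) -> List[str]:
    """Return the labels of FASTA records that have no ;tax= annotation."""
    missing = []
    for line in fasta_lines:
        if line.startswith(">"):
            label = line[1:].strip()
            if TAX_PATTERN.search(label) is None:
                missing.append(label)
    return missing
```

Passing an open file handle for the reference FASTA file works, since the function only iterates over lines; an empty result means every record matched the assumed annotation pattern.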

Taxonomy predictions with bootstrap confidence values are written to the file named by the -tabbedout option. The first three fields are (1) the query sequence label, (2) the prediction with bootstrap values, and (3) the strand. If the -sintax_cutoff option is given, the prediction is written a second time after applying the confidence threshold, keeping only the ranks with high enough confidence. On V4 reads, a cutoff of 0.8 gives predictions with accuracy similar to the RDP Classifier at an 80% bootstrap cutoff.
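For downstream analysis, the first three fields can be read with a few lines of Python. The sketch below is illustrative only. It assumes field 2 is a comma-separated list of rank:name(confidence) items, such as d:Bacteria(1.00),p:Firmicutes(0.95), and that the cutoff keeps ranks whose confidence is greater than or equal to the threshold; both are assumptions about the output format rather than statements from this page. The names SintaxRecord, parse_line and apply_cutoff are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class SintaxRecord(NamedTuple):
    label: str                      # field 1: query sequence label
    ranks: List[Tuple[str, float]]  # field 2: (rank:name, bootstrap confidence) pairs
    strand: str                     # field 3: strand


def parse_prediction(field: str) -> List[Tuple[str, float]]:
    """Split 'd:Bacteria(1.00),p:Firmicutes(0.95)' into (name, confidence) pairs."""
    ranks = []
    for item in field.split(","):
        # The confidence is taken from the last parenthesized group of the item.
        name, sep, conf = item.rpartition("(")
        if not sep:
            continue  # skip empty items or items without a confidence value
        ranks.append((name, float(conf.rstrip(")"))))
    return ranks


def parse_line(line: str) -> SintaxRecord:
    """Parse one tab-separated line; missing fields are treated as empty."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (3 - len(fields))
    return SintaxRecord(fields[0], parse_prediction(fields[1]), fields[2])


def apply_cutoff(ranks: List[Tuple[str, float]], cutoff: float) -> List[str]:
    """Keep only the ranks whose confidence is at least `cutoff` (assumed >=)."""
    return [name for name, conf in ranks if conf >= cutoff]
```

With a made-up line such as "Otu1<TAB>d:Bacteria(1.00),p:Firmicutes(0.95),g:Bacillus(0.62)<TAB>+", parse_line returns the label Otu1 with three ranks, and apply_cutoff(record.ranks, 0.8) keeps d:Bacteria and p:Firmicutes while dropping the genus.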

The -strand option must be specified.

Multithreading is supported.

Example

usearch -sintax reads.fastq -db 16s.udb -tabbedout reads.sintax \
  -strand both -sintax_cutoff 0.8
 


References (please cite)

R.C. Edgar (2016), SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences, bioRxiv, https://doi.org/10.1101/074161
  • SINTAX taxonomy prediction algorithm
  • Fast and simple method, accuracy comparable to RDP Classifier

R.C. Edgar (2018), Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences, PeerJ 6:e4652
  • Cross-validation by identity, novel benchmark strategy enabling realistic accuracy estimates
  • Genus accuracy of best methods is 50% on V4 sequences
  • Recent algorithms do not improve on RDP Classifier or SINTAX

R.C. Edgar (2018), Taxonomy annotation and guide tree errors in 16S rRNA databases, PeerJ 6:e5030
  • Approx. one in five SILVA and Greengenes taxonomy annotations are wrong
  • SILVA and Greengenes trees have pervasive conflicts with type strain taxonomies