SINTAX algorithm

See also
   Should I use UTAX or SINTAX? Which database?

The SINTAX algorithm predicts the taxonomy of marker gene reads such as 16S or ITS. It is implemented in the sintax command. See the SINTAX paper for details.

Bootstrap confidence values are provided for all predicted ranks.

The algorithm is similar to the RDP Naive Bayesian Classifier, except that the top taxonomy is identified by k-mer similarity rather than by Bayesian posteriors, so no training step is needed. Also, SINTAX does not require the lowest ("training") rank to be specified for every reference sequence, which allows large databases such as SILVA or Greengenes to be used as a reference.
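The idea can be illustrated with a minimal Python sketch. It is not the sintax implementation: the function names, the word length k=8, the subsample size of 32 words, the 100 bootstrap iterations and the tie-breaking rule (first reference wins) are all assumptions chosen for illustration. In each iteration a random subsample of the query's k-mers is compared against every reference, the taxonomy of the reference sharing the most words is recorded, and the confidence for each rank is the fraction of iterations in which the winning name appeared.

```python
import random
from collections import Counter


def kmers(seq, k=8):
    """Return the set of distinct k-mers (words) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(query, references, k=8, subsample=32, iterations=100, seed=0):
    """Predict a lineage for query with per-rank bootstrap confidences.

    references is a list of (sequence, lineage) pairs; a lineage is a tuple
    of names from the highest rank downwards and may stop at any rank.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    ref_words = [(kmers(seq, k), lineage) for seq, lineage in references]
    query_words = sorted(kmers(query, k))
    if not query_words or not ref_words:
        return []

    max_depth = max(len(lineage) for _, lineage in ref_words)
    votes = [Counter() for _ in range(max_depth)]

    for _ in range(iterations):
        # Subsample the query words with replacement.
        sample = rng.choices(query_words, k=subsample)
        # Top hit: the reference sharing the most sampled words.
        # max() keeps the first reference on ties (an assumption).
        _, lineage = max(ref_words, key=lambda r: sum(w in r[0] for w in sample))
        for depth, name in enumerate(lineage):
            votes[depth][name] += 1

    prediction = []
    for counter in votes:
        if not counter:
            break
        name, count = counter.most_common(1)[0]
        prediction.append((name, count / iterations))
    return prediction


# Toy reference set; the second entry is annotated only to phylum level.
REFS = [
    ("ACGTTGCAAGGCTTAACCGGATCGATCGTTAGC", ("Bacteria", "Firmicutes", "Bacillus")),
    ("TTTTGGGGCCCCAAAATTGGCCAATGCATGCAT", ("Bacteria", "Proteobacteria")),
]

if __name__ == "__main__":
    for name, conf in classify(REFS[0][0], REFS):
        print(f"{name}\t{conf:.2f}")
```

Because nothing is trained, references annotated to different depths can be mixed freely in the toy database: a reference that wins an iteration simply contributes votes only at the ranks it specifies.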

On short tags such as V4, SINTAX has similar accuracy to RDP. On full-length 16S and ITS sequences, SINTAX has a lower rate of over-classification errors and will thus have a lower overall error rate on typical data.