USEARCH v11

NBC calculation of genus-specific conditional probability

See also
  RDP Naive Bayesian Classifier algorithm

As a starting point, the simplest estimate is the frequency observed in the training set, m(w_i)/M. It makes sense to add pseudo-counts to account for words that are not observed in a genus, but I don't understand why they chose specifically to add P_i in the numerator and 1 in the denominator.
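To make the comparison concrete, here is a minimal sketch of both estimates. It assumes the formulas as given in Wang et al. (2007): the word prior P_i = (n(w_i) + 0.5)/(N + 1), where n(w_i) is the number of training sequences containing word w_i and N is the total number of training sequences, and the genus-specific estimate (m(w_i) + P_i)/(M + 1), where m(w_i) is the number of sequences in the genus containing w_i and M is the number of sequences in the genus. The function names are hypothetical and are not part of USEARCH or the RDP Classifier.

```python
def word_prior(n_wi: int, N: int) -> float:
    """Prior P_i for word w_i over the whole training set (assumed form)."""
    return (n_wi + 0.5) / (N + 1)


def raw_frequency(m_wi: int, M: int) -> float:
    """Simplest estimate: observed frequency of w_i within the genus."""
    return m_wi / M


def genus_word_prob(m_wi: int, M: int, n_wi: int, N: int) -> float:
    """Genus-specific conditional probability with pseudo-counts:
    P_i is added to the numerator and 1 to the denominator."""
    return (m_wi + word_prior(n_wi, N)) / (M + 1)


if __name__ == "__main__":
    # Illustrative numbers only: a genus with 10 sequences, a training
    # set of 100 sequences, and a word found in 50 of them.
    M, N, n_wi = 10, 100, 50
    for m_wi in (0, 5, 10):
        print(m_wi, raw_frequency(m_wi, M), genus_word_prob(m_wi, M, n_wi, N))
```

Because P_i lies strictly between 0 and 1, the smoothed estimate can never be exactly 0 or 1, whereas the raw frequency is 0 for any word missing from the genus. One possible reading, which is not a rationale stated in the reference, is that the formula behaves as if one extra sequence had been added to the genus and counted as containing w_i with weight P_i.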


 

Reference
Wang, Q. et al. (2007) Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. AEM 73, 5261-7.