The formula for naive Bayes is

P(A | B1, B2, ..., Bn) =

[ P(A) * P(B1 | A) * P(B2 | A) * ... * P(Bn | A) ] / [ P(B1) * P(B2) * ... * P(Bn) ]

I am working on a project to classify email as spam or not spam, and I have a large data set.
I am using the NLTK package in Python.
My question is: how do I find the probabilities on the right-hand side of the expression?
And after that, how should I set my threshold value?
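
To make the question concrete, here is a minimal sketch of one common way the right-hand side could be estimated: by counting over labelled training emails. It uses only the standard library rather than NLTK. The function names (train, posterior), the toy data, the add-one smoothing and the 0.5 threshold are all assumptions made for illustration. The denominator is the same for every class, so the sketch normalises the two class scores instead of computing it directly.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (words, label) pairs. Returns the counts the estimates need."""
    label_counts = Counter()   # emails per label, used for P(A)
    word_counts = {}           # label -> Counter of words, used for P(Bi | A)
    vocab = set()
    for words, label in docs:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def posterior(words, model, target="spam"):
    """Estimate P(target | words) under the naive Bayes assumption."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label in label_counts:
        # P(A): fraction of training emails carrying this label
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen in training
            # P(Bi | A) with add-one (Laplace) smoothing to avoid zeros
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise across labels; this plays the role of the denominator
    m = max(log_scores.values())
    exp_scores = {label: math.exp(s - m) for label, s in log_scores.items()}
    return exp_scores[target] / sum(exp_scores.values())

# Hypothetical toy data, only to show the mechanics
train_docs = [
    (["win", "money", "now"], "spam"),
    (["cheap", "money", "offer"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["project", "meeting", "notes"], "ham"),
]
model = train(train_docs)

THRESHOLD = 0.5  # assumed starting point, not a recommended value
p_spam = posterior(["win", "money"], model)
is_spam = p_spam >= THRESHOLD
```

As for the threshold, 0.5 is only the natural starting point when both kinds of error cost the same. If flagging a legitimate email as spam is worse than letting some spam through, a higher value is usually chosen by checking error rates on a held-out part of the data set.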

All 2 Replies

Thanks for the link. I have read it.
Now I would like to know: is there any advantage to coding naive Bayes in Python rather than in another language such as C++?
