
Bayes' theorem:

P(A|B) = P(B|A) P(A) / P(B)

where A and B are events and P(B) ≠ 0.

  • P(A) and P(B) are the probabilities of observing A and B without regard to each other.
  • P(A | B), a conditional probability, is the probability of observing event A given that B is true.
  • P(B | A) is the probability of observing event B given that A is true.

To understand it, see the graph:

https://upload.wikimedia.org/wikipedia/commons/6/61/Bayes_theorem_tree_diagrams.svg

 

So: P(A AND B) = P(A|B) P(B) = P(B|A) P(A)

From the graph, it is easy to see that P(B|A) = P(A AND B) / P(A)

==> P(B|A) = P(A|B) P(B) / P(A)
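As a quick sanity check, the identities above can be verified with a small numeric example. This is only a sketch: the counts below are made up for illustration and do not come from any real data set.

```python
from fractions import Fraction

# Hypothetical counts over 100 observations (made up for illustration).
n_total = 100
n_a = 40        # observations where A is true
n_b = 25        # observations where B is true
n_a_and_b = 10  # observations where both A and B are true

p_a = Fraction(n_a, n_total)
p_b = Fraction(n_b, n_total)
p_a_and_b = Fraction(n_a_and_b, n_total)

# Conditional probabilities read straight off the counts.
p_a_given_b = Fraction(n_a_and_b, n_b)
p_b_given_a = Fraction(n_a_and_b, n_a)

# Both factorizations give the same joint probability P(A AND B).
assert p_a_given_b * p_b == p_b_given_a * p_a == p_a_and_b

# Bayes' theorem recovers P(B|A) from P(A|B), P(B) and P(A).
bayes_p_b_given_a = p_a_given_b * p_b / p_a
assert bayes_p_b_given_a == p_b_given_a
print(bayes_p_b_given_a)  # 1/4
```

Exact fractions are used instead of floats so that the equalities hold without rounding error.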

 

How to use Bayes as a classifier:

Bayes' theorem fits well as a classifier. The challenge is how to apply it to real-life problems.

The following walk-through of using a Bayes classifier as an email spam filter may shed some light on this.

The spam classification problem is:

Some emails are labeled as normal and some are labeled as spam.

Now, when a new email arrives, what is the probability that it is spam?

This could be expressed as:  P( spam | this email )

According to Bayes' theorem:  P( spam | this email ) = P( this email | spam ) P( spam ) / P( this email ), where P( spam ) is the fraction of our emails that are labeled as spam.

The key question is: what is P( this email ), and how do we calculate it?

It is the probability of this email appearing among all of our emails; the counting space is all of our emails.

We know an email is composed of words, so P( this email ) is related to P( word_i ), the probability that word_i appears among all the words in our emails.

Our instinct tells us: P( this email ) = P( word_1, word_2, …, word_n ) ~= P( word_1 ) * P( word_2 ) * … * P( word_n ), which treats the words as independent of each other.

Thus we have changed our counting space from abstract emails to the word space (that is the key!), and now we know how to compute those probabilities.

P( word_i ) = ( number of occurrences of word_i in our training set ) / ( number of all the words in our training set )
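The word-frequency estimate can be sketched in a few lines of Python. The function name `word_probabilities`, the whitespace tokenization, the lowercasing and the three training emails are all assumptions made for illustration; a real filter would likely need more careful tokenization.

```python
from collections import Counter

def word_probabilities(emails):
    """Estimate P(word_i) as count(word_i) / total number of words."""
    # Naive tokenization (an assumption): lowercase, split on whitespace.
    counts = Counter(word for email in emails for word in email.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

# Made-up training emails, for illustration only.
training_emails = ["win money now", "meeting at noon", "win a prize now"]
p_word = word_probabilities(training_emails)
print(p_word["win"])  # "win" is 2 of the 10 words -> 0.2
```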

Applying the same independence assumption within the spam emails:

P( this email | spam ) = P( word_1 | spam ) * … * P( word_n | spam )

where

P( word_i | spam ) = ( number of occurrences of word_i in the spam emails ) / ( number of all the words in the spam emails )

At this point any decent programmer should be able to code this up.
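One possible way to code this up is sketched below. It is not the only way, and it makes a few assumptions that go beyond the text: words are split on whitespace, add-one smoothing is used so that an unseen word does not force a probability to zero, the Bayes prior P( spam ) is estimated as the fraction of training emails labeled spam, and instead of estimating P( this email ) from overall word frequencies, the two class scores are normalized so that they sum to 1 (a common alternative). It also assumes both classes appear in the training data. The names `train` and `p_spam_given_email` and the training emails are hypothetical.

```python
import math
from collections import Counter

def train(emails, labels):
    """Count words per class; each label is "spam" or "normal"."""
    word_counts = {"spam": Counter(), "normal": Counter()}
    email_counts = Counter(labels)
    for text, label in zip(emails, labels):
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["normal"])
    return word_counts, email_counts, vocabulary

def p_spam_given_email(text, model):
    """Return an estimate of P(spam | this email)."""
    word_counts, email_counts, vocabulary = model
    total_emails = sum(email_counts.values())
    log_scores = {}
    for label in ("spam", "normal"):
        total_words = sum(word_counts[label].values())
        # Start from the log prior, log P(label).
        score = math.log(email_counts[label] / total_emails)
        for word in text.lower().split():
            # P(word | label) with add-one smoothing (an assumption).
            p = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            # Summing logs avoids underflow from multiplying tiny numbers.
            score += math.log(p)
        log_scores[label] = score
    # Normalize over the two classes instead of computing P(this email).
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["spam"] / sum(weights.values())

# Made-up training data, for illustration only.
emails = ["win money now", "cheap prize win now",
          "meeting at noon", "lunch at noon tomorrow"]
labels = ["spam", "spam", "normal", "normal"]
model = train(emails, labels)

p = p_spam_given_email("win a prize", model)
print(round(p, 3))  # 0.857
```

With this toy data, an email such as "meeting at noon" gets a spam probability well below 0.5, so a simple decision rule is to flag an email as spam when the returned value exceeds 0.5.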

 

 

 

 

References:

https://en.wikipedia.org/wiki/Bayes%27_theorem
