A decision tree works much like a chain of nested if statements in a programming language.
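As a small illustration of that comparison, here is a hand-written tree expressed as nested if statements. The weather features (outlook, humidity, windy) and the yes/no labels are made up for this sketch and are not from any real dataset.

```python
def classify_weather(outlook, humidity, windy):
    """A hand-written decision tree: every `if` is one internal node,
    every `return` is one leaf (the predicted label)."""
    if outlook == "sunny":
        # Second-level decision under the "sunny" branch.
        if humidity == "high":
            return "no"
        return "yes"
    if outlook == "rainy":
        # Second-level decision under the "rainy" branch.
        if windy:
            return "no"
        return "yes"
    # Any other outlook (for example "overcast") goes straight to a leaf.
    return "yes"


print(classify_weather("sunny", "high", False))
print(classify_weather("rainy", "normal", False))
```

Decision tree learning is about producing this kind of nested structure automatically from data instead of writing it by hand.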

In AI/ML, the problem is usually stated like this:

Given a training set of feature vectors [(f1, f2, …), …] with known categories/labels [c1, …], how can we learn from this data and build a decision tree so that, for any new data point, we can predict which category/label it belongs to?

In plain English:

We try splitting the dataset on each feature to see which one works best at the top level, then recursively split each resulting subset the same way.

But how should we choose which feature to split on (that is, which feature to use as the decision condition)?

Information theory (entropy) helps here: after a good split, each subset should be more ordered (purer in its labels) than the original set, so the entropy decreases. The feature whose split lowers the entropy the most is the one to choose as the decision point.
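To make the entropy idea concrete, here is a minimal sketch that computes the Shannon entropy of a list of labels and the information gain (entropy before the split minus the weighted entropy after it) for one feature. The function names and the tiny four-row dataset are assumptions made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Entropy before the split minus the size-weighted entropy after it."""
    total = len(labels)
    groups = {}
    # Group the labels by the value each row has for the chosen feature.
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data: each row is (outlook, humidity); the label is "play or not".
rows = [("sunny", "high"), ("sunny", "normal"), ("rainy", "high"), ("rainy", "normal")]
labels = ["no", "yes", "no", "yes"]

print(entropy(labels))                    # two evenly mixed classes
print(information_gain(rows, labels, 0))  # splitting on outlook
print(information_gain(rows, labels, 1))  # splitting on humidity
```

In this toy data, splitting on humidity separates the labels perfectly while splitting on outlook leaves both subsets as mixed as before, so humidity would be picked as the first decision.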

Quoted from https://en.wikipedia.org/wiki/ID3_algorithm:

  1. Calculate the entropy of every attribute using the data set S
  2. Split the set S into subsets using the attribute for which entropy is minimum (or, equivalently, information gain is maximum)
  3. Make a decision tree node containing that attribute
  4. Recurse on subsets using remaining attributes.
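The four steps above can be sketched in a few dozen lines of Python. This is a simplified illustration under stated assumptions, not a reference implementation: rows are plain dicts of categorical features, the tree is a nested dict, the names (split_by, weighted_entropy, id3, predict) are invented for this example, and unseen feature values fall back to the majority label at that node.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_by(rows, labels, feature):
    """Map each value of `feature` to the (rows, labels) that have it."""
    groups = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[feature], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return groups


def weighted_entropy(rows, labels, feature):
    """Step 1: entropy remaining after splitting on `feature`."""
    groups = split_by(rows, labels, feature)
    return sum(len(sub) / len(labels) * entropy(sub) for _, sub in groups.values())


def id3(rows, labels, features):
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the subset is pure or no features are left to split on.
    if len(set(labels)) == 1 or not features:
        return majority
    # Step 2: minimum remaining entropy is the same as maximum information gain.
    best = min(features, key=lambda f: weighted_entropy(rows, labels, f))
    remaining = [f for f in features if f != best]
    # Steps 3 and 4: make a node for `best`, then recurse on each subset.
    children = {
        value: id3(sub_rows, sub_labels, remaining)
        for value, (sub_rows, sub_labels) in split_by(rows, labels, best).items()
    }
    return {"feature": best, "default": majority, "children": children}


def predict(tree, row):
    """Walk down the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree


rows = [
    {"outlook": "sunny", "humidity": "high"},
    {"outlook": "sunny", "humidity": "normal"},
    {"outlook": "rainy", "humidity": "high"},
    {"outlook": "rainy", "humidity": "normal"},
]
labels = ["no", "yes", "no", "yes"]

tree = id3(rows, labels, ["outlook", "humidity"])
print(tree["feature"])
print(predict(tree, {"outlook": "rainy", "humidity": "high"}))
```

On this toy data the learned tree splits on humidity first, because that split leaves no entropy in either subset. Real implementations also handle numeric features, pruning, and ties between features, all of which this sketch leaves out.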

 

References:

https://en.wikipedia.org/wiki/Decision_tree_learning

https://en.wikipedia.org/wiki/ID3_algorithm
