Splitting Criteria in Decision Trees


So far, we have seen that a decision tree makes decisions by asking questions at each node.

But an important question arises:
Which question should be asked first?
In other words, how does the decision tree decide which attribute to split the data on?

To answer this, decision trees use certain splitting criteria that help select the best attribute at each node.

What is Splitting?
Splitting is the process of dividing data at a node into smaller subsets based on a feature.


The goal of splitting is to make the child nodes more pure (i.e., the data in each child node belongs mostly to one class).
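
To make this concrete, here is a minimal sketch of what a single split looks like. The feature name "Weather", the label "Play", and the rows are made-up toy data used only for illustration.

```python
# Toy dataset (hypothetical values, for illustration only).
rows = [
    {"Weather": "Sunny", "Play": "No"},
    {"Weather": "Sunny", "Play": "No"},
    {"Weather": "Rainy", "Play": "Yes"},
    {"Weather": "Rainy", "Play": "Yes"},
    {"Weather": "Sunny", "Play": "Yes"},
]

def split(rows, feature):
    """Divide the rows at a node into subsets, one per value of the chosen feature."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[feature], []).append(row)
    return subsets

# Each child node should ideally end up holding mostly one class.
for value, subset in split(rows, "Weather").items():
    print(value, [r["Play"] for r in subset])
# Sunny ['No', 'No', 'Yes']
# Rainy ['Yes', 'Yes']
```

Here the "Rainy" child is already pure, while the "Sunny" child is still mixed; the splitting criteria below are simply different ways of scoring how much a candidate split improves this purity.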

Different decision tree algorithms use different splitting criteria to choose the best feature.

The commonly used splitting criteria are listed below (a small sketch of each follows the list):

  • Entropy and Information Gain (ID3)
  • Gini Index (CART)
  • Gain Ratio (C4.5)
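
The following sketch shows how each of these measures can be computed for one candidate split. The class counts (9 "Yes" vs. 5 "No" in the parent, split into two child subsets) are made-up example numbers, not taken from any particular dataset.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum(p_i * log2(p_i)) over the classes i."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def gini(labels):
    """Gini(S) = 1 - sum(p_i^2) over the classes i."""
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """IG = H(parent) - weighted average of H(child) over the child subsets."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

def gain_ratio(parent, children):
    """Gain ratio = information gain / split information (penalises many-valued splits)."""
    n = len(parent)
    split_info = -sum(len(c) / n * log2(len(c) / n) for c in children if c)
    return information_gain(parent, children) / split_info if split_info else 0.0

parent = ["Yes"] * 9 + ["No"] * 5               # 9 positive, 5 negative examples
children = [["Yes"] * 6 + ["No"] * 2,           # subset for one value of the feature
            ["Yes"] * 3 + ["No"] * 3]           # subset for the other value

print(round(entropy(parent), 3))                    # 0.94
print(round(gini(parent), 3))                       # 0.459
print(round(information_gain(parent, children), 3)) # 0.048
print(round(gain_ratio(parent, children), 3))       # 0.049
```

ID3 picks the feature with the highest information gain, CART picks the split with the lowest weighted Gini index, and C4.5 uses the gain ratio to avoid favouring features with many distinct values.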