Numerical example of a decision tree using ID3
Suppose we have a dataset of 14 samples (9 Yes / 5 No) and 2 candidate features, Feature 1 (F1) and Feature 2 (F2). ID3 evaluates the split produced by each feature and keeps the one with the higher information gain.

Feature 1 (F1)

Explanation
Root Node (S):
9 Yes / 5 No → Total = 14
Child C1:
6 Yes / 2 No → Total = 8 (more pure node)
Child C2:
3 Yes / 3 No → Total = 6 (highly impure node)
Step 1: Entropy of Root Node (9Y / 5N)
Entropy formula:

$E(S) = -p_{+}\log_2(p_{+}) - p_{-}\log_2(p_{-})$

$E(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.940$

Root node is impure (entropy close to the maximum of 1).
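This value is easy to verify with a few lines of Python. The sketch below is illustrative: the helper name `entropy` and its (yes, no) count arguments are my own choices, not part of the ID3 definition.

```python
from math import log2

def entropy(yes, no):
    """Binary entropy of a node, given its Yes/No class counts."""
    total = yes + no
    e = 0.0
    for count in (yes, no):
        if count:                       # treat 0 * log2(0) as 0
            p = count / total
            e -= p * log2(p)
    return e

print(round(entropy(9, 5), 3))          # root node (9Y / 5N) -> 0.94
```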
Step 2: Entropy of Child Node C1 (6Y / 2N)

$E(C_1) = -\frac{6}{8}\log_2\frac{6}{8} - \frac{2}{8}\log_2\frac{2}{8} \approx 0.811$

More pure node (lower entropy than the root).
Step 3: Entropy of Child Node C2 (3Y / 3N)

$E(C_2) = -\frac{3}{6}\log_2\frac{3}{6} - \frac{3}{6}\log_2\frac{3}{6} = 1.0$

Highly impure node (maximum entropy).
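Reusing the `entropy` helper sketched above, the two child nodes check out the same way:

```python
print(round(entropy(6, 2), 3))          # C1 (6Y / 2N) -> 0.811
print(round(entropy(3, 3), 3))          # C2 (3Y / 3N) -> 1.0 (maximum impurity)
```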
Information Gain

$IG(S, A) = E(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, E(S_v)$

Where:
- $S$ → Entire dataset (parent node)
- $A$ → Attribute / feature used for splitting
- $Values(A)$ → All possible values of attribute $A$
- $S_v$ → Subset of $S$ where attribute $A$ takes value $v$
- $|S|$ → Total number of samples
- $|S_v|$ → Number of samples in subset $S_v$
- $E(S)$ → Entropy of parent node
- $E(S_v)$ → Entropy of child node $S_v$
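The formula maps almost directly onto code. Below is a minimal sketch, assuming each node is described by its (Yes, No) counts and reusing the `entropy` helper from the earlier sketch; the name `information_gain` is an illustrative choice, not a library API.

```python
def information_gain(parent, children):
    """IG(S, A) = E(S) - sum over v of |S_v|/|S| * E(S_v).

    parent:   (yes, no) counts of the parent node S
    children: list of (yes, no) counts, one tuple per child subset S_v
    """
    n = sum(parent)
    weighted = sum((sum(child) / n) * entropy(*child) for child in children)
    return entropy(*parent) - weighted
```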
Step 4: Weighted Entropy After Split

$E_{weighted} = \frac{8}{14}(0.811) + \frac{6}{14}(1.0) \approx 0.892$
Step 5: Information Gain for Feature 1

$IG(S, F_1) = E(S) - E_{weighted} = 0.940 - 0.892 \approx 0.048$
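Feature 1's counts plugged into the `information_gain` sketch above reproduce Steps 4 and 5:

```python
print(round(information_gain((9, 5), [(6, 2), (3, 3)]), 3))   # -> 0.048
```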
Feature 2 (F2)
Root (S): 9Y / 5N → Total = 14
Child C1: 5Y / 1N → Total = 6
Child C2: 4Y / 4N → Total = 8
Step 1: Entropy of Root Node (9Y / 5N)

$E(S) \approx 0.940$ (the root node is the same 9Y / 5N set as before)
Step 2: Entropy of Child C1 (5Y / 1N)

$E(C_1) = -\frac{5}{6}\log_2\frac{5}{6} - \frac{1}{6}\log_2\frac{1}{6} \approx 0.650$
Step 3: Entropy of Child C2 (4Y / 4N)

$E(C_2) = -\frac{4}{8}\log_2\frac{4}{8} - \frac{4}{8}\log_2\frac{4}{8} = 1.0$ (maximum entropy)
Step 4: Weighted Entropy After Split

$E_{weighted} = \frac{6}{14}(0.650) + \frac{8}{14}(1.0) \approx 0.850$
Step 5: Information Gain for Feature 2

$IG(S, F_2) = 0.940 - 0.850 \approx 0.090$
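The same `information_gain` sketch gives Feature 2's gain from its counts:

```python
print(round(information_gain((9, 5), [(5, 1), (4, 4)]), 3))   # -> 0.09
```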
Summary
Feature 1
- Root: 9Y / 5N
- Children:
  - C1 → 6Y / 2N
  - C2 → 3Y / 3N

Feature 2
- Root: 9Y / 5N
- Children:
  - C1 → 5Y / 1N
  - C2 → 4Y / 4N
Comparison Table

| Feature   | Information Gain |
|-----------|------------------|
| Feature 1 | ≈ 0.05           |
| Feature 2 | ≈ 0.09           |
Final Conclusion (ID3 Decision)
The ID3 algorithm always selects the feature with the highest information gain as the root node.
Since:

$IG(S, F_2) \approx 0.09 > IG(S, F_1) \approx 0.05$

Feature 2 is selected as the ROOT FEATURE.
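To tie the whole example together, here is a self-contained sketch of the comparison; the hard-coded counts and variable names simply mirror this worked example and are not part of any library API.

```python
from math import log2

def entropy(yes, no):
    """Binary entropy of a node from its Yes/No class counts."""
    total = yes + no
    return -sum((c / total) * log2(c / total) for c in (yes, no) if c)

def information_gain(parent, children):
    """E(parent) minus the size-weighted entropy of the child subsets."""
    n = sum(parent)
    weighted = sum((sum(c) / n) * entropy(*c) for c in children)
    return entropy(*parent) - weighted

root = (9, 5)                               # 9 Yes / 5 No at the root
splits = {
    "Feature 1": [(6, 2), (3, 3)],          # C1, C2 under Feature 1
    "Feature 2": [(5, 1), (4, 4)],          # C1, C2 under Feature 2
}

gains = {name: information_gain(root, children) for name, children in splits.items()}
for name, gain in gains.items():
    print(f"{name}: IG = {gain:.3f}")       # Feature 1: 0.048, Feature 2: 0.090
print("ID3 root feature:", max(gains, key=gains.get))   # -> Feature 2
```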
