1️⃣ What is Gini Index?


The Gini Index (or Gini Impurity) is a measure used in Decision Trees (CART algorithm) to
decide how to split the data at each node.
It tells how impure (mixed) the classes in a node are.
  • Lower Gini value → purer node (0 means every sample in the node belongs to one class)
  • Higher Gini value → more impure node

2️⃣ Where is Gini Index Used?


  • Used in CART (Classification and Regression Trees)
  • Mainly for classification problems
  • CART always creates binary splits

3️⃣ Gini Index Formula


General Formula

$$\text{Gini} = 1 - \sum_{i=1}^{k} p_i^2$$

Where:

  • $p_i$ = probability of class $i$ in the node
  • $k$ = number of classes
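
As a rough illustration of the general formula (one minus the sum of squared class probabilities), here is a minimal Python sketch. The function name `gini_impurity` and the sample labels are made up for this example, and treating an empty node as pure is an assumption, not part of the definition.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i^2) over the classes present in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty node is treated as pure
    counts = Counter(labels)  # class -> number of samples in the node
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


pure = ["Yes"] * 4                 # only one class in the node
mixed = ["Yes", "Yes", "No", "No"]  # two classes, evenly mixed

print(gini_impurity(pure))   # 0.0
print(gini_impurity(mixed))  # 0.5
```

The pure node scores 0, while the evenly mixed two-class node scores 0.5, which is the largest value two classes can give.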

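Binary Classification Formula

$$\text{Gini} = 1 - \left(p_{\text{Yes}}^2 + p_{\text{No}}^2\right)$$

Where:

  • $p_{\text{Yes}}$ = probability of Yes
  • $p_{\text{No}}$ = probability of No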
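
For the two-class case, the Gini value depends only on the probability of Yes, since the probability of No is one minus that. The sketch below, with the hypothetical helper name `gini_binary`, evaluates one minus the sum of the two squared probabilities at a few points to show how the value changes.

```python
def gini_binary(p_yes):
    """Gini for a two-class node, given the probability of Yes."""
    if not 0.0 <= p_yes <= 1.0:
        raise ValueError("p_yes must be between 0 and 1")
    p_no = 1.0 - p_yes  # the two probabilities sum to 1
    return 1.0 - (p_yes ** 2 + p_no ** 2)


# Evaluate the formula from a pure "No" node to a pure "Yes" node.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, gini_binary(p))
```

The value is 0 at both ends (a pure node) and peaks at 0.5 when Yes and No are equally likely.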

4️⃣ Gini Index of a Split (Very Important)
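When a node is split into child nodes, the Gini Index of the split is the weighted average of the child nodes' Gini values:

$$\text{Gini}_{\text{split}}(D) = \sum_{j} \frac{n_j}{n} \, \text{Gini}(D_j)$$

Where:

  • $D$ = parent node
  • $D_j$ = child nodes
  • $n_j$ = samples in child $D_j$
  • $n$ = total samples in $D$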
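
To make the weighting concrete, the following Python sketch computes the Gini Index of a split as the sample-weighted average of each child's Gini value, then compares two candidate binary splits of the same parent. The names `gini_impurity` and `gini_split`, and the toy data, are invented for this illustration; it is not CART's full split search.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i^2) for the labels in one node."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty node is treated as pure
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_split(children):
    """Weighted Gini of a split; `children` is a list of label lists."""
    total = sum(len(child) for child in children)
    if total == 0:
        return 0.0
    # Each child contributes its Gini, weighted by its share of the samples.
    return sum(len(child) / total * gini_impurity(child) for child in children)


parent = ["Yes"] * 6 + ["No"] * 4

# Candidate A: left child is pure, right child is mixed.
split_a = [["Yes"] * 4, ["Yes"] * 2 + ["No"] * 4]
# Candidate B: both children keep the parent's 3:2 class mix.
split_b = [["Yes"] * 3 + ["No"] * 2, ["Yes"] * 3 + ["No"] * 2]

print(round(gini_impurity(parent), 3))  # 0.48
print(round(gini_split(split_a), 3))    # 0.267
print(round(gini_split(split_b), 3))    # 0.48
```

Candidate A lowers the impurity from 0.48 to about 0.267, while candidate B leaves it unchanged, so a tree using this criterion would prefer A.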