Centroid 

The centroid is the mean (average) of all data points in a cluster and serves as the single point that represents that cluster.

Mathematical Definition

If a cluster has n points:

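$$(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$$

Then the centroid is the point $(C_x, C_y)$, where:

$$C_x = \frac{x_1 + x_2 + \dots + x_n}{n}$$

$$C_y = \frac{y_1 + y_2 + \dots + y_n}{n}$$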

Example 

Cluster C1 contains points:

  • P1 (2,15)
  • P2 (3,18)
  • P3 (4,12)
  • P7 (4,16)
  • P8 (3,14)

Centroid calculation:

$$C_x = \frac{2 + 3 + 4 + 4 + 3}{5} = \frac{16}{5} = 3.2$$

$$C_y = \frac{15 + 18 + 12 + 16 + 14}{5} = \frac{75}{5} = 15$$

 Centroid = (3.2, 15)
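The same calculation can be checked with a few lines of Python. This is a minimal sketch for 2-D points only; the function name `centroid` and the list name `c1` are illustrative choices, not part of the original notes.

```python
def centroid(points):
    """Return the mean point (Cx, Cy) of a non-empty list of (x, y) pairs."""
    if not points:
        raise ValueError("a centroid needs at least one point")
    n = len(points)
    cx = sum(x for x, _ in points) / n  # average of the x-coordinates
    cy = sum(y for _, y in points) / n  # average of the y-coordinates
    return cx, cy


# Cluster C1 from the example: P1, P2, P3, P7, P8
c1 = [(2, 15), (3, 18), (4, 12), (4, 16), (3, 14)]
print(centroid(c1))  # (3.2, 15.0)
```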

Why Is the Centroid Important in K-Means?

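  1. It represents the center of its cluster.
  2. Each data point is assigned to the cluster whose centroid is nearest to it.
  3. It is recomputed after every assignment step, and the algorithm stops once the centroids no longer change.
  4. Because it is the mean, it minimizes the sum of squared distances between the cluster's points and the cluster center.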
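
The assign-then-update loop described in the list above can be sketched in plain Python. This is an illustrative sketch, not a production implementation: it assumes 2-D points, Euclidean distance, and caller-supplied starting centroids. The names `nearest`, `mean_point`, `k_means`, and `max_iter`, as well as the sample data, are made up for this example.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def mean_point(points):
    """Centroid (mean) of a non-empty list of (x, y) pairs."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def k_means(points, centroids, max_iter=100):
    """Repeat assignment and update until the centroids stop changing."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        # An empty cluster keeps its old centroid (a simplifying assumption).
        updated = [mean_point(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:  # nothing moved, so the loop has converged
            break
        centroids = updated
    return centroids, clusters


# Hypothetical data: two well-separated groups and two starting centroids.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

With this data the assignments are already stable after the first pass, so the loop stops on the second iteration with each centroid at the mean of its group.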