Distance Measures in K-Means

1. Euclidean Distance (Most Common)

  • Measures the straight-line distance between two points.
  • Works best for numerical data.
  • The default in standard K-Means, which minimizes squared Euclidean distance to the cluster centroids.

Formula (for two features):

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

Use case:
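When data is continuous and features are on a similar scale. If the scales differ, features with larger ranges dominate the distance, so standardize them first.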
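
As a minimal sketch of the formula above, the Python snippet below computes the Euclidean distance between two points and uses it to pick the closest centroid, as in the assignment step of K-Means. The function names `euclidean_distance` and `nearest_centroid` and the sample points are illustrative assumptions, not part of any particular library.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of features."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    # Square each per-feature difference, sum them, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point` (hypothetical helper for illustration)."""
    return min(
        range(len(centroids)),
        key=lambda i: euclidean_distance(point, centroids[i]),
    )


# Differences are 3 and 4, so the distance is sqrt(9 + 16).
print(euclidean_distance((1, 2), (4, 6)))  # 5.0

# The point (0, 0) is closer to (1, 1) than to (5, 5).
print(nearest_centroid((0, 0), [(5, 5), (1, 1)]))  # 1
```

The function generalizes the two-feature formula to any number of features by summing over all coordinate pairs.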

2. Manhattan Distance

  • Measures distance as the sum of absolute differences.
  • Also called city-block distance.
  • Movement is only in horizontal and vertical directions.

Formula:

$$d = |x_1 - x_2| + |y_1 - y_2|$$

Use case:
Useful when data has a grid-like structure, such as movement along city blocks.
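
The Manhattan formula can be sketched in Python as follows. The function name `manhattan_distance` and the sample points are illustrative assumptions chosen for this example.

```python
def manhattan_distance(p, q):
    """Sum of absolute per-feature differences between two points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of features")
    # Each feature contributes its absolute difference; there is no squaring or root.
    return sum(abs(a - b) for a, b in zip(p, q))


# Moving from (1, 2) to (4, 6) takes 3 steps horizontally and 4 vertically.
print(manhattan_distance((1, 2), (4, 6)))  # 7
```

For the same pair of points, the straight-line (Euclidean) distance is 5. The Manhattan distance is never smaller than the Euclidean distance because it does not allow diagonal movement.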