Overestimation and Underestimation in Heuristic in A*

Overestimation and Underestimation in Heuristic in A*

What is a Heuristic Function?

A heuristic function, h(n), gives an estimated cost from the current node n to the goal node.
It helps the A* algorithm decide which path seems better.

However, since h(n) is only an estimate, it can sometimes be:

  • Too low (underestimation)
  • Too high (overestimation)

 Actual Cost vs Estimated Cost

Let’s define:

  • a(n) → Actual cost from node n to the goal
  • e(n) → Estimated cost (heuristic) from node n to the goal

So ideally:

e(n)≤a(n)

This means your heuristic should never overestimate the actual cost.

wo Cases of Estimation

🅰️ Case 1: Overestimation

  • When e(n) > a(n)
  • The heuristic thinks the goal is farther than it really is.

 Effect:
The algorithm may skip the optimal path because it appears more expensive.
Hence, A* might not give the best path.

🅱️ Case 2: Underestimation

  • When e(n) < a(n)
  • The heuristic thinks the goal is closer than it really is.

 Effect:
A* might explore a few extra nodes but will always find the optimal path.
This type of heuristic is called Admissible.

Case 1: Overestimation

Problem Setup

We have this graph:

S —5→ X —10→ G

 \         

  5        

   \       

    Y —20→ G

Actual edge costs:

From–ToCost
S–X5
X–G10
S–Y5
Y–G20

So:

  • Path S–X–G = 15 (optimal path)
  • Path S–Y–G = 25 (longer path)

 Step 1: A* Formula

f(n)=g(n)+h(n)

g(n) = actual cost from start to node

h(n) = estimated cost from node to goal


 Step 2: Assign Values

Nodeg(n)h(n)f(n) = g + h
X55055
Y54045

 Step 3: Choose Node with Lowest f(n)

 Between X (f=55) and Y (f=45),
A* will expand Y first, because 45 < 55.


Step 4: Expand Y

From Y → G:

g(G)=g(Y)+cost(Y,G)=5+20=25

and h(G)=0
So f(G)=25+0=25

 A* finds a path S → Y → G with total cost = 25.
But that’s not optimal, because there’s a better path (S–X–G = 15).


Step 5: Why This Happened

Because h(X) = 50 was too large (overestimated),
A* thought that going through X would be much more expensive than it really is.

So, it skipped the correct path S–X–G and took the longer one (S–Y–G).


 Step 6: Understanding in Words

 Overestimation = heuristic gives a value higher than the real cost.
That makes A* think the node is farther from the goal than it truly is.

Result:

A* may choose a wrong node for expansion.

It can miss the optimal path.

The algorithm becomes non-admissible (not guaranteed to find shortest path).

Case 2: Underestimation

Now suppose:

h(X) = 5

h(Y) = 8

Actuals remain the same:

a(X) = 10

a(Y) = 20

Compute f-values again:

Nodeg(n)h(n)f(n) = g + h
X5510
Y5813
  •  Now the algorithm expands X first (f=10),
    then moves directly to G via X → G = 10 + 5 = 15 (actual cost).
  •  Meaning:
    Underestimation still gives us the correct, optimal path,
    but it might explore more nodes before reaching it.

Underestimation is GOOD (safe).
 Overestimation is BAD (risky).
But — both affect performance differently.

Let’s unpack it properly.


 What We Want from a Heuristic

A heuristic in A* is supposed to guide the search — to guess how far the goal is.
But it should never mislead the algorithm.

That’s why, in A*, the ideal heuristic must satisfy:

h(n)≤ h*(n)

where

  • h(n) = estimated cost to reach the goal,
  • h*(n) = true (actual) cost to the goal.

If this holds for all nodes, the heuristic is called admissible.

Admissible Heuristic = Underestimation or Exact Estimation.


  Underestimation vs Overestimation — What Happens

TypeConditionBehaviorResult
Underestimationh(n) < actualHeuristic is conservative (thinks goal is closer)Safe, finds optimal path but may expand extra nodes
Exacth(n) = actualPerfect estimate Best performance (fast & optimal)
Overestimationh(n) > actualHeuristic is too optimistic (thinks goal is farther) May skip optimal path, not guaranteed to find best solution

Intuitive Explanation

Think of Google Maps :

  • If it underestimates time (says 10 mins but actually 15 mins):
    → You might leave early, explore a few roads, but you’ll definitely reach correctly.
  • If it overestimates time (says 30 mins but actually 15 mins):
    → You might avoid the best route because it “looks longer,” and end up taking a worse one.

So:
 Underestimate = Safe but Slow
 Overestimate = Fast but Wrong

Properties of A*

PropertyDescriptionMeaning for Students
TypeInformed SearchUses knowledge (heuristics)
Complete YesIt will find a solution if one exists
Optimal Yes (if h is admissible)Finds the least-cost path
Time ComplexityO(b^d)May expand many nodes, depends on heuristic
Space ComplexityO(b^d)Stores all generated nodes in memory
Best Used ForPathfinding & shortest pathMaps, games, AI navigation