Overestimation and Underestimation in Heuristic in A*
What is a Heuristic Function?
A heuristic function, h(n), gives an estimated cost from the current node n to the goal node.
It helps the A* algorithm decide which path seems better.
However, since h(n) is only an estimate, it can sometimes be:
- Too low (underestimation)
- Too high (overestimation)
Actual Cost vs Estimated Cost
Let’s define:
- a(n) → Actual cost from node n to the goal
- e(n) → Estimated cost (heuristic) from node n to the goal
So ideally:
e(n)≤a(n)
This means your heuristic should never overestimate the actual cost.
wo Cases of Estimation
🅰️ Case 1: Overestimation
- When e(n) > a(n)
- The heuristic thinks the goal is farther than it really is.
Effect:
The algorithm may skip the optimal path because it appears more expensive.
Hence, A* might not give the best path.
🅱️ Case 2: Underestimation
- When e(n) < a(n)
- The heuristic thinks the goal is closer than it really is.
Effect:
A* might explore a few extra nodes but will always find the optimal path.
This type of heuristic is called Admissible.
Case 1: Overestimation
Problem Setup
We have this graph:
S —5→ X —10→ G
\
5
\
Y —20→ G
Actual edge costs:
| From–To | Cost |
| S–X | 5 |
| X–G | 10 |
| S–Y | 5 |
| Y–G | 20 |
So:
- Path S–X–G = 15 (optimal path)
- Path S–Y–G = 25 (longer path)
Step 1: A* Formula
f(n)=g(n)+h(n)
g(n) = actual cost from start to node
h(n) = estimated cost from node to goal
Step 2: Assign Values
| Node | g(n) | h(n) | f(n) = g + h |
| X | 5 | 50 | 55 |
| Y | 5 | 40 | 45 |
Step 3: Choose Node with Lowest f(n)
Between X (f=55) and Y (f=45),
A* will expand Y first, because 45 < 55.
Step 4: Expand Y
From Y → G:
g(G)=g(Y)+cost(Y,G)=5+20=25
and h(G)=0
So f(G)=25+0=25
A* finds a path S → Y → G with total cost = 25.
But that’s not optimal, because there’s a better path (S–X–G = 15).
Step 5: Why This Happened
Because h(X) = 50 was too large (overestimated),
A* thought that going through X would be much more expensive than it really is.
So, it skipped the correct path S–X–G and took the longer one (S–Y–G).
Step 6: Understanding in Words
Overestimation = heuristic gives a value higher than the real cost.
That makes A* think the node is farther from the goal than it truly is.
Result:
A* may choose a wrong node for expansion.
It can miss the optimal path.
The algorithm becomes non-admissible (not guaranteed to find shortest path).
Case 2: Underestimation
Now suppose:
h(X) = 5
h(Y) = 8
Actuals remain the same:
a(X) = 10
a(Y) = 20
Compute f-values again:
| Node | g(n) | h(n) | f(n) = g + h |
| X | 5 | 5 | 10 |
| Y | 5 | 8 | 13 |
- Now the algorithm expands X first (f=10),
then moves directly to G via X → G = 10 + 5 = 15 (actual cost). - Meaning:
Underestimation still gives us the correct, optimal path,
but it might explore more nodes before reaching it.
Underestimation is GOOD (safe).
Overestimation is BAD (risky).
But — both affect performance differently.
Let’s unpack it properly.
What We Want from a Heuristic
A heuristic in A* is supposed to guide the search — to guess how far the goal is.
But it should never mislead the algorithm.
That’s why, in A*, the ideal heuristic must satisfy:
h(n)≤ h*(n)
where
- h(n) = estimated cost to reach the goal,
- h*(n) = true (actual) cost to the goal.
If this holds for all nodes, the heuristic is called admissible.
Admissible Heuristic = Underestimation or Exact Estimation.
Underestimation vs Overestimation — What Happens
| Type | Condition | Behavior | Result |
| Underestimation | h(n) < actual | Heuristic is conservative (thinks goal is closer) | Safe, finds optimal path but may expand extra nodes |
| Exact | h(n) = actual | Perfect estimate | Best performance (fast & optimal) |
| Overestimation | h(n) > actual | Heuristic is too optimistic (thinks goal is farther) | May skip optimal path, not guaranteed to find best solution |
Intuitive Explanation
Think of Google Maps :
- If it underestimates time (says 10 mins but actually 15 mins):
→ You might leave early, explore a few roads, but you’ll definitely reach correctly. - If it overestimates time (says 30 mins but actually 15 mins):
→ You might avoid the best route because it “looks longer,” and end up taking a worse one.
So:
Underestimate = Safe but Slow
Overestimate = Fast but Wrong
Properties of A*
| Property | Description | Meaning for Students |
| Type | Informed Search | Uses knowledge (heuristics) |
| Complete | Yes | It will find a solution if one exists |
| Optimal | Yes (if h is admissible) | Finds the least-cost path |
| Time Complexity | O(b^d) | May expand many nodes, depends on heuristic |
| Space Complexity | O(b^d) | Stores all generated nodes in memory |
| Best Used For | Pathfinding & shortest path | Maps, games, AI navigation |
