1. Introduction
Floods are frequent natural disasters, commonly caused by excessive rainfall and rising river levels. Traditional forecasting methods often struggle to predict floods accurately when several interacting factors must be weighed at once.
This case study shows how an AI-based system can use historical and real-time data to predict whether a flood will occur.
2. Problem Statement
We want to build an AI system that predicts whether a flood will occur based on:
- Rainfall
- River water level
- Soil moisture
Output:
- Flood (Yes / No)
3. Example Data Collection
Assume we collected the following dataset:
| Day | Rainfall (mm) | River Level (m) | Soil Moisture | Flood |
| 1 | 120 | 8.5 | High | Yes |
| 2 | 80 | 7.0 | Medium | No |
| 3 | 150 | 9.2 | High | Yes |
| 4 | 40 | 4.5 | Low | No |
| 5 | 110 | 8.0 | High | Yes |
| 6 | 60 | 5.5 | Medium | No |
Because the outcome (Flood) is already known for every row, this is labeled data suitable for supervised learning.
4. Data Preprocessing (Applied on Example Data)
✔ Step 1: Convert Categorical Data
Soil Moisture:
- Low = 0
- Medium = 1
- High = 2
Flood:
- No = 0
- Yes = 1
✔ Step 2: Converted Dataset
| Rainfall | River Level | Soil | Flood |
| 120 | 8.5 | 2 | 1 |
| 80 | 7.0 | 1 | 0 |
| 150 | 9.2 | 2 | 1 |
| 40 | 4.5 | 0 | 0 |
| 110 | 8.0 | 2 | 1 |
| 60 | 5.5 | 1 | 0 |
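The conversion in Steps 1 and 2 can be sketched in a few lines of plain Python. This is a minimal illustration, not the only way to do it; the names `SOIL_CODES`, `FLOOD_CODES`, and `encode_row` are assumptions introduced here.

```python
# Mappings taken from Step 1.
SOIL_CODES = {"Low": 0, "Medium": 1, "High": 2}
FLOOD_CODES = {"No": 0, "Yes": 1}

# Raw rows from the example dataset: (rainfall_mm, river_level_m, soil, flood)
raw_rows = [
    (120, 8.5, "High", "Yes"),
    (80, 7.0, "Medium", "No"),
    (150, 9.2, "High", "Yes"),
    (40, 4.5, "Low", "No"),
    (110, 8.0, "High", "Yes"),
    (60, 5.5, "Medium", "No"),
]


def encode_row(row):
    """Replace the text labels in one row with their numeric codes."""
    rainfall, river, soil, flood = row
    return (rainfall, river, SOIL_CODES[soil], FLOOD_CODES[flood])


encoded_rows = [encode_row(r) for r in raw_rows]

for row in encoded_rows:
    print(row)
```

Using 0, 1, 2 for Low, Medium, High keeps the natural ordering of soil moisture, which lets a model compare values with a threshold.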
✔ Step 3: Handle Missing Values and Errors (Assumption)
- This example has no missing values
- If a value were missing, it could be filled with the average of that column
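Filling a gap with the column average might look like the sketch below. The example dataset has no gaps, so the list `rainfall_with_gap` is hypothetical, and the helper name `fill_missing_with_mean` is an assumption made for illustration.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Return a copy of values where each None is replaced by the mean of the known entries."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot compute an average: every value is missing")
    average = mean(known)
    return [average if v is None else v for v in values]


# Hypothetical column: the third rainfall reading is missing (marked as None).
rainfall_with_gap = [120, 80, None, 40]
filled = fill_missing_with_mean(rainfall_with_gap)
print(filled)
```

The average is computed only from the known readings, so the missing entry does not distort it.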
✔ Step 4: Feature Selection
Selected features:
- Rainfall ✔
- River Level ✔
- Soil Moisture ✔
5. Train-Test Split
Training Data (First 4 rows):
| Rainfall | River | Soil | Flood |
| 120 | 8.5 | 2 | 1 |
| 80 | 7.0 | 1 | 0 |
| 150 | 9.2 | 2 | 1 |
| 40 | 4.5 | 0 | 0 |
Testing Data (Last 2 rows):
| Rainfall | River | Soil | Flood |
| 110 | 8.0 | 2 | 1 |
| 60 | 5.5 | 1 | 0 |
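The split above is positional: the first four rows train the model and the last two test it. A minimal sketch follows, assuming the encoded rows are held as tuples; `split_by_position` is a hypothetical helper name.

```python
# Encoded dataset: (rainfall, river_level, soil, flood)
rows = [
    (120, 8.5, 2, 1),
    (80, 7.0, 1, 0),
    (150, 9.2, 2, 1),
    (40, 4.5, 0, 0),
    (110, 8.0, 2, 1),
    (60, 5.5, 1, 0),
]


def split_by_position(data, n_train):
    """Use the first n_train rows for training and the remaining rows for testing."""
    return data[:n_train], data[n_train:]


train_rows, test_rows = split_by_position(rows, n_train=4)

# Separate the features (first three columns) from the label (last column).
X_train = [r[:3] for r in train_rows]
y_train = [r[3] for r in train_rows]
X_test = [r[:3] for r in test_rows]
y_test = [r[3] for r in test_rows]

print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

Keeping the test rows out of training is what makes the later accuracy figure a check on unseen data.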
6. Model Selection
We select:
Decision Tree Classifier
Reason:
- Easy to understand: the trained model can be read as a set of if/then rules
- Well suited to classification problems such as this Yes/No prediction
7. Model Training (Using Training Data)
The model learns patterns:
Learned Rules:
Rule 1:
- If Rainfall > 100 AND River Level > 8
➡ Flood = Yes
Rule 2:
- If Rainfall < 70
➡ Flood = No
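Rule 3:
- If neither rule applies (intermediate conditions), the prediction depends on Soil Moisture
These rules are learned from the training data rather than hand-coded; the versions shown here are simplified for illustration.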
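To give a feel for how a decision tree finds a rule on its own, the sketch below searches the four training rows for the single threshold that best separates Flood from No Flood, using Gini impurity. It is a simplified one-level illustration written for this case study, not a full decision tree; in practice a library class such as scikit-learn's `DecisionTreeClassifier` would normally be used. The function names `gini` and `best_split` are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(X, y):
    """Return (impurity, feature_index, threshold) with the lowest weighted Gini impurity."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [label for row, label in zip(X, y) if row[j] <= threshold]
            right = [label for row, label in zip(X, y) if row[j] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            # Strict comparison: on ties, the first split found is kept.
            if best is None or impurity < best[0]:
                best = (impurity, j, threshold)
    return best


# Training rows: (rainfall, river_level, soil) and the matching flood labels.
X_train = [(120, 8.5, 2), (80, 7.0, 1), (150, 9.2, 2), (40, 4.5, 0)]
y_train = [1, 0, 1, 0]

impurity, feature, threshold = best_split(X_train, y_train)
print("feature index:", feature, "threshold:", threshold, "impurity:", impurity)
```

On these four rows the search settles on feature 0 (Rainfall) with a threshold of 100, which separates the two classes perfectly. River Level and Soil Moisture separate them equally well, so with such a small dataset a single split is enough; richer rule sets only emerge from larger data.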
8. Model Testing (Using Test Data)
🔍 Test Case 1:
Input:
- Rainfall = 110
- River = 8.0
- Soil = High
Model Prediction: Flood (1)
Actual: Flood (1)
🔍 Test Case 2:
Input:
- Rainfall = 60
- River = 5.5
- Soil = Medium
Model Prediction: No Flood (0)
Actual: No Flood (0)
9. Model Evaluation
✔ Accuracy Calculation
Accuracy = Correct Predictions / Total Predictions
= 2 / 2 = 100%
✔ Confusion Matrix
| Actual \ Predicted | Flood | No Flood |
| Flood | 1 | 0 |
| No Flood | 0 | 1 |
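The accuracy figure and the confusion matrix can both be derived from the two test predictions. A minimal sketch, assuming labels are encoded as 1 (Flood) and 0 (No Flood); `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives, treating 1 (Flood) as the positive class."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1  # flood correctly predicted
        elif a == 1 and p == 0:
            counts["FN"] += 1  # flood missed
        elif a == 0 and p == 1:
            counts["FP"] += 1  # false alarm
        else:
            counts["TN"] += 1  # no flood correctly predicted
    return counts


# The two test cases: actual labels and model predictions.
actual = [1, 0]
predicted = [1, 0]

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)

print(counts)
print("accuracy:", accuracy)
```

The four counts map onto the table cells: TP and FN form the Flood row, FP and TN form the No Flood row.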
✔ Important Insight
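In real-world applications:
- Datasets are much larger
- Accuracy is likely to be lower than in this small example
- Missing an actual flood is costly, so focus on Recall (the share of actual floods the model detects)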
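The sketch below illustrates why recall matters more than accuracy for flood detection. The data is entirely hypothetical (100 days with 5 floods, and a model that always predicts "no flood") and is not part of the case-study dataset; the function names are assumptions.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    return sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)


def recall(actual, predicted):
    """Fraction of actual floods (label 1) that the model predicted as floods."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("recall is undefined when there are no actual floods")
    return true_pos / (true_pos + false_neg)


# Hypothetical data: 5 flood days followed by 95 dry days.
actual = [1] * 5 + [0] * 95
# Hypothetical model that never predicts a flood.
always_no = [0] * 100

acc = accuracy(actual, always_no)
rec = recall(actual, always_no)
print("accuracy:", acc)  # 0.95
print("recall:", rec)    # 0.0
```

A model that never raises an alarm scores 95% accuracy here yet detects none of the floods, which is exactly the failure that recall exposes.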
10. Final Model Deployment
The trained system can now be applied to new, real-time readings.
🔍 New Input:
| Rainfall | River Level | Soil |
| 140 | 9.0 | High |
Model Output: Flood Likely
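For illustration, the learned rules from Section 7 could be wrapped in a small prediction function like the one below. This is a sketch under assumptions: Rule 3 is interpreted here as "flood only when soil moisture is High", the "Flood Unlikely" label is introduced for the negative case, and the name `predict_flood` is hypothetical. A deployed system would call the trained model rather than hand-written conditions.

```python
def predict_flood(rainfall_mm, river_level_m, soil):
    """Apply the three illustrative rules to one new reading."""
    # Rule 1: heavy rainfall together with a high river level.
    if rainfall_mm > 100 and river_level_m > 8:
        return "Flood Likely"
    # Rule 2: low rainfall.
    if rainfall_mm < 70:
        return "Flood Unlikely"
    # Rule 3 (ASSUMPTION): in intermediate conditions, decide on soil moisture alone.
    if soil == "High":
        return "Flood Likely"
    return "Flood Unlikely"


# New input from the table above.
result = predict_flood(rainfall_mm=140, river_level_m=9.0, soil="High")
print("Model Output:", result)
```

For the new input, Rule 1 fires because 140 > 100 and 9.0 > 8, so the function returns "Flood Likely".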
