This project focuses on predicting the forest cover type from cartographic and environmental variables using machine learning.
It trains and compares three classification models (Decision Tree, Random Forest, and XGBoost) and further improves performance with hyperparameter tuning.
The project demonstrates the full machine learning pipeline: data cleaning, preprocessing, training, evaluation, visualization, and tuning.
- Source: UCI Machine Learning Repository — Covertype Dataset
- Format: `.data` (converted to `.csv` for processing)
- Number of Instances: 581,012
- Number of Features: 54 cartographic variables (e.g., elevation, slope, soil type, distance to hydrology)
- Target Variable: `Cover_Type` (multi-class, 7 forest cover categories)
The target variable `Cover_Type` has 7 categories representing different types of forest cover:
- Spruce/Fir
- Lodgepole Pine
- Ponderosa Pine
- Cottonwood/Willow
- Aspen
- Douglas-fir
- Krummholz
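Assuming the raw UCI file (`covtype.data`, no header row) and the column order given in the dataset description, loading it might look like:

```python
import pandas as pd

# Column names reconstructed from the UCI Covertype description (assumed):
# 10 numeric features, 4 wilderness-area dummies, 40 soil-type dummies, target.
NUMERIC = [
    "Elevation", "Aspect", "Slope",
    "Horizontal_Distance_To_Hydrology", "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways", "Hillshade_9am", "Hillshade_Noon",
    "Hillshade_3pm", "Horizontal_Distance_To_Fire_Points",
]
COLUMNS = (NUMERIC
           + [f"Wilderness_Area_{i}" for i in range(1, 5)]
           + [f"Soil_Type_{i}" for i in range(1, 41)]
           + ["Cover_Type"])

def load_covtype(path="covtype.data"):
    """Read the raw UCI file (no header row) into a labelled DataFrame."""
    return pd.read_csv(path, header=None, names=COLUMNS)
```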
**Decision Tree**
- A tree-like model that classifies by splitting on feature thresholds, branch by branch.
- Easy to interpret, but can overfit if the tree grows too deep.
- A good baseline model for classification.
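A minimal sketch of such a baseline, on synthetic stand-in data rather than the real Covertype features (`max_depth=10` is an illustrative cap, not the project's setting):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 7-class data stands in for the Covertype features (illustration).
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=7, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Capping max_depth is one way to limit overfitting.
tree = DecisionTreeClassifier(max_depth=10, random_state=42)
tree.fit(X_train, y_train)
print(round(tree.score(X_test, y_test), 3))
```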
**Random Forest**
- An ensemble of many decision trees (a "forest").
- Each tree is trained on a random subset of the rows and features.
- More accurate and robust than a single decision tree because averaging reduces overfitting.
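The same sketch with a forest, again on synthetic stand-in data (`n_estimators=100` and `max_features="sqrt"` are illustrative defaults, not the project's tuned values):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=7, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Each tree fits a bootstrap sample of rows and considers a random subset of
# features (max_features="sqrt") at every split; averaging reduces variance.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=42)
forest.fit(X_train, y_train)
print(round(forest.score(X_test, y_test), 3))
```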
**XGBoost**
- A gradient-boosting algorithm that builds trees sequentially.
- Each new tree corrects the errors of the previous ones.
- Very powerful on structured/tabular data, often reaching state-of-the-art results.
- Requires careful hyperparameter tuning for best performance.
**Data Cleaning & Preprocessing**
- Added column names from the dataset description.
- Checked for missing values (none were found).
- Converted categorical features into usable formats.
- (Optional) Outlier detection using Z-score.
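The optional Z-score check can be sketched as follows (the threshold of 3 standard deviations is the conventional choice; the project's exact cutoff is not stated):

```python
import numpy as np

def zscore_outliers(values, threshold=3.0):
    """Return a boolean mask marking values more than `threshold`
    standard deviations away from the mean."""
    z = (values - values.mean()) / values.std()
    return np.abs(z) > threshold

# One extreme elevation among otherwise uniform readings is flagged.
elevations = np.array([2800.0] * 19 + [9000.0])
mask = zscore_outliers(elevations)
print(mask.sum())  # → 1
```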
**Train-Test Split**
- 80% training, 20% testing.
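With scikit-learn the split is one call; the `stratify` argument is an assumption here, since the README does not say whether the split preserved class proportions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 50 toy samples over 7 classes, mimicking Cover_Type labels.
X = np.arange(100).reshape(50, 2)
y = np.tile(np.arange(1, 8), 8)[:50]

# stratify=y keeps the 7-class proportions similar in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(len(X_train), len(X_test))  # → 40 10
```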
**Model Training**
- Trained Decision Tree, Random Forest, and XGBoost classifiers.
**Model Evaluation**
- Accuracy Score
- Precision, Recall, F1-Score
- Confusion Matrix (heatmap visualization)
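These metrics can be computed in a few lines; the data below is a synthetic stand-in, and the seaborn call in the comment is the usual way such a heatmap is drawn, not necessarily the project's exact plotting code:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic 7-class data stands in for the Covertype split (illustration only).
X, y = make_classification(n_samples=1500, n_features=10, n_informative=6,
                           n_classes=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("accuracy:", round(accuracy_score(y_test, y_pred), 3))
print(classification_report(y_test, y_pred))   # precision / recall / F1 per class
cm = confusion_matrix(y_test, y_pred)          # rows = true class, cols = predicted
# Heatmap visualization, e.g.: sns.heatmap(cm, annot=True, fmt="d")
```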
**Feature Importance**
- Visualized the most important features for tree-based models.
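A sketch of extracting the ranking from a fitted model; the feature names below are illustrative placeholders, not the project's actual top features:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=6, n_informative=4,
                           random_state=0)
names = ["Elevation", "Slope", "Aspect", "Hillshade",
         "Dist_Hydrology", "Dist_Roadways"]  # placeholder names

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ sums to 1; sort descending to rank features.
# (A bar chart of this ranking can be drawn with matplotlib's plt.barh.)
for i in np.argsort(model.feature_importances_)[::-1]:
    print(f"{names[i]:<16} {model.feature_importances_[i]:.3f}")
```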
**Hyperparameter Tuning**
- Used GridSearchCV and RandomizedSearchCV to optimize hyperparameters.
- Reduced overfitting and improved accuracy.
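A small GridSearchCV sketch on stand-in data; the grid below is illustrative, not the project's actual search space:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Illustrative grid; the project's actual search spaces are in the notebook.
param_grid = {"n_estimators": [50, 100], "max_depth": [8, None]}

# GridSearchCV tries every combination with 3-fold cross-validation;
# RandomizedSearchCV samples the space instead, which scales better.
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```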
| Model | Accuracy Score |
|---|---|
| Decision Tree (Default) | 0.9059 |
| Random Forest (Default) | 0.9308 |
| XGBoost (Default) | 0.8711 |
| Decision Tree (Tuned) | 0.9124 |
| Random Forest (Tuned) | 0.9524 |
| XGBoost (Tuned) | 0.9581 |
✅ Best Model: XGBoost (Tuned) with 95.8% accuracy.
- Confusion Matrix Heatmap → to check per-class predictions.
- Feature Importance Bar Chart → to identify top predictive features (e.g., Elevation, Horizontal Distance to Roadways).
1. Clone this repository:

   ```bash
   git clone https://github.com/Adeeba-Shahzadi/ForestCoverClassification-MultiClassificationModel.git
   ```

2. Navigate to the project folder:

   ```bash
   cd ForestCoverClassification-MultiClassificationModel
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Run the notebook or the script:

   ```bash
   jupyter notebook ForestCoverTypeClassification.ipynb
   ```

   or

   ```bash
   python forestcovertypeclassification.py
   ```