Delhivery Logistics Data Analysis 📦🚚

This project analyzes Delhivery's logistics delivery dataset to understand delivery performance, route efficiency, and operational patterns using data analytics techniques.

The analysis focuses on transforming raw segment-level logistics data into meaningful trip-level insights that can help improve delivery efficiency and route planning.

Project Objective

The objective of this project is to analyze delivery operations and identify factors affecting delivery time using statistical analysis and data visualization.

Key goals include:

Cleaning and preprocessing logistics data
Aggregating segment-level data into trip-level insights
Performing exploratory data analysis (EDA)
Detecting and handling outliers
Engineering meaningful features
Performing hypothesis testing
Extracting business insights and recommendations

Dataset Information

The dataset contains detailed logistics delivery information including:

Trip creation timestamps
Source and destination centers
Actual delivery time
OSRM estimated time and distance
Segment-level travel metrics

The dataset covers deliveries between 12 September 2018 and 8 October 2018.

Technologies Used

Python
Pandas
NumPy
Matplotlib
Seaborn
Scikit-learn
SciPy

Project Workflow

1 Data Cleaning

Converted time columns to datetime format
Removed missing values
Verified data consistency
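
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper `clean_logistics`, the column names, and the sample rows are all assumptions made for the example.

```python
import pandas as pd

def clean_logistics(df, time_cols):
    """Parse timestamp columns and drop rows with missing values (hypothetical helper)."""
    out = df.copy()
    for col in time_cols:
        # errors="coerce" turns unparseable strings into NaT, so dropna() removes them too
        out[col] = pd.to_datetime(out[col], errors="coerce")
    return out.dropna().reset_index(drop=True)

# Synthetic rows; column names are assumed, not taken from the dataset
raw = pd.DataFrame({
    "trip_creation_time": ["2018-09-12 00:00:16", "2018-09-13 08:30:00", "not a date"],
    "actual_time": [86.0, None, 40.0],
})
cleaned = clean_logistics(raw, ["trip_creation_time"])
```

Only the first synthetic row survives: the second has a missing `actual_time` and the third has a timestamp that cannot be parsed.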

2 Data Aggregation

The raw segment-level records were aggregated at two levels:

Segment-level summaries
Trip-level summaries
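
One way to express this two-level aggregation with pandas is shown below. It is a sketch under assumptions: the grouping keys (`trip_uuid`, `source_center`, `destination_center`), the metric columns, and the choice of `sum` as the aggregation are illustrative and may differ from the notebook.

```python
import pandas as pd

# Synthetic raw rows; all column names are assumed for illustration
rows = pd.DataFrame({
    "trip_uuid": ["t1", "t1", "t1", "t2"],
    "source_center": ["A", "A", "B", "C"],
    "destination_center": ["B", "B", "C", "D"],
    "segment_actual_time": [10.0, 15.0, 30.0, 20.0],
    "segment_osrm_time": [8.0, 12.0, 25.0, 18.0],
})

# Level 1: one row per (trip, source, destination) segment
segment_keys = ["trip_uuid", "source_center", "destination_center"]
segments = rows.groupby(segment_keys, as_index=False, sort=False).agg(
    segment_actual_time=("segment_actual_time", "sum"),
    segment_osrm_time=("segment_osrm_time", "sum"),
)

# Level 2: one row per trip, summing over its segments
trips = segments.groupby("trip_uuid", as_index=False, sort=False).agg(
    segment_actual_time=("segment_actual_time", "sum"),
    segment_osrm_time=("segment_osrm_time", "sum"),
    n_segments=("source_center", "count"),
)
```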

3 Feature Engineering

New features created:

Trip duration
Month, day, weekday, hour
Source city and state
Destination city and state
Average delivery speed
Time prediction error
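
The features listed above could be derived along these lines. Everything specific here is an assumption for illustration: the column names, the `City_Place (State)` naming pattern used to split city and state, minutes as the unit for time columns, kilometres for distance, and the definitions of speed (OSRM distance over actual time) and prediction error (actual time minus OSRM time).

```python
import pandas as pd

# One synthetic trip; names, units, and the source_name pattern are assumed
trips = pd.DataFrame({
    "trip_creation_time": pd.to_datetime(["2018-09-12 06:15:00"]),
    "od_start_time": pd.to_datetime(["2018-09-12 06:30:00"]),
    "od_end_time": pd.to_datetime(["2018-09-12 08:30:00"]),
    "source_name": ["Bengaluru_Hub (Karnataka)"],
    "actual_time": [120.0],    # minutes
    "osrm_time": [90.0],       # minutes
    "osrm_distance": [60.0],   # kilometres
})

# Trip duration in minutes from start and end timestamps
elapsed = trips["od_end_time"] - trips["od_start_time"]
trips["trip_duration_min"] = elapsed.dt.total_seconds() / 60

# Calendar features from the trip creation timestamp (weekday: Monday = 0)
trips["month"] = trips["trip_creation_time"].dt.month
trips["day"] = trips["trip_creation_time"].dt.day
trips["weekday"] = trips["trip_creation_time"].dt.dayofweek
trips["hour"] = trips["trip_creation_time"].dt.hour

# City is the text before the first underscore, state is the text in parentheses
trips["source_city"] = trips["source_name"].str.split("_").str[0]
trips["source_state"] = trips["source_name"].str.extract(r"\((.+)\)", expand=False)

# Average speed in km/h and the gap between actual and estimated time
trips["avg_speed_kmph"] = trips["osrm_distance"] / (trips["actual_time"] / 60)
trips["time_error_min"] = trips["actual_time"] - trips["osrm_time"]
```

Destination city and state would follow the same pattern on a destination name column.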

4 Exploratory Data Analysis

Performed:

Univariate analysis
Bivariate analysis
Multivariate analysis

Key visualizations included:

Delivery time distribution
Distance distribution
Route type comparison
Correlation heatmaps
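
As a small illustration of what sits behind a correlation heatmap, the sketch below computes a Pearson correlation matrix with pandas. The column names and values are synthetic assumptions; the resulting matrix is the kind of input that Seaborn's `heatmap` function renders.

```python
import pandas as pd

# Synthetic trip-level metrics; column names are assumed
trips = pd.DataFrame({
    "actual_time": [60.0, 120.0, 180.0, 240.0],
    "osrm_time": [50.0, 95.0, 150.0, 190.0],
    "osrm_distance": [20.0, 45.0, 70.0, 95.0],
})

# Pairwise Pearson correlations between the numeric columns
corr = trips.corr(method="pearson")
```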

5 Outlier Detection

Outliers were identified using boxplots and treated using the IQR method.
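
A minimal sketch of the IQR rule follows, assuming the conventional 1.5 multiplier. The project does not state whether outliers were removed or capped, so both options are shown; the helper `iqr_bounds` and the sample values are hypothetical.

```python
import pandas as pd

def iqr_bounds(series, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles (hypothetical helper)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Synthetic delivery times with one extreme value
actual_time = pd.Series([40, 45, 50, 55, 60, 65, 70, 500], dtype=float)
low, high = iqr_bounds(actual_time)

is_outlier = (actual_time < low) | (actual_time > high)
filtered = actual_time[~is_outlier]    # option 1: drop the outliers
capped = actual_time.clip(low, high)   # option 2: cap them at the fences
```

Dropping rows shrinks the dataset, while capping keeps every trip but compresses the tail; which is appropriate depends on whether the extreme values are errors or genuine long trips.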

6 Feature Encoding and Scaling

Categorical variables were encoded using one-hot encoding
Numerical variables were normalized to the [0, 1] range using MinMaxScaler
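
The two steps above might look like this with pandas and scikit-learn. The column names and the `route_type` categories are assumptions for the example, and `pd.get_dummies` is used here as one common way to one-hot encode; the notebook may use a different encoder.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Synthetic trips; column names and category labels are assumed
trips = pd.DataFrame({
    "route_type": ["Carting", "FTL", "Carting"],
    "actual_time": [60.0, 180.0, 120.0],
    "osrm_distance": [20.0, 100.0, 60.0],
})

# One-hot encode the categorical column into 0/1 indicator columns
encoded = pd.get_dummies(trips, columns=["route_type"], dtype=int)

# Rescale each numeric column to the [0, 1] range
num_cols = ["actual_time", "osrm_distance"]
encoded[num_cols] = MinMaxScaler().fit_transform(encoded[num_cols])
```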

7 Hypothesis Testing

Statistical hypothesis testing was conducted to validate relationships between aggregated delivery metrics.

Tests performed:

Actual delivery time vs OSRM estimated time
Actual time vs segment-level actual time
OSRM distance vs segment OSRM distance
OSRM time vs segment OSRM time

Paired t-tests were used for statistical validation.
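
A paired t-test of this kind can be run with SciPy's `ttest_rel`, as sketched below on synthetic values. The numbers, the variable names, and the 0.05 significance level are assumptions for illustration and are not results from the project.

```python
import numpy as np
from scipy import stats

# Synthetic paired observations for the same six trips (minutes)
actual_time = np.array([120.0, 95.0, 210.0, 60.0, 150.0, 88.0])
osrm_time = np.array([90.0, 80.0, 160.0, 55.0, 110.0, 70.0])

# H0: the mean difference between actual and OSRM time is zero
t_stat, p_value = stats.ttest_rel(actual_time, osrm_time)

alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
```

The default test is two-sided. To test specifically whether actual times are higher than OSRM estimates, recent SciPy versions accept `alternative="greater"`.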

Key Insights

Actual delivery times are significantly higher than OSRM estimated times.
Delivery time strongly correlates with delivery distance.
Major logistics hubs include Bengaluru, Mumbai, and Gurgaon.
The highest delivery activity occurs in Maharashtra, Karnataka, and Haryana.
Operational delays may occur between delivery segments due to hub processing or logistics operations.

Business Recommendations

Improve route prediction models by incorporating traffic and operational delays.
Optimize high-volume logistics corridors.
Improve processing efficiency at intermediate logistics hubs.
Focus operational improvements on high-volume cities such as Bengaluru, Mumbai, and Gurgaon.
