Performance Analysis of Parallel Matrix Computation (CPU vs GPU)

📌 Project Overview

This project demonstrates the practical benefits of High Performance Computing (HPC) by benchmarking large-scale matrix multiplication on CPU vs GPU.

A comparative study is performed between:

CPU-based serial computation using NumPy
GPU-accelerated parallel computation using CuPy (CUDA)

The project is designed to run out of the box on Google Colab with an NVIDIA Tesla T4 GPU.
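
The comparison described above can be sketched in a few lines. This is a minimal illustration, not the repository's src/benchmark.py: the helper names (`best_time`, `cpu_matmul_time`, `gpu_matmul_time`), the float32 dtype, and the repeat count are all assumptions made for this example. The GPU path synchronizes the device before the timer is read, because CuPy launches kernels asynchronously and an unsynchronized timer would mostly measure the launch.

```python
import time


def best_time(run, sync=lambda: None, repeats=3):
    """Return the fastest wall-clock time (seconds) of `run` over `repeats` tries."""
    run()   # warm-up call: keeps one-time setup costs out of the measurement
    sync()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        sync()  # wait for any asynchronous work before stopping the clock
        best = min(best, time.perf_counter() - start)
    return best


def cpu_matmul_time(n, repeats=3):
    """Time an n x n float32 matrix product with NumPy."""
    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.random((n, n), dtype=np.float32)
    b = rng.random((n, n), dtype=np.float32)
    return best_time(lambda: a @ b, repeats=repeats)


def gpu_matmul_time(n, repeats=3):
    """Time the same product with CuPy, or return None if no GPU is usable."""
    try:
        import cupy as cp
        cp.cuda.runtime.getDeviceCount()  # raises if no CUDA device is visible
    except Exception:
        return None
    import numpy as np

    rng = np.random.default_rng(0)
    # Copy the inputs to the device once, outside the timed region.
    a = cp.asarray(rng.random((n, n), dtype=np.float32))
    b = cp.asarray(rng.random((n, n), dtype=np.float32))
    return best_time(lambda: a @ b, cp.cuda.Stream.null.synchronize, repeats)


def report(n):
    """Format CPU and GPU timings for an n x n product as one line of text."""
    cpu = cpu_matmul_time(n)
    gpu = gpu_matmul_time(n)
    if gpu is None:
        return f"n={n}: CPU {cpu:.3f} s, GPU unavailable"
    return f"n={n}: CPU {cpu:.3f} s, GPU {gpu:.3f} s, speedup {cpu / gpu:.1f}x"
```

Calling `print(report(4000))` on a GPU runtime prints one comparison line. Timings obtained this way exclude the host-to-device transfer, so they are not directly comparable to measurements that include it.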

🚀 Motivation

In modern scientific computing, machine learning, and data analytics, sequential CPU execution often becomes a performance bottleneck. This project aims to:

Quantify the performance improvement achieved through GPU acceleration
Demonstrate CUDA-enabled parallel computing using high-level Python libraries
Provide an academically sound, reproducible reference for MSc-level HPC coursework

🛠 Technologies Used

Language: Python 3.10
CPU Computation: NumPy
GPU Computation: CuPy
Parallel Platform: NVIDIA CUDA
Execution Environment: Google Colab (Jupyter Notebook)

📂 Repository Structure

HPC-Matrix-Benchmark-GPU/
├── colab_notebook.ipynb        # MAIN FILE (Run on Google Colab)
├── README.md                   # Project Documentation
├── requirements.txt            # Optional local dependencies
├── src/
│   ├── cpu_version.py          # CPU implementation (NumPy)
│   ├── gpu_version.py          # GPU implementation (CuPy)
│   └── benchmark.py            # Benchmark driver
├── results/                    # Benchmark logs / outputs
└── report/
    └── HPC_Project_Report.md   # Full academic project report

⚡ How to Run

Option 1: Google Colab (Recommended)

Download colab_notebook.ipynb from this repository
Upload it to https://colab.research.google.com
Go to Runtime → Change runtime type
Select GPU (T4) as the hardware accelerator
Click Runtime → Run all

✅ No local setup required
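
After changing the runtime type, it can be worth confirming that a GPU is actually attached before running all cells. The sketch below is one way to do that from a notebook cell; the function name `describe_gpu` is hypothetical, and the check assumes the standard `nvidia-smi` tool is on the PATH, which is normally the case on GPU runtimes.

```python
import shutil
import subprocess


def describe_gpu():
    """Return nvidia-smi's one-line GPU summary, or None if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver tools missing: most likely a CPU-only runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # tool is present but cannot reach a device
    return result.stdout.strip() or None


print(describe_gpu() or "No GPU detected: check Runtime → Change runtime type")
```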

Option 2: Local Execution (Requires NVIDIA GPU)

Requires:

NVIDIA GPU
CUDA drivers
Compatible CuPy installation

# Clone the repository
git clone https://github.com/partha392/HPC-Matrix-Benchmark-GPU.git
cd HPC-Matrix-Benchmark-GPU

# Install dependencies
pip install -r requirements.txt

# Run benchmark
python src/benchmark.py
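
For a local setup, a quick check that CuPy is importable and can see a CUDA device may save some debugging before the benchmark is run. The following is a sketch under assumptions: `cupy_status` is a hypothetical helper, and the contents of requirements.txt are not shown here, so whether it installs a CuPy build matching the local CUDA version is not guaranteed. CuPy wheels are built per CUDA major version (for example `cupy-cuda12x`), so a mismatch is a common cause of import or runtime errors.

```python
def cupy_status():
    """Summarize whether CuPy is importable and can see a CUDA device."""
    status = {"cupy_installed": False, "devices": 0, "device_name": None}
    try:
        import cupy as cp
    except ImportError:
        return status  # CuPy is missing or its CUDA libraries failed to load
    status["cupy_installed"] = True
    try:
        status["devices"] = cp.cuda.runtime.getDeviceCount()
    except Exception:
        return status  # CuPy imported, but no usable driver or device
    if status["devices"] > 0:
        raw = cp.cuda.runtime.getDeviceProperties(0)["name"]
        # The device name is returned as bytes by the CUDA runtime wrapper.
        status["device_name"] = raw.decode() if isinstance(raw, bytes) else raw
    return status


print(cupy_status())
```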

📊 Sample Output

(Observed on Google Colab with NVIDIA Tesla T4 GPU)

Matrix Size: 4000 x 4000
CPU Time: 12.8 – 18.1 sec
GPU Time: 0.08 – 0.15 sec
-------------------------
Observed Speedup: ~80x – 180x

⚠️ Exact performance may vary due to shared cloud infrastructure and runtime conditions.
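
The arithmetic behind these figures can be worked through directly. The sketch below takes the timing ranges from the sample output and computes the widest possible speedup bounds, plus an approximate throughput based on the usual count of roughly 2n³ floating-point operations for an n x n matrix product. The function names are illustrative. The bounds it prints pair the extreme timings with each other, so they are wider than the reported ~80x – 180x, which presumably pairs CPU and GPU timings from the same run (an assumption; the pairing is not stated above).

```python
def speedup(cpu_seconds, gpu_seconds):
    """Speedup is the ratio of CPU time to GPU time for the same workload."""
    return cpu_seconds / gpu_seconds


def gflops(n, seconds):
    """Approximate throughput of an n x n matrix product, using ~2*n**3 operations."""
    return 2 * n**3 / seconds / 1e9


n = 4000
cpu_range = (12.8, 18.1)  # seconds, from the sample output
gpu_range = (0.08, 0.15)  # seconds, from the sample output

lowest = speedup(cpu_range[0], gpu_range[1])   # fastest CPU run vs slowest GPU run
highest = speedup(cpu_range[1], gpu_range[0])  # slowest CPU run vs fastest GPU run

print(f"Speedup bounds: {lowest:.0f}x to {highest:.0f}x")
print(f"CPU throughput: {gflops(n, cpu_range[1]):.1f} to {gflops(n, cpu_range[0]):.1f} GFLOP/s")
print(f"GPU throughput: {gflops(n, gpu_range[1]):.0f} to {gflops(n, gpu_range[0]):.0f} GFLOP/s")
```

Reporting throughput alongside speedup makes results easier to compare across matrix sizes and hardware, since a speedup figure alone depends heavily on how fast the CPU baseline happens to be.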

📜 Academic Report

A detailed academic report is available in the report/ directory, covering:

HPC and CUDA architecture overview
Experimental methodology
Performance benchmarking and speedup analysis
Limitations and future scope

📄 File: report/HPC_Project_Report.md

🤝 Contributions & Extensions

This is an academic HPC benchmark project. Possible extensions include:

Multi-GPU benchmarking (NCCL)
Mixed-precision computation using Tensor Cores
Additional benchmarks (FFT, Monte Carlo, reductions)

Suggestions and improvements are welcome.

📝 License

This project is licensed under the MIT License and is free to use for educational and academic purposes.
