This project implements and compares two versions of the Miller-Rabin primality test:
- Sequential: A standard C++ implementation running entirely on the CPU.
- Parallel: A hybrid implementation using OpenMP for parallel random base generation on the CPU and CUDA for massively parallel primality test computations on the GPU.
- Arbitrary-precision integer arithmetic using GMP.
- High-quality random number generation for bases using
/dev/urandom. - CUDA implementation for GPU acceleration.
- OpenMP for parallelizing CPU-bound tasks (base generation).
- Benchmarking script to compare performance and calculate speedup.
- Detailed performance metrics for the parallel implementation.
- A C++11 compliant compiler (like g++)
- NVIDIA CUDA Toolkit (if running the parallel version)
- GNU Multiple Precision Arithmetic Library (GMP)
- Standard Unix utilities:
make,bc,xxd,dd
-
Open a Colab Notebook: Start a new Colab notebook.
-
Select GPU Runtime (Required for Parallel Version):
- Go to
Runtime->Change runtime type. - Select
GPUfrom theHardware acceleratordropdown menu. - Click
Save.
- Go to
-
Clone the Repository:
!git clone <your-repository-url> %cd <your-repository-directory>
(Replace
<your-repository-url>and<your-repository-directory>) -
Install Dependencies:
!apt-get update !apt-get install -y build-essential libgmp-dev bc xxd # Install CUDA toolkit if needed (might already be present on GPU runtimes) !apt-get install -y nvidia-cuda-toolkit
-
Compile the Code:
!make clean !make
Note: You might see warnings about file modification times, which can usually be ignored on Colab.
-
Make the Benchmark Script Executable:
!chmod +x benchmark.sh
-
Run the Benchmark:
!./benchmark.sh
This will run both the sequential and parallel versions, testing primality for numbers of different bit sizes (512, 1024, 2048, 4096 by default) and outputting performance results.
-
View the Results:
!cat results/summary.md
This command will display the summary report comparing the execution times and speedup achieved by the parallel version.
.
├── Makefile # Builds the project
├── benchmark.sh # Runs tests and generates performance reports
├── README.md # This file
├── results/ # Directory for benchmark output (created by benchmark.sh)
└── src/
├── miller_rabin.h # Header file with common declarations
├── miller_rabin_cpu.cpp # CPU-specific functions (decompose, test_cpu)
├── miller_rabin_par.cu # Parallel implementation (OpenMP + CUDA)
├── miller_rabin_seq.cpp # Sequential implementation
├── utils.h # Header for utility functions
└── utils.cpp # Utility functions (arg parsing, random base generation, timer)
- Sequential (
miller_rabin_seq): Takes a number and iteration count. It generates random bases sequentially using/dev/urandomand performs the Miller-Rabin test entirely on the CPU using GMP for calculations. - Parallel (
miller_rabin_par): Takes a number and iteration count. It uses OpenMP to generate random bases in parallel on CPU cores (reading from/dev/urandomfor each). The computationally intensive part (modular exponentiation) is offloaded to the GPU using a CUDA kernel. Data is transferred between CPU and GPU memory as needed. It falls back to a CPU implementation for numbers exceeding theMAX_BITSlimit defined in the code. - Benchmarking (
benchmark.sh): Generates a large random number for each specified bit size. It then runs both the sequential and parallel executables with this same number and measures the execution time. It repeats this process multiple times and calculates the average times and speedup.