Commit 44060ec
Merge pull request #284 from coding-for-reproducible-research/iss283_fix_parallel_computing
Fix issues with parallel computing course
2 parents f3642ba + 12bfd05 commit 44060ec

10 files changed (+241, −259 lines)

_toc.yml

Lines changed: 1 addition & 1 deletion
@@ -258,7 +258,7 @@ parts:
   sections:
   - file: individual_modules/parallel_computing/architecture_and_concurrency
   - file: individual_modules/parallel_computing/multithreading_io
-  - file: individual_modules/parallel_computing/multiprocessing_fractal
+  - file: individual_modules/parallel_computing/multiprocessing_cpu
   - file: individual_modules/parallel_computing/mpi_hello_world
   - file: individual_modules/parallel_computing/mpi_simple_communication
   - file: individual_modules/parallel_computing/mpi_collective_comms
individual_modules/parallel_computing/complete_files/cpu_bound_complete.py

Lines changed: 15 additions & 0 deletions

import time

def main():
    start_time = time.perf_counter()
    for _ in range(100):
        fibonacci(30)
    run_time = time.perf_counter() - start_time
    print(f"Run time was {run_time} seconds")

def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    main()
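
An aside (not part of this commit): the recursion above is deliberately expensive so there is work to benchmark. For comparison, a memoised sketch using functools.lru_cache collapses the exponential call tree, and would make a poor benchmark precisely because it finishes almost instantly:

from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    # Each distinct n is computed once and cached, so the call tree
    # shrinks from exponential to linear in n.
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)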
Lines changed: 16 additions & 0 deletions

import time
from concurrent.futures import ProcessPoolExecutor

def main():
    start_time = time.perf_counter()
    with ProcessPoolExecutor(max_workers=5) as executor:
        executor.map(fibonacci, [30] * 100)
    run_time = time.perf_counter() - start_time
    print(f"Run time was {run_time} seconds")

def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    main()
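
A small aside (not part of the diff): executor.map returns a lazy iterator and the script above discards the results; exiting the with block still waits for all submitted work to finish, which is why the timing is valid. A minimal sketch of collecting the results, reusing the same fibonacci function:

from concurrent.futures import ProcessPoolExecutor

def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=5) as executor:
        # map yields results in submission order; wrapping it in list()
        # forces all tasks to complete and gathers their return values.
        results = list(executor.map(fibonacci, [30] * 10))
    print(len(results), results[0])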

individual_modules/parallel_computing/complete_files/multiprocessing_fractal_complete.py

Lines changed: 0 additions & 50 deletions
This file was deleted.

individual_modules/parallel_computing/complete_files/resources_report_complete.py

Lines changed: 2 additions & 5 deletions
@@ -7,9 +7,8 @@ def get_cpu_info():
         # Get CPU count
         cpu_count = psutil.cpu_count(logical=False)  # Get physical cores
         cores = psutil.cpu_count(logical=True)  # Get logical cores
-        cpu_usage = psutil.cpu_percent()  # Get overall CPU usage
 
-        return f"CPU: {cpu_count} physical cores, {cores} logical cores\nCPU Usage: {cpu_usage}%"
+        return f"CPU: {cpu_count} physical cores, {cores} logical cores"
 
     except Exception as e:
         print(f"Error getting CPU info: {e}")
@@ -21,10 +20,8 @@ def get_ram_info():
     try:
         memory_info = psutil.virtual_memory()  # Get virtual memory usage
         total_memory_gb = memory_info.total / (1024**3)  # Convert to GB
-        available_memory_gb = memory_info.available / (1024**3)  # Convert to GB
-        used_memory_gb = memory_info.used / (1024**3)  # Convert to GB
 
-        return f"RAM: Total {total_memory_gb:.2f} GB, Available {available_memory_gb:.2f} GB, Used {used_memory_gb:.2f} GB"
+        return f"RAM: Total {total_memory_gb:.2f} GB"
 
     except Exception as e:
         print(f"Error getting RAM info: {e}")

individual_modules/parallel_computing/mpi_parallel_fractal.ipynb

Lines changed: 82 additions & 2 deletions
@@ -18,9 +18,89 @@
 " \n",
 "## MPI real world example problem\n",
 "\n",
-"In a previous lesson we have seen *multi-processing* being used to solve the generation of the Julia set. An alternative approach is to use *message passing*.\n",
+"The problem we will attempt to solve is constructing a fractal. This kind of problem is often described as \"embarrassingly parallel\": each element of the result has no dependency on any other element, so we can solve it in parallel without too much difficulty. Let's get started by creating a new script - `fractal.py`:\n",
 "\n",
-"As mentioned earlier, this is a relatively simple problem to parallelise. If we consider running the program with multiple processes, all we need to do to divide the work is to divide the complex grid up between the processes. Thinking back to previous sections, we covered an MPI function that can achieve this - the `scatter` method of the MPI communicator.\n",
+"### Setting up our serial problem\n",
+"\n",
+"Let's first think about our problem in serial - we want to construct the [Julia set](https://en.wikipedia.org/wiki/Julia_set) fractal, so we need to create a grid of complex numbers to operate over. We can create a simple function to do this:\n",
+"\n",
+"```python\n",
+"# fractal.py\n",
+"import numpy as np\n",
+"\n",
+"def complex_grid(extent, n_cells):\n",
+"    grid_range = np.arange(-extent, extent, extent / n_cells)\n",
+"    x, y = np.meshgrid(grid_range * 1j, grid_range)\n",
+"    z = x + y\n",
+"\n",
+"    return z\n",
+"```\n",
+"\n",
+"Now, we can create a function that will calculate the Julia set convergence for each element in the complex grid:\n",
+"\n",
+"```python\n",
+"import warnings\n",
+"\n",
+"...\n",
+"\n",
+"def julia_set(grid, num_iter, c):\n",
+"\n",
+"    fractal = np.zeros(np.shape(grid))\n",
+"\n",
+"    # Iterate through the operation z := z**2 + c.\n",
+"    for _ in range(num_iter):\n",
+"        # Catch the overflow warning because it's annoying\n",
+"        with warnings.catch_warnings():\n",
+"            warnings.simplefilter(\"ignore\")\n",
+"            grid = grid ** 2 + c\n",
+"            index = np.abs(grid) < np.inf\n",
+"        fractal[index] = fractal[index] + 1\n",
+"\n",
+"    return fractal\n",
+"```\n",
+"\n",
+"This function counts how many iterations it takes for each element in the complex grid to diverge to infinity (if ever) under the map `z = z**2 + c`. The function itself is not the focus of this exercise so much as a way to make the computer perform some work! Let's use these functions to set up our problem in serial, without any parallelism:\n",
+"\n",
+"```python\n",
+"\n",
+"...\n",
+"\n",
+"c = -0.8 - 0.22 * 1j\n",
+"extent = 2\n",
+"cells = 2000\n",
+"\n",
+"grid = complex_grid(extent, cells)\n",
+"fractal = julia_set(grid, 80, c)\n",
+"```\n",
+"\n",
+"If we run the python script (`python fractal.py`) it takes a few seconds to complete (this will vary depending on your machine), so we can already see that we are making our computer work reasonably hard with just a few lines of code. If we use the `time` command we can get a simple overview of how much time and resource are being used:\n",
+"\n",
+"```\n",
+"$ time python fractal.py\n",
+"python fractal.py  5.96s user 3.37s system 123% cpu 7.558 total\n",
+"```\n",
+"\n",
+"\n",
+"\n",
+"````{note}\n",
+"We can also visualise the Julia set with the code snippet:\n",
+"```python\n",
+"import matplotlib.pyplot as plt\n",
+"\n",
+"...\n",
+"\n",
+"plt.imshow(fractal, extent=[-extent, extent, -extent, extent], aspect='equal')\n",
+"plt.show()\n",
+"```\n",
+"but doing so will impact the numbers returned when we time our function, so it's important to remember this before trying to measure how long the function takes.\n",
+"````\n",
+"\n",
+"### Download Complete Serial File\n",
+"[Download complete serial fractal example file](complete_files/fractal_complete.py)\n",
+"\n",
+"### Parallelising our serial problem\n",
+"\n",
+"Next we are going to solve the Julia set problem in parallel using *message passing*. As mentioned earlier, this is a relatively simple problem to parallelise. If we consider running the program with multiple processes, all we need to do to divide the work is to divide the complex grid up between the processes. Thinking back to previous sections, we covered an MPI function that can achieve this - the `scatter` method of the MPI communicator.\n",
 "\n",
 "We can directly take the example from the previous chapter and apply it to the complex mesh creation function:\n",
 "\n",
individual_modules/parallel_computing/multiprocessing_cpu.ipynb

Lines changed: 76 additions & 0 deletions

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0d031ccb-54d9-44a8-8282-63064ec52ba0",
   "metadata": {},
   "source": [
    "# Python Multiprocessing\n",
    "\n",
    "## Learning Objectives\n",
    "\n",
    "By the end of this lesson, learners will be able to:\n",
    "\n",
    "- Use Python's `multiprocessing` library to parallelise a CPU bound problem.\n",
    "- Set up a pool of workers and delegate tasks to different processes to run concurrently, using the `ProcessPoolExecutor` class.\n",
    "\n",
    "\n",
    "## CPU bound example with Python multiprocessing\n",
    "\n",
    "In this simple example we will create an expensive CPU bound recursive function that generates the n-th Fibonacci number several times, and report the amount of time it took to run for benchmarking purposes. Note there are much more efficient ways to implement this function, but this expensive implementation is deliberate. First we will do it serially, and then use the multiprocessing library to delegate blocks of function calls to different processes. This problem is CPU bound, as the time it takes for the process to complete depends on the speed of the CPU."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b45c5b07-4d05-4705-93fe-fd841171e4cc",
   "metadata": {},
   "source": [
    "### The serial example\n",
    "\n",
    "[Download complete serial cpu bound example file](complete_files/cpu_bound_complete.py)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56a0bb9c",
   "metadata": {},
   "source": [
    "### The multiprocessing example\n",
    "\n",
    "[Download complete multiprocessing cpu bound example file](complete_files/multiprocessing_cpu_bound_complete.py)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b2758e5a-7cc5-43e9-a417-78cc9dabc07c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from jupyterquiz import display_quiz\n",
    "display_quiz(\"questions/summary_multithreading.json\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
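
Since the lesson contrasts serial and pooled runs, a minimal sketch (not part of this commit) of why a thread pool would not help this CPU bound workload: threads share one interpreter and the GIL serialises CPU-bound bytecode, while separate processes can occupy several cores at once.

import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

def benchmark(executor_cls):
    # Time the same CPU-bound workload under a given executor type.
    start = time.perf_counter()
    with executor_cls(max_workers=5) as executor:
        list(executor.map(fibonacci, [30] * 20))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"Threads:   {benchmark(ThreadPoolExecutor):.2f}s")   # GIL-bound, roughly serial
    print(f"Processes: {benchmark(ProcessPoolExecutor):.2f}s")  # can use multiple cores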
