ICR-RSE-Group · msarkis-icr · Sep 5, 2025 · Sep 22, 2025 · Sep 22, 2025
diff --git a/episodes/notebooks/0-introduction.ipynb b/episodes/notebooks/0-introduction.ipynb
@@ -0,0 +1,80 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "e711a389-fa02-411b-a549-4612172c6a01",
+   "metadata": {},
+   "source": [
+    "# This course\n",
+    "* Is designed for researchers who write Python but lack formal computer science training\n",
+    "* Teaches how to assess where time is spent during the execution of a Python program\n",
+    "* Provides a high-level understanding of how code executes\n",
+    "* Explains how execution maps to performance bottlenecks and highlights good practices"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e2ab7b89-eada-4c43-8259-066cd5ec53ab",
+   "metadata": {},
+   "source": [
+    "# Expected Outcomes: \n",
+    "After this training, participants will be able to:\n",
+    "* Use tools like cProfile and line_profiler to find which functions or lines of code take the most time.\n",
+    "* Inspect code to understand what slows it down.\n",
+    "* Recognise some common performance problems and apply fixes to make code run faster."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "303b5679-d006-4682-aa54-f818a77dbffb",
+   "metadata": {},
+   "source": [
+    "# Requirements to follow along with the course:\n",
+    "1. Create a conda environment using Python 3.11 or newer.\n",
+    "   On the command line, run:\n",
+    "```bash\n",
+    "conda create --name py311_env python=3.11\n",
+    "conda activate py311_env\n",
+    "```\n",
+    "2. Install the required packages:\n",
+    "```bash\n",
+    "pip install pytest snakeviz \"line_profiler[all]\" numpy pandas matplotlib\n",
+    "```\n",
+    "\n",
+    "**Note for macOS users**:\n",
+    "`line_profiler` can also be installed using conda:\n",
+    "```bash\n",
+    "conda install -c conda-forge line_profiler\n",
+    "```\n",
+    "\n",
+    "3. Install JupyterLab and register an IPython kernel for the environment:\n",
+    "```bash\n",
+    "pip install jupyterlab \n",
+    "conda install ipykernel -y\n",
+    "python -m ipykernel install --user --name py311_env --display-name \"py311_env\"\n",
+    "```\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/episodes/notebooks/1-profiling-introduction.ipynb b/episodes/notebooks/1-profiling-introduction.ipynb
@@ -0,0 +1,188 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "a47fdbc2-1d23-4913-9474-49c255ecbcd9",
+   "metadata": {},
+   "source": [
+    "# What is Performance Profiling?\n",
+    "It is the process of measuring and analysing the performance of code while it runs.\n",
+    "It is a form of dynamic analysis: the program is observed during execution rather than inspected statically.\n",
+    "\n",
+    "# Why should you profile your code?\n",
+    "##### 1. To assess the performance of your program/code, i.e., to identify which operations take the longest to execute\n",
+    "##### 2. It is useful when code grows more complex, making slow parts harder to spot\n",
+    "##### 3. Profiling highlights true bottlenecks, avoiding wasted effort on minor optimisations, and can lead to dramatic speedups.\n",
+    "##### 4. In HPC and beyond, profiling also ensures efficient use of energy and resources.\n",
+    "##### 5. It is a quick and inexpensive process, i.e., you get immediate feedback about your code's performance\n",
+    "* If no bottleneck is identified, you can be confident your code is performant\n",
+    "* Otherwise, the profiler will identify the piece of code that can benefit from optimisation.\n",
+    "##### 6. Profiling is for everyone, not only for novices!\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "239476f1-1f05-4151-98f8-8586d7872897",
+   "metadata": {},
+   "source": [
+    "## Different types of profilers:\n",
+    "* Manual profiling\n",
+    "* Function-Level Profiling\n",
+    "* Line-Level Profiling, among others"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "849a0bb2-5fd2-4a74-b41b-85cfa9384158",
+   "metadata": {},
+   "source": [
+    "# 1. Manual profiling\n",
+    "* Manual profiling means adding timers by hand around sections of code\n",
+    "* It provides a simple way to measure execution time and get a basic form of profiling.\n",
+    "* However, it is intrusive, because it adds many `temporary lines of code`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "d2979752-71b3-4b56-89b2-5f40bce977bb",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "first hello\n",
+      "hello all\n",
+      "A: 0.0003226249828003347 seconds\n",
+      "B: 3.500000457279384e-05 seconds\n",
+      "C: 6.700001540593803e-05 seconds\n",
+      "Total: 0.00044095798511989415 seconds\n"
+     ]
+    }
+   ],
+   "source": [
+    "# example of manual profiling:\n",
+    "import time\n",
+    "\n",
+    "# Record timestamps before and after different sections of code using time.monotonic()\n",
+    "t_a = time.monotonic()\n",
+    "print('first hello')\n",
+    "\n",
+    "t_b = time.monotonic()\n",
+    "a = \"hello\"\n",
+    "\n",
+    "t_c = time.monotonic()\n",
+    "c = a + \" all\"\n",
+    "print(c)\n",
+    "\n",
+    "t_d = time.monotonic()\n",
+    "mainTimer_stop = time.monotonic()\n",
+    "\n",
+    "# Calculate the time taken by each block by subtracting its start time from its end time.\n",
+    "print(f\"A: {t_b - t_a} seconds\")\n",
+    "print(f\"B: {t_c - t_b} seconds\")\n",
+    "print(f\"C: {t_d - t_c} seconds\")\n",
+    "print(f\"Total: {mainTimer_stop - t_a} seconds\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "cacf5d33-73d3-4d31-9019-929dc6f5813b",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "0.7316456299404912"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "0.0003226249828003347/0.00044095798511989415  # fraction of the total runtime spent in block A"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b284fa39-1d01-45c4-9346-d5e73fe6f201",
+   "metadata": {},
+   "source": [
+    "### Summary of manual profiling:\n",
+    "* It is handy for small sections of code\n",
+    "* It becomes increasingly impractical as a project grows in size and complexity\n",
+    "* Routinely adding and removing timing statements is time-consuming when the timings are not needed as program output"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "db4706da-ddd2-4c5a-acd7-c82889eedba2",
+   "metadata": {},
+   "source": [
+    "# 2. Function-Level Profiling\n",
+    "Software is made up of many functions, including those you write and those from the standard library or third-party packages.\n",
+    "\n",
+    "* `Function-level profiling` measures how much time your program spends in each function, including or excluding time spent in child functions\n",
+    "* It also counts how often each function is called\n",
+    "* This helps identify the functions that take up the most time, so you can focus on optimising them.\n",
+    "* `Function-level profiling` may not always give enough detail, especially if a function is particularly complex.\n",
+    "\n",
+    "In this course, we will use `cProfile` for function-level profiling and `snakeviz` to visualise the results."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7649e2fd-5abc-4004-ba53-e74db341d5dc",
+   "metadata": {},
+   "source": [
+    "# 3. Line-Level Profiling\n",
+    "\n",
+    "* `Line-level profiling` looks at how much time is spent on `each individual line of code`.\n",
+    "* This helps identify specific lines that take up a large portion of the total runtime.\n",
+    "* In this course, we will use `line_profiler` for `line-level profiling`.\n",
+    "* `line_profiler` is deterministic, tracking every line of code executed, which can be very expensive\n",
+    "* To keep this overhead manageable, profiling is restricted to the functions marked with the `@profile` decorator.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ebba3060-dccc-483b-bdee-95306246f6a7",
+   "metadata": {},
+   "source": [
+    "## Start Small, Scale Smart\n",
+    "Profile a representative test case that is large enough to amplify any bottlenecks while still running to completion quickly.\n",
+    "\n",
+    "* Profiling slows programs down, so use a small, representative test case.\n",
+    "\n",
+    "* Keep runs short (a few minutes if possible) to avoid generating huge amounts of output data.\n",
+    "\n",
+    "* Start small (e.g., one day of a year-long model) and scale up only if needed to spot bottlenecks."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}