| 1 | +--- |
| 2 | +title: Estimating the energy cost of ML scientific software |
| 3 | +layout: gsoc_proposal |
| 4 | +project: SMARTHEP |
| 5 | +year: 2025 |
| 6 | +organization: |
| 7 | + - UManchester |
| 8 | + - CERN |
| 9 | +difficulty: medium |
| 10 | +duration: 350 |
| 11 | +mentor_avail: June-October (with 2-3 weeks mentor vacation where student will work independently with minimal guidance) |
| 12 | +--- |
| 13 | +# Description |
| 14 | + |
| 15 | +At a time when we hear about the “energy crisis” daily, |
| 16 | +we can’t help but wonder whether our research software can be made more sustainable, |
| 17 | +and more efficient as a byproduct. |
| 18 | +In particular, this question arises for ML scientific software used in high-throughput scientific |
| 19 | +computing, where large datasets composed of many similar chunks are analysed with similar operations |
| 20 | +on each chunk of data. |
| 21 | +Moreover, CPU/GPU-efficient software algorithms are crucial for the real-time data selection (trigger) |
| 22 | +systems in LHC experiments, |
| 23 | +as the initial data analysis necessary to select interesting collision events |
| 24 | +is executed on a computing farm located at CERN that has finite CPU resources. |
| 25 | + |
| 26 | +The questions we want to start answering in this work are: |
| 27 | + * what is the trade-off between the performance of an ML algorithm and its energy efficiency? |
| 28 | + * can small efficiency improvements in ML algorithms running on Large Hadron Collider data |
| 29 | + have a sizable impact on energy consumption? |
| 30 | + * how do these energy efficiency improvements vary |
| 31 | + when using different computing architectures (1) and/or job submission systems (2)? |
| 32 | + |
| 33 | +## Task ideas |
| 34 | + |
| 35 | +The student in this project will use metrics from the [Green Software Foundation](<https://greensoftware.foundation>) |
| 36 | +and from other selected resources to estimate the energy efficiency of machine learning software from LHC experiments |
| 37 | +(namely, top tagging using ATLAS Open Data) and of machine learning algorithms for data compression |
| 38 | +(there is another GSoC project developing this code, called Baler). |
| 39 | +This work will build on previous GSoC / Master's thesis work, and will extend these results to GPU architectures. |
| 40 | +If time allows, the student will then have the chance to make small changes to the code |
| 41 | +to make it more efficient, and evaluate possible savings. |
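
As an illustration of the kind of metric involved, the Green Software Foundation's Software Carbon Intensity (SCI) score is defined as `((E * I) + M) per R`: energy consumed `E`, carbon intensity of that energy `I`, embodied emissions `M`, and a functional unit `R`. The sketch below is a minimal example of that formula only. Whether SCI is the metric this project ends up using is an assumption, the function name `sci_score` is hypothetical, and all input numbers are made up for illustration.

```python
def sci_score(energy_kwh, carbon_intensity_g_per_kwh, embodied_g, functional_units):
    """Return an SCI-style score in gCO2eq per functional unit.

    Implements ((E * I) + M) / R, where:
      E = energy consumed by the software (kWh)
      I = carbon intensity of the electricity used (gCO2eq per kWh)
      M = embodied emissions attributed to the run (gCO2eq)
      R = number of functional units (e.g. collision events processed)
    """
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational_g = energy_kwh * carbon_intensity_g_per_kwh
    return (operational_g + embodied_g) / functional_units


# Hypothetical inputs, chosen only to show the arithmetic.
score = sci_score(
    energy_kwh=2.0,
    carbon_intensity_g_per_kwh=100.0,
    embodied_g=50.0,
    functional_units=1000,
)
print(f"{score:.3f} gCO2eq per event")
```

Normalising by a functional unit such as "events processed" is what would allow different algorithms or architectures to be compared on the same footing.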
| 42 | + |
| 43 | +## Expected results and milestones |
| 44 | + |
| 45 | + * Understand and summarise the metrics for software energy consumption, focusing on computing resources at CERN; |
| 46 | + * Become familiar with running and debugging the selected software frameworks and algorithms; |
| 47 | + * Set up tests and visualisation for applying metrics to the selected software; |
| 48 | + * Run tests and visualise results (preferably using a Jupyter notebook); |
| 49 | + * Vary platforms and job submission systems; |
| 50 | + * Identify possible improvements, apply them, and run tests again. |
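
A test of the kind listed above could start from something as simple as the following sketch. It times a workload over several runs and converts the mean runtime to an energy estimate using an assumed constant power draw. The function name `estimate_energy_joules`, the toy workload, and the 50 W figure are all hypothetical; a real measurement would read power from hardware counters (for example Intel RAPL or NVIDIA NVML) rather than assume a constant.

```python
import time


def estimate_energy_joules(workload, avg_power_watts, repeats=5):
    """Time `workload` and return (mean_seconds, estimated_joules) per run.

    The energy figure is mean runtime multiplied by an ASSUMED average
    power draw; it is a rough estimate, not a measurement.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    start = time.perf_counter()
    for _ in range(repeats):
        workload()
    mean_seconds = (time.perf_counter() - start) / repeats
    return mean_seconds, mean_seconds * avg_power_watts


def toy_workload():
    """Stand-in for an ML inference step (hypothetical)."""
    return sum(i * i for i in range(100_000))


seconds, joules = estimate_energy_joules(toy_workload, avg_power_watts=50.0, repeats=3)
print(f"mean runtime: {seconds:.4f} s, estimated energy: {joules:.3f} J per run")
```

Running the same harness before and after a code change, or on different platforms, gives the paired numbers needed for the comparison steps in the list.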
| 51 | + |
| 52 | +## Requirements |
| 53 | + |
| 54 | + * Python |
| 55 | + * git |
| 56 | + * Jupyter notebooks |
| 57 | + * PyTorch or equivalent ML toolkit |
| 58 | + * Desirable: code profiling experience |
| 59 | + |
| 60 | +## Mentors |
| 61 | + |
| 62 | + * **[Caterina Doglioni](mailto:[email protected])** |
| 63 | + * **[Tobias Fitschen](mailto:[email protected])** as backup mentor |
| 64 | + * **[James Smith](mailto:[email protected])** as backup mentor |
| 65 | + |
| 66 | +## Links |
| 67 | + |
| 68 | + * (1) [Green Software Foundation course](<https://learn.greensoftware.foundation/>) |
| 69 | + * (2) [Code by the previous GSoC student](<https://summerofcode.withgoogle.com/archive/2023/projects/Nks9akq7>) |