Commit 55cea4d

sbaldu and vvolkl authored
Add Patatrack project proposal for GSoC 2025 (#1677)
* Add Patatrack project for year 2025
* Add project proposal for Patatrack and update list of mentors
* Fix links section
* Fix typo in Patatrack project
* Fix path of patatrack logo
* Update _gsocproposals/2025/proposal_CLUEsteringAutotuning.md

---------

Co-authored-by: Valentin Volkl <[email protected]>
1 parent 262f277 commit 55cea4d

3 files changed, +73 -0 lines changed
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+project: Patatrack
+layout: default
+logo: patatrack-logo.png
+description: |
+  The [Patatrack](https://patatrack.web.cern.ch/patatrack/index.html) project was started in 2016 by a group of people with various areas of expertise, such as software optimization, heterogeneous computing, track reconstruction and the High Level Trigger (HLT) of the CMS experiment at CERN. The goal was to demonstrate that part of the HLT reconstruction could be efficiently offloaded to machines equipped with GPUs for parallel execution. Nowadays, Patatrack developments have been integrated into the CMS software for event reconstruction, and the project focuses on exploring innovative software and hardware technologies to bring smart software closer to the detector read-out at CERN experiments.
+---
+{% include gsoc_project.ext %}
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+---
+title: Development of an auto-tuning tool for the CLUEstering library
+layout: gsoc_proposal
+project: Patatrack
+year: 2025
+organization: CERN
+---
+
+## Description
+[CLUE][clue] is a fast and fully parallelizable density-based clustering algorithm, optimized for
+high-occupancy scenarios, where the number of clusters is much larger than the average number of hits
+in a cluster ([Rovere et al. 2020][cluepaper]). The algorithm uses a grid spatial index for fast querying of
+neighbors and its timing scales linearly with the number of points within the range considered. It is
+currently used in the CMS and CLIC event reconstruction software for clustering calorimetric hits in
+two dimensions based on their energy. The CLUE algorithm has been generalized to an arbitrary
+number of dimensions and to a wider range of applications in [CLUEstering][cluestering], a general-purpose
+clustering library with a C++ backend and a Python interface for
+easier use. The library can run on multiple backends (serial, TBB, GPUs, etc.) thanks
+to the [Alpaka][alpakapaper] performance portability library. One feature that CLUEstering currently lacks,
+and that would be useful to every user, is automatic tuning of the parameters: given
+the expected number of clusters, the tool would compute the combination of input parameters that yields the best
+clustering.
+For this task, one of the options to be explored is “The Optimizer”, a Python library developed by
+the Patatrack group of the CMS experiment, which provides a collection of optimization algorithms,
+in particular MOPSO (Multi-Objective Particle Swarm Optimization).
+
+## Expected results
+* Consider the best techniques and tools for the task
+* Develop an auto-tuning tool for the parameters of CLUEstering
+* Test it on a wide range of commonly used datasets
+* Benchmark and profile to identify the bottlenecks of the tool and optimize it
+
+## Evaluation Task
+Interested students, please contact [email protected]
+
+## Technologies
+* C++, Python
+
+## Desirable skills
+* Experience with development in C++17/20
+* Experience with GPU computing
+* Experience with machine learning and optimization techniques
+* Experience with development of Python libraries
+
+## Additional information
+* Difficulty level (low, medium, hard): medium
+* Duration: 350 hours
+* Mentor availability: June-October
+
+## Mentors
+* **[Simone Balducci](mailto:[email protected]) (CERN UNIBO)**
+* [Felice Pantaleo](mailto:[email protected]) (CERN)
+
+## Links
+* [CLUE][clue]
+* [CLUEstering][cluestering]
+* [Alpaka][alpaka]
+
+[clue]: https://gitlab.cern.ch/kalos/clue
+[cluestering]: https://github.com/cms-patatrack/CLUEstering
+[cluepaper]: https://www.frontiersin.org/articles/10.3389/fdata.2020.591315/full
+[alpakapaper]: https://arxiv.org/abs/1602.08477
+[alpaka]: https://github.com/alpaka-group/alpaka
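
The tuning goal stated in the proposal's Description (given an expected number of clusters, find the input parameters that reproduce it) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `count_clusters` is a toy stand-in that links nearby points and is not the CLUEstering API, the parameter names `link_distance` and `min_size` are hypothetical, and a plain exhaustive grid search is used in place of MOPSO or any other optimizer the project might adopt.

```python
import itertools
import math
import random


def count_clusters(points, link_distance, min_size):
    """Hypothetical stand-in for a clustering call (NOT the CLUEstering API).

    Links every pair of points closer than `link_distance`, then counts the
    connected groups that contain at least `min_size` points.
    """
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the group representative, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < link_distance:
            parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(points)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(1 for size in sizes.values() if size >= min_size)


def autotune(points, expected_clusters, distance_grid, min_size_grid):
    """Exhaustive search: keep the first parameter pair whose cluster count
    is closest to the expected one."""
    best = None
    for link_distance, min_size in itertools.product(distance_grid, min_size_grid):
        found = count_clusters(points, link_distance, min_size)
        score = abs(found - expected_clusters)
        if best is None or score < best[0]:
            best = (score, link_distance, min_size, found)
    return best


def make_blobs(centers, n_per_blob, spread, seed=0):
    """Generate well-separated 2D Gaussian blobs as toy input data."""
    rng = random.Random(seed)
    return [
        (cx + rng.gauss(0, spread), cy + rng.gauss(0, spread))
        for cx, cy in centers
        for _ in range(n_per_blob)
    ]


points = make_blobs([(0, 0), (10, 0), (0, 10)], n_per_blob=30, spread=0.5)
score, link_distance, min_size, found = autotune(
    points,
    expected_clusters=3,
    distance_grid=[0.05, 2.0, 20.0],
    min_size_grid=[5],
)
print(f"link_distance={link_distance}, min_size={min_size}, clusters found={found}")
```

Exhaustive search only scales to a handful of parameters and a single objective. A swarm-based optimizer such as MOPSO, which the proposal names as one option, would presumably replace the loop in `autotune` while keeping a similar objective.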

gsoc/2025/mentors.md

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,7 @@ layout: plain
 **Note for contributors:** entries must be sorted in **last name** alphabetic order
 
 ## Full Mentor List (Name, Email, Org)
+* Simone Balducci [[email protected]](mailto:[email protected]) CERN
 * Martin Barisits [[email protected]](mailto:[email protected]) CERN
 * Lukas Breitwieser [[email protected]](mailto:[email protected]) CERN
 * Andy Buckley [[email protected]](mailto:[email protected]) UofGlasgow
@@ -18,6 +19,7 @@ layout: plain
 * David Lange [[email protected]](mailto:[email protected]) CompRes
 * Serguei Linev [[email protected]](mailto:[email protected]) GSI
 * Peter McKeown [[email protected]](mailto:[email protected]) CERN
+* Felice Pantaleo [[email protected]](mailto:[email protected]) CERN
 * Giacomo Parolini [[email protected]](mailto:[email protected]) CERN
 * Alexander Penev [[email protected]](mailto:[email protected]) CompRes/University of Plovdiv, BG
 * Mayank Sharma [[email protected]](mailto:[email protected]) UMich
