17 changes: 17 additions & 0 deletions _gsocprojects/2026/project_Geant4.md
@@ -0,0 +1,17 @@
---
project: Geant4
layout: default
logo: Geant4-logo.png
description: |
[Geant4](https://geant4.web.cern.ch/) is a toolkit for the simulation of the
passage of particles through matter. Its areas of application include high
energy, nuclear and accelerator physics, as well as studies in medical and space
science. The three main reference papers for Geant4 are published in Nuclear
Instruments and Methods in Physics Research A 506 (2003) 250-303, IEEE
Transactions on Nuclear Science 53 No. 1 (2006) 270-278 and Nuclear Instruments
and Methods in Physics Research A 835 (2016) 186-225.
summary: |
[Geant4](https://geant4.web.cern.ch/) is a toolkit for the simulation of the passage of particles through matter.
---

{% include gsoc_project.ext %}
15 changes: 15 additions & 0 deletions _gsocprojects/2026/project_ML4EP.md
@@ -0,0 +1,15 @@
---
project: ML4EP
layout: default
logo:
description: |
ML4EP is a project of the CERN [SFT group](https://ep-dep-sft.web.cern.ch) focused on developing common machine learning (ML) software tools to support HEP experiments. Current activities are:
- Designing generic generative ML models for fast simulation of calorimeter showers
- Developing ML software for efficient inference in C++, such as [SOFIE](https://root.cern/manual/tmva/#sofie), and creating interfaces between externally provided ML software and HEP software like [ROOT](https://root.cern)
- Building tools for ML inference in FPGAs such as [hls4ml](https://fastmachinelearning.org/hls4ml/)
- Developing common libraries for model compression and quantization, facilitating optimized ML workflows and the porting of HEP ML applications to real-time environments.
summary: |
ML4EP is a project of the CERN [SFT group](https://ep-dep-sft.web.cern.ch) focused on developing common machine learning (ML) software tools to support HEP experiments.
---

{% include gsoc_project.ext %}
69 changes: 69 additions & 0 deletions _gsocproposals/2026/proposal_Geant4_fastsim_Data.md
@@ -0,0 +1,69 @@
---
title: Optimisation and validation of shower data for ML-based calorimeter simulation
layout: gsoc_proposal
project:
- Geant4
- ML4EP
year: 2026
difficulty: medium
duration: 350
mentor_avail: June-October
organization: CERN
project_mentors:
- email: [email protected]
first_name: Peter
last_name: McKeown
organization: CERN
is_preferred_contact: yes
- email: [email protected]
first_name: Anna
last_name: Zaborowska
organization: CERN
---
## Description

Particle physics experiments, such as those operated at the Large Hadron Collider, fundamentally rely on accurate simulations of interactions between particles and the detector. The Geant4 toolkit provides the state-of-the-art means of conducting these simulations with traditional Monte Carlo techniques. However, the vastly increased simulation requirements of future experiments, such as those which will be operated at the HL-LHC, require the adoption of alternative approaches. This is particularly true for particle shower simulation in the calorimeter systems of experiments. Approaches based on generative models have been shown to provide fast yet accurate simulation surrogates, and have recently started to be deployed in production by current LHC experiments.

This project seeks to produce calorimeter shower datasets in the form of point clouds, which are a flexible representation of showers, particularly suited to highly granular calorimeters. Physics validation of the dataset will be conducted to ensure appropriate optimisation of the algorithm used to produce the point cloud, as well as sufficient coverage of the detector. This work will be done in the Key4hep framework under development for future colliders.
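
As a rough illustration of what a point cloud representation looks like, the sketch below stores one shower as a list of `(x, y, z, energy)` tuples and computes two simple single-shower observables from it. The tuple layout, the function names and the toy values are assumptions made for this example only; the actual dataset format used in the project may differ.

```python
from typing import List, Tuple

# Hypothetical layout: one shower is a list of (x, y, z, energy) points.
Point = Tuple[float, float, float, float]


def total_energy(shower: List[Point]) -> float:
    """Sum of the energy carried by all points of the shower."""
    return sum(energy for _, _, _, energy in shower)


def centre_of_gravity(shower: List[Point]) -> Tuple[float, float, float]:
    """Energy-weighted mean position of the shower."""
    e_tot = total_energy(shower)
    if e_tot <= 0.0:
        raise ValueError("shower has no deposited energy")
    return tuple(sum(p[i] * p[3] for p in shower) / e_tot for i in range(3))


# Toy shower with two points (arbitrary units, illustrative values only).
shower = [
    (0.0, 0.0, 10.0, 1.0),
    (2.0, 0.0, 20.0, 3.0),
]

print("total energy:", total_energy(shower))
print("centre of gravity:", centre_of_gravity(shower))
```

Because the number of points can vary from shower to shower, this kind of representation does not depend on a fixed cell grid, which is what makes it convenient for highly granular calorimeters.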

## First Steps

1. Gain a basic understanding of calorimeter shower simulation ([G4FastSim](https://g4fastsim.web.cern.ch/))
2. Try simulating some electromagnetic particle showers with the [Key4hep](https://key4hep.github.io/key4hep-doc/) framework (see the test linked under Evaluation Tasks and Timeline)
3. Propose a work plan towards producing a dataset appropriate for training an ML model, including studies related to physics validation

## Project Milestones

- Starting with electromagnetic showers, tune existing algorithms to produce point cloud representations of showers, ensuring appropriate preservation of single-shower observables
- Perform initial studies into detector coverage, to ensure sufficient training statistics across the detector
- Repeat these procedures for hadronic showers
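
The first milestone can be pictured with a minimal sketch: simulation steps are merged into fewer points, and single-shower observables are compared before and after merging. The grid-based merging below is a generic stand-in chosen for illustration, not the existing algorithm that the project will tune; `merge_steps`, `cell_size` and the toy step values are hypothetical.

```python
import math
from collections import defaultdict


def merge_steps(steps, cell_size):
    """Merge (x, y, z, energy) steps that fall into the same cubic cell.

    Each output point carries the summed energy of its cell and sits at the
    energy-weighted mean position of the merged steps.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    # Per cell accumulator: [sum(e*x), sum(e*y), sum(e*z), sum(e)]
    cells = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for x, y, z, e in steps:
        if e <= 0.0:
            continue  # ignore steps without an energy deposit
        key = tuple(math.floor(c / cell_size) for c in (x, y, z))
        acc = cells[key]
        acc[0] += e * x
        acc[1] += e * y
        acc[2] += e * z
        acc[3] += e
    return [(sx / se, sy / se, sz / se, se) for sx, sy, sz, se in cells.values()]


def total_energy(points):
    return sum(p[3] for p in points)


def centre_of_gravity(points):
    e_tot = total_energy(points)
    return tuple(sum(p[i] * p[3] for p in points) / e_tot for i in range(3))


# Toy steps (arbitrary units); the first two share a cell when cell_size is 1.0.
steps = [
    (0.1, 0.1, 0.1, 1.0),
    (0.3, 0.1, 0.1, 1.0),
    (1.2, 0.1, 0.1, 2.0),
]
points = merge_steps(steps, cell_size=1.0)

print(len(steps), "steps merged into", len(points), "points")
print("energy before/after:", total_energy(steps), total_energy(points))
```

With energy-weighted merging, the total energy and the centre of gravity are preserved up to floating-point rounding, so the observables that need careful validation are higher-order ones such as shower widths.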

## Expected Results

- A complete workflow for producing large-scale datasets of calorimeter showers in the form of point clouds that are suitable for training a production-ready ML model
- A complete validation of the physics performance of the resulting point clouds, with the number of points minimised while retaining physics accuracy
- If time allows, finalised datasets for both electromagnetic and hadronic showers
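
The trade-off between the number of points and physics accuracy could be studied with a scan like the one sketched here: a toy shower is merged at several granularities, and the shift in an energy-weighted lateral width is recorded next to the resulting point count. Everything in this block is an assumption made for illustration: the grid-based `merge`, the randomly generated toy shower, the choice of observable and the cell sizes. A real study would use Geant4 showers and a broader set of observables.

```python
import math
import random
from collections import defaultdict


def merge(points, cell):
    """Illustrative grid merging: one energy-weighted point per cubic cell."""
    cells = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for x, y, z, e in points:
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        acc = cells[key]
        acc[0] += e * x
        acc[1] += e * y
        acc[2] += e * z
        acc[3] += e
    return [(sx / se, sy / se, sz / se, se) for sx, sy, sz, se in cells.values()]


def lateral_width(points):
    """Energy-weighted RMS distance from the shower axis (assumed along z)."""
    e_tot = sum(p[3] for p in points)
    cx = sum(p[0] * p[3] for p in points) / e_tot
    cy = sum(p[1] * p[3] for p in points) / e_tot
    var = sum(p[3] * ((p[0] - cx) ** 2 + (p[1] - cy) ** 2) for p in points) / e_tot
    return math.sqrt(var)


def scan(points, cell_sizes):
    """Return (cell size, number of points, relative width shift) per cell size."""
    ref = lateral_width(points)
    rows = []
    for cell in cell_sizes:
        merged = merge(points, cell)
        shift = abs(lateral_width(merged) - ref) / ref
        rows.append((cell, len(merged), shift))
    return rows


def toy_shower(n=500, seed=0):
    """Random stand-in for a simulated shower; not a physical model."""
    rng = random.Random(seed)
    return [
        (rng.gauss(0.0, 5.0), rng.gauss(0.0, 5.0), rng.uniform(0.0, 100.0),
         rng.expovariate(1.0) + 1e-3)  # small offset keeps every energy positive
        for _ in range(n)
    ]


shower = toy_shower()
rows = scan(shower, [1.0, 2.0, 4.0, 8.0])
for cell, n_points, shift in rows:
    print(f"cell={cell:4.1f}  points={n_points:4d}  width shift={shift:.4f}")
```

Choosing the coarsest granularity whose shift stays below a tolerance agreed with the physics validation is one possible way to pick a working point.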

## Requirements

* C++, Python
* Familiarity with PyTorch could be an advantage

## Evaluation Tasks and Timeline

1. Find the test [here](https://docs.google.com/document/d/1nieJAOx0t4V1ZoxegGFsCqflkHziakCxMGor1o5zDsQ/edit?usp=sharing). Please submit it by 9:00 am CET on 9th March 2026 along with a short proposal (2 pages max) describing how you would approach the problem. See submission instructions in the test document. Please don't forget to start the subject line with “GSoC’26 FastSim”.
2. We will make the selections based on the test, short proposal and resume by 17:00 CET on 16th March 2026.
3. Selected candidates will then write the full proposal and submit it according to the official GSoC timeline.

## Mentors
(As we typically receive a large number of responses and cannot reply to all initial messages, please contact us only after completing the test.)
* [Peter McKeown](mailto:[email protected]) (CERN)
* Anna Zaborowska (CERN)

## Links
* [G4FastSim](https://g4fastsim.web.cern.ch/)
* [CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation](https://arxiv.org/abs/2410.21611)
* [step2point dataset](https://arxiv.org/abs/2509.22340)
* [LEMURS dataset](https://arxiv.org/html/2509.05108v2)
* [A First Full Physics Benchmark for Highly Granular Calorimeter Surrogates](https://arxiv.org/abs/2511.17293)