Geant4 fast sim 26 GSoC proposal #1830
Merged
+97
−0
5 commits
bbc0692
Add Geant4-FastSim project proposal for GSoC 2026
peter-mckeown ffca631
Update ML4EP project description
peter-mckeown d403575
Update _gsocproposals/2026/proposal_Geant4_fastsim_Data.md
vvolkl d229752
Apply suggestions from code review
vvolkl 7b760a8
Update _gsocproposals/2026/proposal_Geant4_fastsim_Data.md
vvolkl
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| --- | ||
| project: Geant4 | ||
| layout: default | ||
| logo: Geant4-logo.png | ||
| description: | | ||
| [Geant4](https://geant4.web.cern.ch/) is a toolkit for the simulation of the | ||
| passage of particles through matter. Its areas of application include high | ||
| energy, nuclear and accelerator physics, as well as studies in medical and space | ||
| science. The three main reference papers for Geant4 are published in Nuclear | ||
| Instruments and Methods in Physics Research A 506 (2003) 250-303, IEEE | ||
| Transactions on Nuclear Science 53 No. 1 (2006) 270-278 and Nuclear Instruments | ||
| and Methods in Physics Research A 835 (2016) 186-225. | ||
| summary: | | ||
| [Geant4](https://geant4.web.cern.ch/) is a toolkit for the simulation of the passage of particles through matter. | ||
| --- | ||
|
|
||
| {% include gsoc_project.ext %} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| --- | ||
| project: ML4EP | ||
| layout: default | ||
| logo: | ||
| description: | | ||
| ML4EP is a project of the CERN [SFT group](https://ep-dep-sft.web.cern.ch) focused on developing common machine learning (ML) software tools to support HEP experiments. The current ongoing activities are: | ||
| - Designing generic generative ML models for fast simulation of calorimeter showers | ||
| - Developing ML software for efficient inference in C++, such as [SOFIE](https://root.cern/manual/tmva/#sofie) and creating interfaces between externally provided ML software and HEP software like [ROOT](https://root.cern) | ||
| - Building tools for ML inference in FPGAs such as [hls4ml](https://fastmachinelearning.org/hls4ml/) | ||
| - Developing common libraries for model compression and quantization, facilitating optimized ML workflows and the porting of ML HEP applications to real-time environments. | ||
| summary: | | ||
| ML4EP is a project of the CERN [SFT group](https://ep-dep-sft.web.cern.ch) focused on developing common machine learning (ML) software tools to support HEP experiments. | ||
| --- | ||
|
|
||
| {% include gsoc_project.ext %} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| --- | ||
| title: Optimisation and validation of shower data for ML-based calorimeter simulation | ||
| layout: gsoc_proposal | ||
| project: | ||
| - Geant4 | ||
| - ML4EP | ||
| year: 2026 | ||
| difficulty: medium | ||
| duration: 350 | ||
| mentor_avail: June-October | ||
| organization: CERN | ||
| project_mentors: | ||
| - email: [email protected] | ||
| first_name: Peter | ||
| last_name: McKeown | ||
| organization: CERN | ||
| is_preferred_contact: yes | ||
| - email: [email protected] | ||
| first_name: Anna | ||
| last_name: Zaborowska | ||
| organization: CERN | ||
| --- | ||
| ## Description | ||
|
|
||
| Particle physics experiments, such as those operated at the Large Hadron Collider, fundamentally rely on accurate simulations of interactions between particles and the detector. The Geant4 toolkit provides the state-of-the-art means of conducting these simulations with traditional Monte Carlo techniques. However, the vastly increased simulation requirements of future experiments, such as those which will be operated at the HL-LHC, require the adoption of alternative approaches. This is particularly true for particle shower simulation in the calorimeter systems of experiments. Fast simulation approaches based on generative models have been shown to provide fast yet accurate simulation surrogates, and have recently started to be deployed in production by current LHC experiments. | ||
|
|
||
| This project seeks to produce calorimeter shower datasets in the form of point clouds, which are a flexible representation of showers, particularly suited to highly granular calorimeters. Physics validation of the dataset will be conducted to ensure appropriate optimisation of the algorithm used to produce the point cloud, as well as sufficient coverage of the detector. This work will be done in the Key4hep framework under development for future colliders. | ||
|
|
||
| ## First Steps | ||
|
|
||
| 1. Gain a basic understanding of calorimeter shower simulation ([G4FastSim](https://g4fastsim.web.cern.ch/)) | ||
| 2. Try simulating some electromagnetic particle showers with the [Key4hep](https://key4hep.github.io/key4hep-doc/) framework (see test) | ||
| 3. Propose a work plan towards producing a dataset appropriate for training an ML model, including studies related to physics validation | ||
|
|
||
| ## Project Milestones | ||
|
|
||
| - Starting with electromagnetic showers, tune existing algorithms to produce point cloud representations of showers, ensuring appropriate preservation of single-shower observables | ||
| - Perform initial studies into detector coverage, to ensure sufficient training statistics across the detector | ||
| - Repeat these procedures for hadronic showers | ||
|
|
||
| ## Expected Results | ||
|
|
||
| - A complete workflow for producing large-scale datasets of calorimeter showers in the form of point clouds that are suitable for training a production-ready ML model | ||
| - A complete validation of the physics performance of the obtained point cloud, which minimises the number of points while retaining physics accuracy | ||
| - If time allows, finalised datasets for both electromagnetic and hadronic showers | ||
|
|
||
| ## Requirements | ||
|
|
||
| * C++, Python | ||
| * Familiarity with PyTorch could be an advantage | ||
|
|
||
| ## Evaluation Tasks and Timeline | ||
|
|
||
| 1. Find the test [here](https://docs.google.com/document/d/1nieJAOx0t4V1ZoxegGFsCqflkHziakCxMGor1o5zDsQ/edit?usp=sharing). Please submit it by 9:00 am CET 9th March 2026 along with a short proposal (2 pages max) describing how you would approach the problem. See submission instructions in the test document. Please don't forget to start the subject line with “GSoC’26 FastSim”. | ||
| 2. We will make the selections based on the test, short proposal and resume by 17:00 CET 16th March. | ||
| 3. Selected candidates will then write the full proposal and submit it according to the official GSoC timeline. | ||
|
|
||
| ## Mentors | ||
| (As we typically receive a large number of responses and we are not able to reply to all initial messages, please only contact us after completing the test) | ||
| * [Peter McKeown](mailto:[email protected]) (CERN) | ||
| * Anna Zaborowska (CERN) | ||
|
|
||
| ## Links | ||
| * [G4FastSim](https://g4fastsim.web.cern.ch/) | ||
| * [CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation](https://arxiv.org/abs/2410.21611) | ||
| * [step2point dataset](https://arxiv.org/abs/2509.22340) | ||
| * [LEMURS dataset](https://arxiv.org/html/2509.05108v2) | ||
| * [A First Full Physics Benchmark for Highly Granular Calorimeter Surrogates](https://arxiv.org/abs/2511.17293) | ||
|
|
||
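The first milestone in the proposal (producing a point cloud while preserving single-shower observables) can be illustrated with a minimal sketch. This is not the algorithm used by the project or by Key4hep; it assumes a simple scheme in which energy deposits falling in the same cubic cell are merged into one energy-weighted point. The function names, the cell size, the toy shower and the units (mm, MeV) are all hypothetical and chosen only for illustration.

```python
import numpy as np


def merge_steps_to_points(positions, energies, cell_size):
    """Merge energy deposits that fall in the same cubic cell into one point.

    Hypothetical helper: assumes strictly positive energies and a regular
    grid of cubic cells with edge length `cell_size`.
    """
    positions = np.asarray(positions, dtype=float)
    energies = np.asarray(energies, dtype=float)
    # Integer cell index of every deposit along x, y, z.
    cells = np.floor(positions / cell_size).astype(np.int64)
    # Map every deposit to the index of its unique cell.
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # the returned shape differs between NumPy versions
    n_points = inverse.max() + 1
    # Summed energy per cell.
    point_energy = np.bincount(inverse, weights=energies, minlength=n_points)
    # Energy-weighted mean position per cell.
    weighted = np.zeros((n_points, 3))
    np.add.at(weighted, inverse, positions * energies[:, None])
    point_pos = weighted / point_energy[:, None]
    return point_pos, point_energy


def shower_observables(positions, energies):
    """Return total energy and energy-weighted centroid of a shower."""
    total = energies.sum()
    centroid = (positions * energies[:, None]).sum(axis=0) / total
    return total, centroid


# Toy shower (assumed values): 10,000 deposits, positions in mm, energies in MeV.
rng = np.random.default_rng(0)
steps_xyz = rng.normal(loc=0.0, scale=[20.0, 20.0, 100.0], size=(10_000, 3))
steps_e = rng.exponential(scale=0.1, size=10_000)

points_xyz, points_e = merge_steps_to_points(steps_xyz, steps_e, cell_size=5.0)

e_steps, c_steps = shower_observables(steps_xyz, steps_e)
e_points, c_points = shower_observables(points_xyz, points_e)
print(f"deposits: {len(steps_e)}, points: {len(points_e)}")
print(f"total energy before/after: {e_steps:.6f} / {e_points:.6f}")
```

With energy-weighted merging, the total energy and the energy-weighted centroid are preserved up to floating-point error by construction. Observables that depend on the shape of the shower, such as its longitudinal and radial profiles, are not, and those are what a physics validation of the chosen cell size would need to compare.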