
Commit ed4d3f9

Merge branch 'HSF:main' into main
2 parents 5e82589 + acf7c34 commit ed4d3f9

21 files changed: +827 -1 lines changed

_gsocorgs/2025/princeton.md

Lines changed: 11 additions & 0 deletions
---
title: "Princeton University"
author: "Lino Gerlach"
layout: default
organization: princeton
logo: princeton-logo.png
description: |
  Princeton University is a private Ivy League research university in Princeton, New Jersey.
---

{% include gsoc_proposal.ext %}
Lines changed: 13 additions & 0 deletions
---
project: BioDynamo
layout: default
logo: BioDynamo-logo.png
description: |
  The Biology Dynamics Modeller ([BioDynaMo](https://www.biodynamo.org/home-page)) is open-source, agent-based simulation software that was originally designed to simulate the behaviour of billions of cells. Agent-based modelling (ABM) is a powerful methodology for studying complex systems in biology, epidemiology, economics, the social sciences, medicine, and more. BioDynaMo is a software platform to easily create, run, and visualise 3D agent-based simulations. BioDynaMo allows users to reuse, adapt, or create modules that represent a specific biological behaviour or entity. The core of the platform is written in C++ and is highly optimized to harness the computational power of modern hardware.

summary: |
  [BioDynaMo](https://www.biodynamo.org/developer-guide/documentation) is agent-based simulation software designed to simulate the behaviour of biological entities.
---

{% include gsoc_project.ext %}
Lines changed: 11 additions & 0 deletions
---
project: CICADA
layout: default
logo: cicada-logo.png
description: |
  The Calorimeter Image Convolutional Anomaly Detection Algorithm ([CICADA](https://cicada.web.cern.ch/)) uses low-level trigger calorimeter information from the Compact Muon Solenoid (CMS) experiment as the input to a convolutional autoencoder to find anomalies produced during Large Hadron Collider proton-proton collisions. Quantization Aware Training and Knowledge Distillation are used to compress the model for sub-500 ns inference on Field-Programmable Gate Arrays.
summary: |
  [CICADA](https://cicada.web.cern.ch/) (Calorimeter Image Convolutional Anomaly Detection Algorithm): real-time anomaly detection at CMS.
---

{% include gsoc_project.ext %}
Lines changed: 17 additions & 0 deletions
---
project: Geant4
layout: default
logo: Geant4-logo.png
description: |
  [Geant4](https://geant4.web.cern.ch/) is a toolkit for the simulation of the
  passage of particles through matter. Its areas of application include high
  energy, nuclear and accelerator physics, as well as studies in medical and space
  science. The three main reference papers for Geant4 are published in Nuclear
  Instruments and Methods in Physics Research A 506 (2003) 250-303, IEEE
  Transactions on Nuclear Science 53 No. 1 (2006) 270-278, and Nuclear Instruments
  and Methods in Physics Research A 835 (2016) 186-225.
summary: |
  Geant4 is a toolkit for the simulation of the passage of particles through matter.
---

{% include gsoc_project.ext %}
Lines changed: 10 additions & 0 deletions
---
project: JuliaHEP
title: JuliaHEP
layout: default
logo: juliahep/juliaheplogo.png
description: |
  The [JuliaHEP](https://hepsoftwarefoundation.org/activities/juliahep.html) working group brings together a community of developers and users of Julia in particle physics, with the aim of improving the sharing of knowledge and expertise, as well as unifying efforts to develop Julia packages useful for the community.
---

{% include gsoc_project.ext %}
Lines changed: 53 additions & 0 deletions
---
title: Agent-Based Simulation of CAR-T Cell Therapy Using BioDynaMo
layout: gsoc_proposal
project: BioDynamo
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
  - CERN
  - CompRes
---

## Description

Chimeric Antigen Receptor T-cell (CAR-T) therapy has revolutionized cancer treatment by harnessing the immune system to target and destroy tumor cells. While CAR-T has demonstrated success in blood cancers, its effectiveness in solid tumors remains limited due to challenges such as poor tumor infiltration, immune suppression, and T-cell exhaustion. To improve therapy outcomes, computational modeling is essential for optimizing treatment parameters, predicting failures, and testing novel interventions. However, existing models of CAR-T behavior are often overly simplistic or computationally expensive, making them impractical for large-scale simulations.

This project aims to develop a scalable agent-based simulation of CAR-T therapy using BioDynaMo, an open-source, high-performance biological simulation platform. By modeling T-cell migration, tumor engagement, and microenvironmental factors, we will investigate key treatment variables such as dosage, administration timing, and combination therapies. The simulation will allow researchers to explore how tumor-microenvironment suppression (e.g., regulatory T-cells, hypoxia, immunosuppressive cytokines) affects CAR-T efficacy, and which strategies, such as checkpoint inhibitors or cytokine support, can improve outcomes.

The final deliverable will be a fully documented, reproducible BioDynaMo simulation, along with analysis tools for visualizing treatment dynamics. The model will provide insights into optimal CAR-T cell dosing, tumor penetration efficiency, and factors influencing therapy resistance. This project will serve as a foundation for in silico testing of immunotherapies, reducing the need for costly and time-consuming laboratory experiments while accelerating the development of more effective cancer treatments.
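
To give a flavor of the modeling style, here is a minimal sketch of how a CAR-T agent might be expressed in BioDynaMo's agent/behavior API. It follows the pattern of BioDynaMo's public demos (`Simulation`, `Cell`, `Behavior`, `AddBehavior`), but the `CarTMigration` class and all parameter values are illustrative assumptions, not part of the proposal's codebase:

```cpp
// Minimal sketch, modeled on BioDynaMo's public demos. CarTMigration and
// all parameter values are illustrative assumptions.
#include "biodynamo.h"

namespace bdm {

// Toy behavior: a CAR-T agent takes a random-walk migration step each
// timestep; a real model would add tumor engagement and killing.
struct CarTMigration : public Behavior {
  BDM_BEHAVIOR_HEADER(CarTMigration, Behavior, 1);

  void Run(Agent* agent) override {
    if (auto* cell = dynamic_cast<Cell*>(agent)) {
      auto* random = Simulation::GetActive()->GetRandom();
      cell->UpdatePosition(random->UniformArray<3>(-1, 1));
    }
  }
};

inline int Simulate(int argc, const char** argv) {
  Simulation simulation(argc, argv);
  auto* rm = simulation.GetResourceManager();

  auto* cart = new Cell(10);               // CAR-T agent, 10 um diameter
  cart->AddBehavior(new CarTMigration());
  rm->AddAgent(cart);
  rm->AddAgent(new Cell(30));              // a tumor cell to engage later

  simulation.GetScheduler()->Simulate(100);  // run 100 timesteps
  return 0;
}

}  // namespace bdm

int main(int argc, const char** argv) { return bdm::Simulate(argc, argv); }
```

In the full model, behaviors like this would be extended with tumor-cell engagement, exhaustion dynamics, and microenvironment suppression, following the phased plan below.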

## Expected plan of work

- Phase 1: Initial Setup & Simple T-cell Dynamics
- Phase 2: Advanced CAR-T Cell Behavior & Tumor Interaction
- Phase 3: Integration of Immunosuppressive Factors & Data Visualization

## Expected deliverables

* A fully documented BioDynaMo simulation of CAR-T therapy.
* Analysis scripts for visualizing tumor reduction and CAR-T efficacy.
* Performance benchmarks comparing different treatment strategies.
* A research-style report summarizing findings.

## Requirements

* C++ (for BioDynaMo simulations)
* Agent-based modeling (understanding immune dynamics)
* Basic immunology & cancer biology (optional but helpful)
* Data visualization (Python, Matplotlib, Seaborn)

## Mentors
* [Vassil Vassilev](mailto:[email protected])
* [Lukas Breitwieser](mailto:[email protected])

## Links
* [Mapping CAR T-Cell Design Space Using Agent-Based Models](https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2022.849363/full)
* [BioDynaMo: A Modular Platform for High-Performance Agent-Based Simulation](https://cds.cern.ch/record/2800211?ln=en)
* [Computational Modeling of Chimeric Antigen Receptor (CAR) T-Cell Therapy of a Binary Model of Antigen Receptors in Breast Cancer](https://ieeexplore.ieee.org/document/9669393)
* [Investigating Two Modes of Cancer-Associated Antigen Presentation in CAR T-Cell Therapy Using Agent-Based Modeling](https://www.mdpi.com/2073-4409/11/19/3165)
* [BioDynaMo: Cutting-Edge Software Helps Battle Cancer](https://home.cern/news/news/knowledge-sharing/biodynamo-cutting-edge-software-helps-battle-cancer)
Lines changed: 43 additions & 0 deletions
---
title: Implement and improve an efficient, layered tape with prefetching capabilities
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
  - CompRes
---

## Description

In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to numerically evaluate the derivative of a function specified by a computer program. Automatic differentiation is an alternative to symbolic differentiation and numerical differentiation (the method of finite differences). Clad is based on Clang, which provides the necessary facilities for code transformation. The AD library can differentiate non-trivial functions, find partial derivatives for trivial cases, and has good unit-test coverage.

The most heavily used entity in AD is a stack-like data structure called a tape. Its first-in last-out access pattern, which naturally occurs in the storage of intermediate values for reverse-mode AD, lends itself to asynchronous storage. Asynchronous prefetching of values during the reverse pass allows checkpoints deeper in the stack to be stored further away in the memory hierarchy. Checkpointing provides a mechanism to parallelize segments of a function so that they can be executed on independent cores. Inserting checkpoints in these segments using separate tapes keeps memory local and avoids sharing memory between cores. We will research techniques for local parallelization of the gradient reverse pass and extend them to achieve better scalability and/or lower constant overheads on CPUs and potentially accelerators. We will evaluate techniques for efficient memory use, such as multi-level checkpointing support. Combining already-developed techniques will allow executing gradient segments across different cores or in heterogeneous computing systems. These techniques must be robust, user-friendly, and minimize the required application-code and build-system changes.

This project aims to improve the efficiency of the Clad tape and generalize it into a tool-agnostic facility that could be used outside of Clad as well.
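
To make the first expected result below concrete, here is a minimal sketch of a slab-based tape: growth appends a fixed-size slab instead of reallocating and copying, so pushed entries never move in memory. This is an illustrative design under stated assumptions, not Clad's actual `clad::tape` implementation:

```cpp
// Illustrative slab-based tape (an assumption-level sketch, not Clad's
// actual clad::tape): growing appends a new fixed-size slab instead of
// reallocating, so existing entries are never moved.
#include <cstddef>
#include <iostream>
#include <memory>
#include <vector>

template <typename T, std::size_t SlabSize = 1024>
class SlabTape {
  std::vector<std::unique_ptr<T[]>> slabs_;  // connected slabs of storage
  std::size_t size_ = 0;                     // total number of entries

 public:
  void push(const T& value) {
    if (size_ == slabs_.size() * SlabSize)   // current slab full: link a new one
      slabs_.emplace_back(std::make_unique<T[]>(SlabSize));
    slabs_[size_ / SlabSize][size_ % SlabSize] = value;
    ++size_;
  }

  // Reverse-mode AD consumes the tape strictly last-in first-out.
  T pop() {
    --size_;
    T value = slabs_[size_ / SlabSize][size_ % SlabSize];
    if (size_ % SlabSize == 0)               // slab emptied: release it
      slabs_.pop_back();
    return value;
  }

  std::size_t size() const { return size_; }
};

int main() {
  SlabTape<double, 4> tape;  // tiny slabs so growth is visible
  for (int i = 0; i < 10; ++i) tape.push(i * 0.5);
  while (tape.size() > 0) std::cout << tape.pop() << " ";  // 4.5 4 ... 0
  std::cout << "\n";
}
```

Because slabs are stable in memory, a layered variant of this design could write cold slabs to disk asynchronously and prefetch them back during the reverse pass, which is what the multilayer-tape goal below targets.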

## Expected Results

* Optimize the current tape by avoiding reallocation on resize, in favor of connected slabs of arrays
* Enhance the existing benchmarks demonstrating the efficiency of the new tape
* Make the tape thread-safe
* Implement a multilayer tape stored both in memory and on disk
* [Stretch goal] Support CPU-GPU transfer of the tape
* [Stretch goal] Add infrastructure to enable checkpointing offload to the new tape
* [Stretch goal] Performance benchmarks

## Requirements

* Automatic differentiation
* C++ programming
* Clang frontend

## Mentors
* **[Vassil Vassilev](mailto:[email protected])**
* [David Lange](mailto:[email protected])

## Links
* [Repo](https://github.com/vgvassilev/clad)
Lines changed: 40 additions & 0 deletions
---
title: Integrate Clad into PyTorch and compare gradient execution times
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
  - CompRes
---

## Description

PyTorch is a popular machine learning framework that includes its own automatic differentiation engine, while Clad is a Clang plugin for automatic differentiation that performs source-to-source transformation to generate functions capable of computing derivatives at compile time.

This project aims to integrate Clad-generated functions into PyTorch using its C++ API and expose them to a Python workflow. The goal is to compare the execution times of gradients computed by Clad with those computed by PyTorch’s native autograd system. Special attention will be given to CUDA-enabled gradient computations, as PyTorch also offers GPU acceleration capabilities.
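
One plausible integration point, sketched below under stated assumptions: wrap a Clad-generated gradient inside a custom `torch::autograd::Function`, so PyTorch's autograd invokes Clad in the backward pass. The `torch::autograd::Function` extension mechanism and `clad::gradient` are real APIs; the primal function `f`, the `CladCube` wrapper, and the scalar shapes are illustrative assumptions, not the project's agreed design:

```cpp
// Hedged sketch: route PyTorch's backward pass through a Clad-generated
// gradient. f and CladCube are illustrative; the custom-Function pattern
// follows PyTorch's C++ extension documentation.
#include <iostream>
#include <torch/torch.h>
#include "clad/Differentiator/Differentiator.h"

// The primal function Clad differentiates at compile time.
double f(double x) { return x * x * x; }

struct CladCube : public torch::autograd::Function<CladCube> {
  static torch::Tensor forward(torch::autograd::AutogradContext* ctx,
                               torch::Tensor input) {
    ctx->save_for_backward({input});
    return torch::pow(input, 3);  // primal evaluation stays in PyTorch
  }

  static torch::autograd::tensor_list backward(
      torch::autograd::AutogradContext* ctx,
      torch::autograd::tensor_list grad_outputs) {
    auto saved = ctx->get_saved_variables();
    double x = saved[0].item<double>();
    // Clad-generated derivative, produced by source transformation.
    auto grad_f = clad::gradient(f);
    double dx = 0;
    grad_f.execute(x, &dx);
    return {grad_outputs[0] * dx};
  }
};

int main() {
  torch::Tensor x = torch::tensor(2.0, torch::requires_grad());
  torch::Tensor y = CladCube::apply(x);
  y.backward();
  std::cout << x.grad() << "\n";  // expect 3 * 2^2 = 12
}
```

Benchmarking would then time this Clad-backed backward pass against the equivalent pure-autograd graph, on CPU and, where feasible, CUDA.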

## Expected Results

* Incorporate Clad’s API components (such as `clad::array` and `clad::tape`) into PyTorch using its C++ API
* Pass Clad-generated derivative functions to PyTorch and expose them to Python
* Perform benchmarks comparing the execution times and performance of Clad-derived gradients versus PyTorch’s autograd
* Automate the integration process
* Thoroughly document the integration process and the benchmark results, and identify potential bottlenecks in Clad’s execution
* Present the work at relevant meetings and conferences.

## Requirements

* Automatic differentiation
* C++ programming
* Clang frontend

## Mentors
* **[Vassil Vassilev](mailto:[email protected])**
* [Christina Koutsou](mailto:[email protected])

## Links
* [Repo](https://github.com/vgvassilev/clad)
Lines changed: 127 additions & 0 deletions
---
title: Enable automatic differentiation of C++ STL concurrency primitives in Clad
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
  - CompRes
---

## Description

Clad is an automatic differentiation (AD) Clang plugin for C++. Given the C++ source code of a mathematical function, it can automatically generate C++ code for computing derivatives of the function. This project focuses on enabling automatic differentiation of code that utilizes C++ concurrency features such as `std::thread`, `std::mutex`, atomic operations, and more. This will allow users to fully utilize their CPU resources.

## Expected Results

* Explore C++ concurrency primitives and prepare a report detailing the associated challenges and the features that can feasibly be supported within the given timeframe.
* Add concurrency-primitives support in Clad’s forward-mode automatic differentiation.
* Add concurrency-primitives support in Clad’s reverse-mode automatic differentiation.
* Add proper tests and documentation.
* Present the work at relevant meetings and conferences.

An example demonstrating the differentiation of code that uses parallelization primitives:

```cpp
#include <cmath>
#include <iostream>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

#include "clad/Differentiator/Differentiator.h"

using VectorD = std::vector<double>;
using MatrixD = std::vector<VectorD>;

std::mutex m;

// Element-wise product of two vectors.
VectorD operator*(const VectorD &l, const VectorD &r) {
  VectorD v(l.size());
  for (std::size_t i = 0; i < l.size(); ++i)
    v[i] = l[i] * r[i];
  return v;
}

double dot(const VectorD &v1, const VectorD &v2) {
  VectorD v = v1 * v2;
  return std::accumulate(v.begin(), v.end(), 0.0);
}

// Sigmoid activation.
double activation_fn(double z) { return 1 / (1 + std::exp(-z)); }

// Binary cross-entropy loss for one data point.
double compute_loss(double y, double y_estimate) {
  return -(y * std::log(y_estimate) + (1 - y) * std::log(1 - y_estimate));
}

void compute_and_add_loss(VectorD x, double y, const VectorD &weights, double b,
                          double &loss) {
  double z = dot(x, weights) + b;
  double y_estimate = activation_fn(z);
  std::lock_guard<std::mutex> guard(m);  // serialize the shared accumulation
  loss += compute_loss(y, y_estimate);
}

/// Compute the total loss of a single-neuron neural network.
/// y_estimate = activation_fn(dot(X[i], weights) + b)
/// Loss of a training data point = - (y_actual * std::log(y_estimate) + (1 - y_actual) * std::log(1 - y_estimate))
/// Total loss: summation of the loss over all data points.
double compute_total_loss(const MatrixD &X, const VectorD &Y,
                          const VectorD &weights, double b) {
  double loss = 0;
  const std::size_t num_of_threads = std::thread::hardware_concurrency();
  std::vector<std::thread> threads(num_of_threads);
  int thread_id = 0;
  for (std::size_t i = 0; i < X.size(); ++i) {
    if (threads[thread_id].joinable())
      threads[thread_id].join();
    threads[thread_id] =
        std::thread(compute_and_add_loss, std::cref(X[i]), Y[i],
                    std::cref(weights), b, std::ref(loss));
    thread_id = (thread_id + 1) % num_of_threads;
  }
  for (std::size_t i = 0; i < num_of_threads; ++i) {
    if (threads[i].joinable())
      threads[i].join();
  }

  return loss;
}

int main() {
  auto loss_grad = clad::gradient(compute_total_loss);
  // Fill the values as required!
  MatrixD X;
  VectorD Y;
  VectorD weights;
  double b = 0;

  // Derivatives: zero-initialize them and size them to match the
  // corresponding primal values.
  MatrixD d_X;
  VectorD d_Y;
  VectorD d_weights;
  double d_b = 0;

  loss_grad.execute(X, Y, weights, b, &d_X, &d_Y, &d_weights, &d_b);

  std::cout << "dLossFn/dW[2]: " << d_weights[2] << "\n"; // partial derivative of the loss w.r.t. weights[2]
  std::cout << "dLossFn/db: " << d_b << "\n";             // partial derivative of the loss w.r.t. b
}
```

## Requirements

* Automatic differentiation
* Parallel programming
* Reasonable expertise in C++ programming

## Mentors
* **[Vassil Vassilev](mailto:[email protected])**
* [David Lange](mailto:[email protected])

## Links
* [Repo](https://github.com/vgvassilev/clad)
Lines changed: 39 additions & 0 deletions
---
title: Support usage of Thrust API in Clad
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
  - CompRes
---

## Description

The rise of ML has shed light on the power of GPUs, and researchers are looking for ways to incorporate them into their projects as a lightweight parallelization method. Consequently, general-purpose GPU programming is becoming a very popular way to speed up execution time.

Clad is a Clang plugin for automatic differentiation that performs source-to-source transformation and produces a function capable of computing the derivatives of a given function at compile time. This project aims to enhance Clad by adding support for Thrust, a parallel algorithms library designed for GPUs and other accelerators. By supporting Thrust, Clad will be able to differentiate algorithms that rely on Thrust’s parallel computing primitives, unlocking new possibilities for GPU-based machine learning, scientific computing, and numerical optimization.
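
To give a flavor of what "pushforward and pullback functions" for Thrust primitives could look like, here is a hedged sketch of a hand-written pullback for a plus-reduction: since y = sum(x) implies dy/dx_i = 1, the pullback broadcasts the incoming adjoint to every element. Clad's `clad::custom_derivatives` namespace for registering custom derivatives is real; the exact signature convention shown for a Thrust overload is an assumption:

```cpp
// Hedged sketch: a hand-written pullback for thrust::reduce with plus.
// clad::custom_derivatives is Clad's extension point for custom
// derivatives; the Thrust-specific signature here is an illustrative
// assumption, not Clad's finalized convention.
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <thrust/transform.h>

namespace clad {
namespace custom_derivatives {

// Primal: y = reduce(x) = x[0] + x[1] + ... + x[n-1]
// Pullback: d_x[i] += d_y for every i, since dy/dx[i] == 1.
void reduce_pullback(const thrust::device_vector<double>& x, double d_y,
                     thrust::device_vector<double>* d_x) {
  thrust::device_vector<double> adj(x.size(), d_y);  // broadcast the adjoint
  // Accumulate into the existing gradient buffer, entirely on the device.
  thrust::transform(d_x->begin(), d_x->end(), adj.begin(), d_x->begin(),
                    thrust::plus<double>());
}

}  // namespace custom_derivatives
}  // namespace clad
```

In a full integration, Clad's reverse-mode pass would emit a call to a pullback like this whenever it encounters `thrust::reduce` in the primal code, keeping the adjoint computation on the GPU.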

## Expected Results

* Research and decide on the most valuable Thrust functions to support in Clad
* Create pushforward and pullback functions for these Thrust functions
* Write tests that cover the additions
* Include demos of using Clad on open-source code examples that call Thrust functions
* Write documentation on which Thrust functions are supported in Clad
* Present the work at relevant meetings and conferences.

## Requirements

* Automatic differentiation
* C++ programming
* Clang frontend

## Mentors
* **[Christina Koutsou](mailto:[email protected])**
* [Vassil Vassilev](mailto:[email protected])

## Links
* [Repo](https://github.com/vgvassilev/clad)
