Commit 993bc3e

Merge branch 'compiler-research:master' into master
2 parents c44839c + 610d6df commit 993bc3e

15 files changed

+553
-4
lines changed

.github/actions/spelling/allow/terms.txt

Lines changed: 17 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -5,6 +5,7 @@ Backpropagation
55
CINT
66
CMSSW
77
Caa
8+
Codegen
89
Cppyy
910
Debian
1011
EPC
@@ -30,6 +31,7 @@ Ohridski
3031
OMP
3132
OpenMP
3233
PTX
34+
RAII
3335
Resugaring
3436
SBO
3537
Slib
@@ -44,10 +46,12 @@ biodynamo
4446
bioinformatics
4547
blogs
4648
cms
49+
codegen
4750
consteval
4851
cppyy
4952
cytokine
5053
cytokines
54+
doxygen
5155
gitlab
5256
gpu
5357
gridlay
@@ -129,4 +133,16 @@ VVACAT
129133
VVCR
130134
VVLLVM
131135
VVMODE
132-
VVSNL
136+
VVSNL
137+
Autodiff
138+
cladtorch
139+
FASTQ
140+
firstprivate
141+
godbolt
142+
KWh
143+
lastprivate
144+
markdownify
145+
crcon
146+
crconlist
147+
petabyte
148+
Vkv

_data/crconlist2025.yml

Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
1+
- name: "CompilerResearchCon 2025 (day 2)"
2+
date: 2025-11-13 15:00:00 +0200
3+
time_cest: "15:00"
4+
connect: "[Link to zoom](https://princeton.zoom.us/j/97915651167?pwd=MXJ1T2lhc3Z5QWlYbUFnMTZYQlNRdz09)"
5+
label: crcon25_part_2
6+
agenda:
7+
- title: "Implementing Debugging Support for xeus-cpp"
8+
speaker:
9+
name: "Abhinav Kumar"
10+
time_cest: "15:00 - 15:20"
11+
description: |
12+
This proposal outlines integrating debugging into the xeus-cpp kernel
13+
for Jupyter using LLDB and its Debug Adapter Protocol (lldb-dap).
14+
Modeled after xeus-python, it leverages LLDB’s Clang and JIT debugging
15+
support to enable breakpoints, variable inspection, and step-through
16+
execution. The modular design ensures compatibility with Jupyter’s
17+
frontend, enhancing interactive C++ development in notebooks.
18+
19+
This project integrated the Debug Adapter Protocol (DAP) with xeus-cpp. Users can
20+
use JupyterLab’s debugger panel to debug JIT-compiled C++ code. Setting and
21+
hitting breakpoints and stepping into and out of functions are supported in
22+
xeus-cpp. Additionally, during this project I refactored
23+
the out-of-process JIT execution, which was the major part of implementing
24+
the debugger.
25+
26+
27+
# slides: /assets/presentations/...
28+
29+
- title: "Activity analysis for reverse-mode differentiation of (CUDA) GPU kernels"
30+
speaker:
31+
name: "Maksym Andriichuk"
32+
time_cest: "15:20 - 15:40"
33+
description: |
34+
Clad is a Clang plugin designed to provide automatic differentiation (AD) for C++
35+
mathematical functions. It generates code for computing derivatives by modifying
36+
the abstract syntax tree (AST) using LLVM compiler features. It performs advanced program
37+
optimization by implementing more sophisticated analyses because it has access to a
38+
rich program representation – the Clang AST.
39+
40+
The project optimized code that contains potential data races,
41+
significantly speeding up execution. Thread Safety Analysis is a static analysis
42+
that detects possible data races, which makes it possible to reduce atomic
43+
operations in the Clad-produced code.
44+
45+
# slides: /assets/presentations/...
46+
47+
- title: "Enable automatic differentiation of OpenMP programs with Clad"
48+
speaker:
49+
name: "Jiayang Li"
50+
time_cest: "15:40 - 16:00"
51+
description: |
52+
This project extends Clad, a Clang-based automatic differentiation tool for C++, to
53+
support OpenMP programs. This project enables Clad to parse and differentiate
54+
functions with OpenMP directives, thereby enabling gradient computation in
55+
multi-threaded environments.
56+
57+
This project achieved Clad support for both forward and reverse mode differentiation
58+
of common OpenMP directives (parallel, parallel for) and clauses (private,
59+
firstprivate, lastprivate, shared, atomic, reduction) by implementing OpenMP-related
60+
AST parsing and designing corresponding differentiation strategies. Additional
61+
contributions include example applications and comprehensive tests.
62+
63+
64+
# slides: /assets/presentations/...
65+
66+
- title: "Using ROOT in the field of Genome Sequencing"
67+
speaker:
68+
name: "Aditya Pandey"
69+
time_cest: "16:00 - 16:20"
70+
description: |
71+
The project extends ROOT, CERN's petabyte-scale data processing framework, to address
72+
the critical challenge of managing genomic data that generates up to 200GB per human
73+
genome. By leveraging ROOT's big data expertise and introducing the next-generation
74+
RNTuple columnar storage format specifically optimized for genomic sequences, the
75+
project eliminates the traditional trade-off between compression efficiency and
76+
access speed in bioinformatics.
77+
78+
The project achieved comprehensive genomic data support through validating GeneROOT
79+
baseline performance benchmarks against BAM/SAM formats, implementing RNTuple-based
80+
RAM (ROOT Alignment Maps) format with full SAM/BAM field support and smart reference
81+
management, demonstrating 23.5% smaller file sizes compared to CRAM while delivering
82+
1.9x faster large region queries and 3.2x faster full chromosome scans, optimizing
83+
FASTQ compression from 14.2GB to 6.8GB. We also developed chromosome-based
84+
file splitting for larger genome files so that chromosome-based data can be extracted.
85+
86+
87+
# slides: /assets/presentations/...
88+
89+
- name: "CompilerResearchCon 2025 (day 1)"
90+
date: 2025-10-30 15:00:00 +0200
91+
time_cest: "15:00"
92+
connect: "[Link to zoom](https://princeton.zoom.us/j/97915651167?pwd=MXJ1T2lhc3Z5QWlYbUFnMTZYQlNRdz09)"
93+
label: crcon25_part_1
94+
agenda:
95+
- title: "CARTopiaX: an Agent-Based Simulation of CAR T-Cell Therapy Built on BioDynaMo"
96+
speaker:
97+
name: "Salvador de la Torre Gonzalez"
98+
time_cest: "15:00 - 15:20"
99+
description: |
100+
CAR T-cell therapy is a form of cancer immunotherapy that engineers a
101+
patient’s T cells to recognize and eliminate malignant cells. Although
102+
highly effective in leukemias and other hematological cancers, this therapy
103+
faces significant challenges in solid tumors due to the complex and
104+
heterogeneous tumor microenvironment. CARTopiaX is an advanced agent-based
105+
model developed to address this challenge, using the mathematical framework
106+
proposed in the Nature paper “In silico study of heterogeneous tumour-derived
107+
organoid response to CAR T-cell therapy,” successfully replicating its core
108+
results. Built on BioDynaMo, a high-performance, open-source platform for
109+
large-scale and modular biological modeling, CARTopiaX enables detailed
110+
exploration of complex biological interactions, hypothesis testing, and
111+
data-driven discovery within solid tumor microenvironments.
112+
113+
The project achieved major milestones, including simulations that run more than
114+
twice as fast as the previous model, allowing rapid scenario exploration and robust
115+
hypothesis validation; high-quality, well-structured, and maintainable C++ code
116+
developed following modern software engineering principles; and a scalable,
117+
modular, and extensible architecture that fosters collaboration, customization,
118+
and the continuous evolution of an open-source ecosystem. Altogether, this work
119+
represents a meaningful advancement in computational biology, providing
120+
researchers with a powerful tool to investigate CAR T-cell dynamics in solid
121+
tumors and accelerating scientific discovery while reducing the time and cost
122+
associated with experimental wet-lab research.
123+
124+
# slides: /assets/presentations/...
125+
126+
- title: "Efficient LLM Training in C++ via Compiler-Level Autodiff with Clad"
127+
speaker:
128+
name: "Rohan Timmaraju"
129+
time_cest: "15:20 - 15:40"
130+
description: |
131+
The computational demands of Large Language Model (LLM) training are
132+
often constrained by the performance of Python frameworks. This project
133+
tackles these bottlenecks by developing a high-performance LLM training
134+
pipeline in C++ using Clad, a Clang plugin for compiler-level automatic
135+
differentiation. The core of this work involved creating cladtorch, a new
136+
C++ tensor library with a PyTorch-style API designed for compatibility
137+
with Clad's differentiation capabilities. This library provides a more
138+
user-friendly interface for building and training neural networks while
139+
enabling Clad to automatically generate gradient computations for
140+
backpropagation.
141+
142+
Throughout the project, I successfully developed two distinct LLM training
143+
implementations. The first, using the cladtorch library, established a
144+
functional and flexible framework for Clad-driven AD. To further push
145+
performance boundaries, I then developed a second, highly optimized
146+
implementation inspired by llm.c, which utilizes pre-allocated memory buffers
147+
and custom kernels. This optimized C-style approach, when benchmarked for
148+
GPT-2 training on a multithreaded CPU, outperformed the equivalent PyTorch
149+
implementation. This work successfully demonstrates the viability and
150+
performance benefits of compiler-based AD for deep learning in C++ and
151+
provides a strong foundation for future hardware acceleration, such as porting
152+
the implementation to CUDA.
153+
154+
# slides: /assets/presentations/...
155+
156+
- title: "Implement and improve an efficient, layered tape with prefetching capabilities"
157+
speaker:
158+
name: "Aditi Milind Joshi"
159+
time_cest: "15:40 - 16:00"
160+
description: |
161+
Clad relies on a tape data structure to store intermediate values during reverse
162+
mode differentiation. This project focuses on enhancing the core tape implementation
163+
in Clad to make it more efficient and scalable. Key deliverables include replacing
164+
the existing dynamic array-based tape with a slab allocation approach and small
165+
buffer optimization, enabling multilayer storage, and introducing thread safety to
166+
support concurrent access.
167+
168+
The current implementation replaces the dynamic array with a slab-based structure
169+
and a small static buffer, eliminating costly reallocations. Thread-safe access
170+
functions have been added through a mutex locking mechanism, ensuring safe parallel
171+
tape operations. Ongoing work includes developing a multilayer tape system with
172+
offloading capabilities, which will allow only the most recent slabs to remain in
173+
memory.
174+
175+
176+
# slides: /assets/presentations/...
177+
178+
- title: "Support usage of Thrust API in Clad"
179+
speaker:
180+
name: "Abdelrhman Elrawy"
181+
time_cest: "16:00 - 16:20"
182+
description: |
183+
This project integrates NVIDIA's Thrust library into Clad, a Clang-based automatic
184+
differentiation tool for C++. By extending Clad's source-to-source transformation
185+
engine to recognize and differentiate Thrust parallel algorithms, the project
186+
enables automatic gradient generation for GPU-accelerated scientific computing
187+
and machine learning applications.
188+
189+
The project achieved Thrust support in Clad through implementing custom derivatives
190+
for core algorithms including thrust::reduce, thrust::transform,
191+
thrust::transform_reduce, thrust::inner_product, thrust::copy, scan operations
192+
(inclusive/exclusive), thrust::adjacent_difference, and sorting primitives.
193+
Additional contributions include Thrust data containers like thrust::device_vector,
194+
generic functor handling for transformations, demonstration applications, and
195+
comprehensive unit tests.
196+
197+
# slides: /assets/presentations/...
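
Reviewer note on the layered-tape abstract in this file: the combination of a small static buffer with slab allocation can be sketched in plain C++. This is a minimal illustration under stated assumptions, not Clad's actual tape. The class name `SlabTape`, the template parameters `InlineN` and `SlabN`, and the method names are all hypothetical, and the thread safety and offloading mentioned in the abstract are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch (not Clad's implementation): a LIFO tape that stores
// the first InlineN values in an inline buffer and the rest in fixed-size
// heap slabs. Stored values are never moved, so there is no reallocation
// of existing elements as the tape grows.
template <typename T, std::size_t InlineN = 8, std::size_t SlabN = 64>
class SlabTape {
  std::array<T, InlineN> inline_buf_{};
  std::vector<std::unique_ptr<std::array<T, SlabN>>> slabs_;
  std::size_t size_ = 0;

  // Map a logical index to its storage location.
  T& at(std::size_t i) {
    if (i < InlineN) return inline_buf_[i];
    const std::size_t j = i - InlineN;
    return (*slabs_[j / SlabN])[j % SlabN];
  }

public:
  void push(const T& v) {
    // Allocate a new slab only when the next slot falls in a slab that
    // does not exist yet; slabs left over from earlier pops are reused.
    if (size_ >= InlineN && (size_ - InlineN) / SlabN == slabs_.size())
      slabs_.push_back(std::make_unique<std::array<T, SlabN>>());
    at(size_) = v;
    ++size_;
  }

  T pop() {
    assert(size_ > 0 && "pop from empty tape");
    --size_;
    return at(size_);
  }

  std::size_t size() const { return size_; }
  std::size_t slab_count() const { return slabs_.size(); }
};
```

With the default parameters, pushing 100 values fills the 8-slot inline buffer and then two 64-element slabs. Popping returns the values in reverse order, which is the access pattern of a reverse-mode sweep.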

_data/releases.yml

Lines changed: 18 additions & 0 deletions
@@ -1,3 +1,21 @@
1+
- date: 2025-10-01
2+
codebase: "Clad"
3+
version: "v2.1"
4+
description: |
5+
Clad 2.1 introduces major advancements in reverse mode differentiation,
6+
bringing smarter handling of loops, assignments, and method calls, alongside
7+
the new clad::restore_tracker for functions that modify their
8+
inputs. Forward mode gains static scheduling for Hessians and higher-order
9+
derivatives, while CUDA support expands with custom derivatives for key
10+
Thrust algorithms such as reduce, transform, and transform_reduce, plus
11+
optimizations that reduce unnecessary GPU atomics. The release also
12+
strengthens error estimation, simplifies adjoint initialization, improves
13+
tape efficiency, and enhances diagnostics. With a migration to C++17,
14+
support extended up to clang-21, and numerous bug fixes, Clad 2.1 delivers
15+
faster, safer, and more reliable automatic differentiation across CPU and
16+
GPU workflows.
17+
link: "https://github.com/vgvassilev/clad/releases/tag/v2.1"
18+
119
- date: 2025-07-27
220
codebase: "Clad"
321
version: "v2.0"
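
Reviewer note on the Clad 2.1 entry: the release text mentions custom derivatives for reduction-style algorithms such as reduce and transform_reduce. The idea behind such a reverse-mode rule can be sketched on the CPU with the standard library. This is an illustration only: the function names and signatures below are hypothetical and are neither Clad's nor Thrust's API.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Primal: y = sum_i x[i], a CPU stand-in for a plus-reduction.
double reduce_sum(const std::vector<double>& x) {
  return std::accumulate(x.begin(), x.end(), 0.0);
}

// Hypothetical pullback: given the adjoint d_y of the result, accumulate
// the adjoints of the inputs. Since dy/dx[i] = 1, each element gets d_y.
void reduce_sum_pullback(const std::vector<double>& x, double d_y,
                         std::vector<double>& d_x) {
  for (std::size_t i = 0; i < x.size(); ++i) d_x[i] += d_y;
}

// Primal: y = sum_i x[i]^2, a CPU stand-in for transform-then-reduce.
double sum_of_squares(const std::vector<double>& x) {
  return std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
}

// Hypothetical pullback: dy/dx[i] = 2 * x[i], scaled by the incoming d_y.
void sum_of_squares_pullback(const std::vector<double>& x, double d_y,
                             std::vector<double>& d_x) {
  for (std::size_t i = 0; i < x.size(); ++i) d_x[i] += 2.0 * x[i] * d_y;
}
```

Each write in these pullbacks targets a distinct `d_x[i]`, so a parallel version of the loops would not need atomic updates. That is the kind of situation in which avoiding atomics is safe.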

_data/standing_meetings.yml

Lines changed: 8 additions & 1 deletion
@@ -3,10 +3,18 @@
33
time_cest: "17:00"
44
connect: "[Link to zoom](https://princeton.zoom.us/j/97915651167?pwd=MXJ1T2lhc3Z5QWlYbUFnMTZYQlNRdz09)"
55
agenda:
6+
- title: "Supporting Automatic Differentiation in CMS Combine profile likelihood scans"
7+
date: 2025-09-25 15:00:00 +0200
8+
speaker: "Galin Bistrev"
9+
link: "[Slides](/assets/presentations/CaaS_Weekly_25_09_2025_G_Bistrev_ADInCombine_WrapUp.pdf)"
610
- title: "Wrap-Up: Improve automatic differentiation of object-oriented paradigms using Clad"
711
date: 2025-09-18 15:00:00 +0200
812
speaker: "Petro Zarytskyi"
913
link: "[Slides](/assets/presentations/CaaS_Weekly_18_09_2025_Petro_Zarytskyi_OOPParadigms_GSoC_WrapUp.pdf)"
14+
- title: "Midterm evaluation: Enable automatic differentiation of OpenMP programs with Clad"
15+
date: 2025-08-21 15:00:00 +0200
16+
speaker: "Jiayang Li"
17+
link: "[Slides](/assets/presentations/CaaS_Weekly_21_08_2025_Jiayang_Li_GSoC25_Midterm.pdf)"
1018
- title: "Midterm evaluation: Support usage of Thrust API in Clad"
1119
date: 2025-08-14 15:00:00 +0200
1220
speaker: "Abdelrhman Elrawy"
@@ -417,4 +425,3 @@
417425
date: 2025-08-14 15:00:00 +0200
418426
speaker: "Salvador de la Torre Gonzalez"
419427
link: "[Slides](/assets/presentations/Salva_GSoC_midterm_presentation_CART.pdf)"
420-

_includes/header.html

Lines changed: 7 additions & 1 deletion
@@ -29,7 +29,7 @@
2929
</li>
3030

3131
<li class="nav-item dropdown">
32-
<a href="#" class="nav-link dropdown-toggle" id="dropdown03" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Interactive Tools</a>
32+
<a href="#" class="nav-link dropdown-toggle" id="dropdown02" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Interactive Tools</a>
3333
<ul class="dropdown-menu" aria-labelledby="dropdown03">
3434
<li class="dropdown-item"> <a class="nav-link" target="_blank" href="https://godbolt.org/z/3KWhY4j8M">Clad on Godbolt</a></li>
3535
<li class="dropdown-item"> <a class="nav-link" target="_blank" href="https://compiler-research.org/xeus-cpp-wasm/lab/index.html">Xeus-CPP</a></li>
@@ -52,6 +52,12 @@
5252
</li>
5353
<li class="nav-item">
5454
<a class="nav-link" href="/blog">Blog</a>
55+
</li>
56+
<li class="nav-item dropdown">
57+
<a href="#" class="nav-link dropdown-toggle" id="dropdown04" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">CompilerResearchCon</a>
58+
<ul class="dropdown-menu" aria-labelledby="dropdown04">
59+
<li class="dropdown-item"> <a class="nav-link" href="/crcon2025">2025</a></li>
60+
</ul>
5561
</li>
5662
</ul>
5763
</div>

_pages/404.md

Lines changed: 1 addition & 1 deletion
@@ -7,5 +7,5 @@ permalink: /404.html
77
# 404 Page Not Found
88

99
The page you were looking for was not found.
10-
If you believe this was in error, please contact <mailto:[email protected]>
10+
If you believe this was an error, please contact <mailto:[email protected]>
1111

_pages/crcon2025.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
---
2+
title: "CompilerResearchCon"
3+
layout: gridlay
4+
excerpt: "CompilerResearchCon"
5+
sitemap: false
6+
permalink: /crcon2025/
7+
---
8+
9+
# CompilerResearchCon
10+
11+
Compiler Research Conferences are focused events that bring together members and
12+
contributors to share progress and insights on specific initiatives. These
13+
virtual gatherings provide an opportunity to present completed work, discuss
14+
outcomes, and explore the impact of research efforts in compiler technology
15+
and related areas. Such conferences typically feature presentations from contributors,
16+
including participants in programs like Google Summer of Code, showcasing
17+
developments in automatic differentiation, interpretative C/C++/CUDA, and
18+
other compiler infrastructure projects. These events promote knowledge exchange
19+
and celebrate the collaborative achievements of our research community.
20+
21+
<i>If you are interested in our work you can join our
22+
[compiler-research-announce google groups forum](https://groups.google.com/g/compiler-research-announce)
23+
or follow us on [LinkedIn](https://www.linkedin.com/groups/9579649/).</i>
24+
25+
{% assign sorted_crcon = site.data.crconlist2025 | sort: "date" | reverse %}
26+
27+
{% for crcon in sorted_crcon %}
28+
29+
<div class="row">
30+
<span id="{{crcon.label}}">&nbsp;</span>
31+
<div class="clearfix">
32+
<div class="well" style="padding-left: 20px; padding-right: 20px">
33+
<a style="text-decoration:none;" href="#{{crcon.label}}">
34+
{{ crcon.name }} -- {{ crcon.date | date_to_long_string }} at {{crcon.time_cest}} Geneva (CH) Time
35+
</a>
36+
<div>Connection information: {{crcon.connect}} <br />
37+
</div><div>
38+
Agenda:
39+
<ul>{% for item in crcon.agenda %}
40+
<li><strong>{{item.time_cest}}
41+
{% if item.speaker %}
42+
{% if item.speaker.first %}
43+
{{ item.speaker.name }}
44+
<br>“{{item.title}}”</strong>
45+
{% else %}
46+
({{item.speaker}})
47+
{% endif %}
48+
{% endif %}
49+
{% if item.description %}
50+
<br /> <i>Abstract:</i>{{item.description | markdownify }}
51+
{% endif %}
52+
{% if item.slides %}
53+
<a style="text-decoration:none;" href="{{item.slides}}">Slides</a>
54+
{% endif %}
55+
{% if item.video %}
56+
<a style="text-decoration:none;" href="{{item.video}}">Video</a>
57+
{% endif %}
58+
{{ item.link }}
59+
</li>
60+
{% endfor %}</ul>
61+
</div>
62+
</div>
63+
</div>
64+
65+
</div>
66+
67+
{% endfor %}
68+
File renamed without changes.
