Commit 1d12a67

Add initial compiler-research projects on InterOp and Clad (#1663)
1 parent c0330b0 commit 1d12a67

2 files changed

+83
-0
lines changed

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
1+
---
2+
title: Implement and improve an efficient, layered tape with prefetching capabilities
3+
layout: gsoc_proposal
4+
project: Clad
5+
year: 2025
6+
difficulty: medium
7+
duration: 350
8+
mentor_avail: June-October
9+
organization:
10+
- CompRes
11+
---
12+
13+
## Description
14+
15+
In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to numerically evaluate the derivative of a function specified by a computer program. Automatic differentiation is an alternative to symbolic differentiation and numerical differentiation (the method of finite differences). Clad is an AD library based on Clang, which provides the necessary facilities for code transformation. The library can differentiate non-trivial functions, find partial derivatives for trivial cases, and has good unit test coverage.
16+
17+
The most heavily used entity in AD is a stack-like data structure called a tape. Its first-in, last-out access pattern, which arises naturally when storing intermediate values for reverse-mode AD, lends itself to asynchronous storage. Asynchronous prefetching of values during the reverse pass allows checkpoints deeper in the stack to be stored furthest away in the memory hierarchy. Checkpointing provides a mechanism to parallelize segments of a function that can be executed on independent cores. Inserting checkpoints in these segments using separate tapes keeps memory local and avoids sharing memory between cores. We will research techniques for local parallelization of the gradient reverse pass, and extend them to achieve better scalability and/or lower constant overheads on CPUs and potentially accelerators. We will evaluate techniques for efficient memory use, such as multi-level checkpointing support. Combining already developed techniques will allow executing gradient segments across different cores or in heterogeneous computing systems. These techniques must be robust and user-friendly, and must minimize the required changes to application code and build systems.
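The first-in, last-out pattern described above can be illustrated with a minimal sketch. The `Tape` struct and `grad_iterated_sin` function below are hypothetical names chosen for illustration and are not Clad's actual tape type or generated code. The forward pass pushes each intermediate value before it is overwritten, and the reverse pass pops the values in the opposite order to accumulate the derivative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal tape: a LIFO stack of doubles (not Clad's tape type).
struct Tape {
  std::vector<double> data;
  void push(double v) { data.push_back(v); }
  double pop() {
    assert(!data.empty() && "pop from empty tape");
    double v = data.back();
    data.pop_back();
    return v;
  }
};

// Derivative of f(x) = sin(sin(...sin(x))) with sin applied n times,
// computed in reverse mode.
double grad_iterated_sin(double x, int n) {
  Tape tape;
  // Forward pass: v is overwritten at every step, so its old value is
  // saved on the tape first.
  double v = x;
  for (int i = 0; i < n; ++i) {
    tape.push(v);
    v = std::sin(v);
  }
  // Reverse pass: values come back last-in, first-out, which is the order
  // in which the chain rule consumes them.
  double d = 1.0;
  for (int i = 0; i < n; ++i) {
    double input = tape.pop();
    d *= std::cos(input);  // d/dv sin(v) = cos(v)
  }
  return d;
}
```

Because the reverse pass reads values in a fully predictable order, the entries that will be popped last are the ones that could, in principle, be moved to slower storage and prefetched later.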
18+
19+
This project aims to improve the efficiency of the Clad tape and generalize it into a tool-agnostic facility that could be used outside of Clad as well.
20+
21+
## Expected Results
22+
23+
* Optimize the current tape by replacing reallocation on resize with connected array slabs
24+
* Enhance existing benchmarks to demonstrate the efficiency of the new tape
25+
* Make the tape thread-safe
26+
* Implement a multilayer tape stored in memory and on disk
27+
* [Stretch goal] Support CPU-GPU transfer of the tape
28+
* [Stretch goal] Add infrastructure to enable checkpointing offload to the new tape
29+
* [Stretch goal] Performance benchmarks
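One way to read the "connected slabs" item is sketched below, under stated assumptions: `SlabTape`, its `SlabSize` parameter, and `slab_count` are hypothetical names, and this is not the design Clad uses or will use. Each slab is a fixed-size array linked to the previous one, so growing the tape allocates a new slab instead of reallocating and copying existing elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical slab-based tape. T must be default-constructible and copyable.
template <typename T, std::size_t SlabSize = 1024>
class SlabTape {
  struct Slab {
    std::array<T, SlabSize> items;
    std::size_t used = 0;
    std::unique_ptr<Slab> prev;  // link to the slab filled before this one
  };
  std::unique_ptr<Slab> top_;
  std::size_t size_ = 0;
  std::size_t slabs_ = 0;

public:
  SlabTape() = default;
  // Unlink slabs one by one so destruction does not recurse through the chain.
  ~SlabTape() {
    while (top_) top_ = std::move(top_->prev);
  }
  SlabTape(const SlabTape&) = delete;
  SlabTape& operator=(const SlabTape&) = delete;

  void push(const T& v) {
    if (!top_ || top_->used == SlabSize) {
      // Grow by linking a fresh slab; existing elements are never moved.
      auto slab = std::make_unique<Slab>();
      slab->prev = std::move(top_);
      top_ = std::move(slab);
      ++slabs_;
    }
    top_->items[top_->used++] = v;
    ++size_;
  }

  T pop() {
    assert(size_ > 0 && "pop from empty tape");
    T v = top_->items[--top_->used];
    --size_;
    if (top_->used == 0) {
      // Release the emptied slab and step back to the previous one.
      top_ = std::move(top_->prev);
      --slabs_;
    }
    return v;
  }

  std::size_t size() const { return size_; }
  std::size_t slab_count() const { return slabs_; }
};
```

This sketch frees a slab as soon as it is emptied, which keeps the code short but allocates repeatedly if pushes and pops alternate at a slab boundary; keeping one spare slab would be a possible refinement.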
30+
31+
32+
## Requirements
33+
34+
* Automatic differentiation
35+
* C++ programming
36+
* Clang frontend
37+
38+
## Mentors
39+
* **[Vassil Vassilev](mailto:[email protected])**
40+
* [David Lange](mailto:[email protected])
41+
42+
## Links
43+
* [Repo](https://github.com/vgvassilev/clad)
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
1+
---
2+
title: Implement CppInterOp API exposing memory, ownership and thread safety information
3+
layout: gsoc_proposal
4+
project: Cppyy
5+
year: 2025
6+
difficulty: medium
7+
duration: 350
8+
mentor_avail: June-October
9+
organization:
10+
- CompRes
11+
---
12+
13+
## Description
14+
15+
Incremental compilation pipelines process code chunk-by-chunk by building an ever-growing translation unit. Code is then lowered into LLVM IR and subsequently run by the LLVM JIT. Such a pipeline allows the creation of efficient interpreters. The interpreter enables interactive exploration and makes the C++ language more user-friendly. The incremental compilation mode is used by the interactive C++ interpreter, Cling, initially developed to enable interactive high-energy physics analysis in a C++ environment.
16+
17+
Clang and LLVM provide access to C++ from other programming languages, but currently expose only the declared public interfaces of such C++ code, even when they have parsed the implementation details directly. Both the high-level and the low-level program representations have enough information to capture and expose more of these details to improve language interoperability. Examples include details of memory management, ownership transfer, thread safety, externalized side effects, etc. For example, if memory is allocated and returned, the caller needs to take ownership; if a function is pure, it can be elided; if a call provides access to a data member, it can be reduced to an address lookup. The goal of this project is to develop APIs for CppInterOp that can extract such information from the AST or from JIT-ed code and expose it, and to use them in cppyy (Python-C++ language bindings) as an exemplar. If time permits, extend the work to persist this information across translation units and use it on code compiled with Clang.
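The three examples above (ownership, purity, data member access) can be sketched as a small record that a binding layer might consult. Everything here is hypothetical: `InteropInfo`, `ReturnOwnership`, `BindingAction`, and `choose_action` are illustrative names and are not part of CppInterOp, Clang, or cppyy. The sketch only shows the kind of decision such information would enable.

```cpp
#include <cassert>

// Hypothetical description of what a callee does with its return value.
enum class ReturnOwnership { Unknown, CallerOwns, CalleeOwns };

// Hypothetical per-function record; not an existing CppInterOp type.
struct InteropInfo {
  ReturnOwnership ownership = ReturnOwnership::Unknown;
  bool is_pure = false;           // no side effects, result depends only on arguments
  bool is_member_access = false;  // call merely returns access to a data member
  bool is_thread_safe = false;    // safe to call concurrently
};

// What a language binding could do when wrapping a call.
enum class BindingAction { PlainCall, TakeOwnership, ElideRepeatedCall, AddressLookup };

BindingAction choose_action(const InteropInfo& info) {
  // A data member accessor can be reduced to an address lookup.
  if (info.is_member_access) return BindingAction::AddressLookup;
  // A pure function's repeated calls with the same arguments can be elided.
  if (info.is_pure) return BindingAction::ElideRepeatedCall;
  // Memory that is allocated and returned must be owned by the caller.
  if (info.ownership == ReturnOwnership::CallerOwns) return BindingAction::TakeOwnership;
  return BindingAction::PlainCall;
}
```

The order of the checks is an arbitrary choice made for this sketch; a real interface would likely expose the individual properties and leave the policy to each binding.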
18+
19+
## Project Milestones
20+
21+
* Collect and categorize possible exposed interop information kinds
22+
* Write one or more facilities to extract necessary implementation details
23+
* Design a language-independent interface to expose this information
24+
* Integrate the work in clang-repl and Cling
25+
* Implement and demonstrate its use in cppyy as an exemplar
26+
* Present the work at the relevant meetings and conferences
27+
28+
## Requirements
29+
30+
* C++ programming
31+
* Python programming
32+
* Knowledge of Clang and LLVM
33+
34+
## Mentors
35+
* **[Vassil Vassilev](mailto:[email protected])**
36+
* [Aaron Jomy](mailto:[email protected])
37+
* [David Lange](mailto:[email protected])
38+
39+
## Links
40+
* [Repo](https://github.com/compiler-research/CppInterOp)
