Introduction
This document contains the release notes for Clad, the automatic differentiation plugin for Clang, release 2.2. Clad is built on top of
the Clang and LLVM compiler infrastructure. Here we describe the status of Clad in some detail, including major improvements from the previous release and new feature work.
Note that if you are reading this file from a git checkout, this document applies to the next release, not the current one.
What's New in Clad 2.2?
Some of the major new features and improvements to Clad are listed here. Generic improvements to Clad as a whole or to its underlying infrastructure are described first.
External Dependencies
- Clad now works with clang-11 through clang-21.
- Removed unused coverage libraries for faster CI builds.
- Fixed Python 3.12 linking issue in LLVM 16 installs on macOS.
- macOS ARM CI updated to macOS 26.
Forward Mode & Reverse Mode
- Major internal cleanup and generalization of differentiation pipelines.
- Unified initialization logic for adjoints and original variables.
- Improved compatibility across pointer, tensor, and reference types.
- Added support for conversion operators and std::reference_wrapper.
- Enhanced handling of initialization lists, pseudo-destructors, and complex expressions.
Forward Mode
- OpenMP Support (Experimental): Introduced basic OpenMP support for forward mode differentiation.
- Simplified adjoint and initializer handling for forward-pass variables.
Reverse Mode
- Added reverse mode checkpointing for loops, improving memory efficiency in long iterations.
- Elidable Reverse Passes: Introduced the elidable_reverse_forw attribute to skip redundant reverse passes for trivially invertible functions.
- constructor_reverse_forw now supports static scheduling and elidable_reverse_forw.
- Added support for CompoundLiteralExpr and improved differentiation through compound expressions.
- Simplified handling of deallocations and memory operations via attributes.
- Improved function differentiation sequence by resolving pullbacks before argument differentiation.
- Unified differentiation order for pointer and reference types.
- Optimized unary operator simplification (removed redundant &*_d_x patterns).
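The memory saving behind loop checkpointing can be illustrated with a small hand-written sketch. This is not Clad's generated code or its internal scheme; the function names (step, grad_full_tape, grad_checkpointed) and the fixed-stride strategy are assumptions made for illustration only. The idea shown: instead of recording every loop intermediate, keep one value every stride iterations and recompute each segment during the reverse sweep.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical loop body: x_{i+1} = sin(x_i) * p.
inline double step(double x, double p) { return std::sin(x) * p; }

// Reference version: record every x_i (n stored values), then sweep backwards.
// Returns d(x_n)/d(p).
inline double grad_full_tape(double x0, double p, std::size_t n) {
  std::vector<double> tape;
  tape.reserve(n);
  double x = x0;
  for (std::size_t i = 0; i < n; ++i) {
    tape.push_back(x);
    x = step(x, p);
  }
  double d_x = 1.0, d_p = 0.0;
  for (std::size_t i = n; i-- > 0;) {
    d_p += d_x * std::sin(tape[i]);     // partial w.r.t. p at iteration i
    d_x = d_x * std::cos(tape[i]) * p;  // propagate adjoint to x_i
  }
  return d_p;
}

// Checkpointed version: store x only every `stride` iterations and recompute
// one segment at a time in the reverse sweep. Peak storage is roughly
// n / stride + stride values instead of n.
inline double grad_checkpointed(double x0, double p, std::size_t n,
                                std::size_t stride) {
  assert(stride > 0);
  std::vector<double> checkpoints;  // x at iterations 0, stride, 2 * stride, ...
  double x = x0;
  for (std::size_t i = 0; i < n; ++i) {
    if (i % stride == 0) checkpoints.push_back(x);
    x = step(x, p);
  }
  double d_x = 1.0, d_p = 0.0;
  std::vector<double> segment;
  segment.reserve(stride);
  for (std::size_t c = checkpoints.size(); c-- > 0;) {
    const std::size_t begin = c * stride;
    const std::size_t end = std::min(begin + stride, n);
    // Recompute the intermediates of this segment from its checkpoint.
    segment.clear();
    double xs = checkpoints[c];
    for (std::size_t i = begin; i < end; ++i) {
      segment.push_back(xs);
      xs = step(xs, p);
    }
    // Reverse sweep over the recomputed segment.
    for (std::size_t k = segment.size(); k-- > 0;) {
      d_p += d_x * std::sin(segment[k]);
      d_x = d_x * std::cos(segment[k]) * p;
    }
  }
  return d_p;
}
```

Both functions compute the same derivative; the checkpointed one trades one extra forward evaluation per segment for a smaller working set, which is the trade-off that matters in long iterations.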
CUDA
- Extended Thrust differentiation support:
  - thrust::reduce_by_key
  - thrust::sort_by_key
  - thrust::adjacent_difference
  - segmented scans and prefix-sum operations
- Added thrust::device_vector support.
- Introduced BoW (Bag-of-Words) logistic regression demo using Thrust.
- Replaced iterator-based std::move with CUDA-safe clad::move.
- Added generic functor support for Thrust transform.
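To give a sense of what a custom derivative for one of these algorithms has to compute, here is a CPU-only sketch of a pullback for adjacent difference. It uses std::adjacent_difference as a stand-in for thrust::adjacent_difference, and the function names and signatures are hypothetical; they are not Clad's actual custom-derivative interface. With y[0] = x[0] and y[i] = x[i] - x[i-1], each x[i] contributes positively to y[i] and negatively to y[i+1].

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Forward computation: y[0] = x[0], y[i] = x[i] - x[i-1].
inline std::vector<double> adjacent_difference_forward(
    const std::vector<double>& x) {
  std::vector<double> y(x.size());
  std::adjacent_difference(x.begin(), x.end(), y.begin());
  return y;
}

// Hypothetical pullback: accumulates into d_x the adjoint of x given the
// adjoint d_y of the output. x[i] appears in y[i] with coefficient +1 and
// in y[i+1] with coefficient -1.
inline void adjacent_difference_pullback(const std::vector<double>& d_y,
                                         std::vector<double>& d_x) {
  assert(d_x.size() == d_y.size());
  const std::size_t n = d_y.size();
  for (std::size_t i = 0; i < n; ++i) {
    d_x[i] += d_y[i];
    if (i + 1 < n) d_x[i] -= d_y[i + 1];
  }
}
```

As a sanity check, seeding d_y with all ones differentiates the sum of the outputs, which telescopes to the last input, so only the last entry of d_x should be nonzero.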
Misc
- Added thread-safe tape access functions:
  - Zero overhead in single-threaded mode.
  - Controlled locking for multithreaded differentiation.
- Improved handling of ill-formed code by skipping Clad runs when Clang compilation fails.
- Refined diagnostic messages and simplified deallocation functions through attribute-based design.
- Updated testing and build infrastructure:
  - Added STL test coverage (starting with ).
  - Cleaned up coverage configuration.
  - Enhanced CI stability and performance.
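One common way to get thread-safe tape access without paying for it in single-threaded builds is a lock policy chosen at compile time. The sketch below shows that general pattern only; Tape and NoLock are hypothetical names, and this is not a description of how Clad's tape functions are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// No-op lock for single-threaded use; its empty inline members are
// typically optimized away entirely.
struct NoLock {
  void lock() {}
  void unlock() {}
};

// Hypothetical tape parameterized on a lock type. Use the default NoLock
// for single-threaded code and std::mutex when several threads share a tape.
template <typename T, typename Mutex = NoLock>
class Tape {
public:
  void push(const T& value) {
    std::lock_guard<Mutex> guard(mutex_);
    data_.push_back(value);
  }

  T pop() {
    std::lock_guard<Mutex> guard(mutex_);
    assert(!data_.empty());
    T value = data_.back();
    data_.pop_back();
    return value;
  }

  std::size_t size() const {
    std::lock_guard<Mutex> guard(mutex_);
    return data_.size();
  }

private:
  mutable Mutex mutex_;
  std::vector<T> data_;
};
```

A single-threaded caller would write Tape<double> and get plain vector operations, while a multithreaded caller would write Tape<double, std::mutex> and get a lock around every access.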
Fixed Bugs
Special Kudos
This release wouldn't have happened without the efforts of our contributors, listed in the form Firstname Lastname (#commits; areas of contribution):
Petro Zarytskyi (33; reverse mode, loop checkpointing, elidable reverse passes)
Abdelrhman Elrawy (9; CUDA/Thrust derivatives, demos, logistic regression)
Vassil Vassilev (9; compiler integration, CI, and build infrastructure)
Matthew Barton (2; macOS and Python 3.12 CI fixes)
Aditi Joshi (1; thread-safe tape access implementation)
Max Andriychuk (1; analyses)
Errant (1; OpenMP differentiation support)
What's Changed
- Diagnose user-provided single argument pullbacks. by @vgvassilev in #1583
- Add Thrust-based demos by @a-elrawy in #1554
- Add custom derivatives for Thrust prefix sum operations by @a-elrawy in #1589
- Support classes when building namespace specifiers by @PetroZarytskyi in #1588
- [ci] Move debug build to latest supported clang. by @vgvassilev in #1592
- Add some testing infrastructure for STL starting with by @vgvassilev in #1553
- Find pullback before differentiating args in RMV::VisitCallExpr by @PetroZarytskyi in #1593
- Remove redundant branch and improve style. NFC by @vgvassilev in #1595
- Add a CXXCastPath to casts. Fix asserts locally with llvm21 by @vgvassilev in #1594
- add custom derivatives for thrust::adjacent_difference by @a-elrawy in #1597
- Simplify custom reverse_forw by @PetroZarytskyi in #1586
- Do not run clad if clang failed to compile a translation unit. by @vgvassilev in #1600
- Added Generic functor support for transform by @a-elrawy in #1599
- Introduce a single clad::Tag and reimplement other tags with using by @PetroZarytskyi in #1598
- Add support for conversion operators by @PetroZarytskyi in #1602
- Add support for elidable_reverse_forw to constructor_reverse_forw by @PetroZarytskyi in #1601
- Simplify RMV::VisitInitListExpr by @PetroZarytskyi in #1605
- Add support for std::reference_wrapper by @PetroZarytskyi in #1606
- Update MacOS arm jobs in ci to MacOS 26 by @mcbarton in #1607
- Schedule constructor_reverse_forw statically by @PetroZarytskyi in #1604
- Add initial thrust::device_vector support and update Thrust demos by @a-elrawy in #1608
- Simplify reference-type variable differentiation by @PetroZarytskyi in #1610
- Support pseudo-destructors in the RMV and TBR by @PetroZarytskyi in #1613
- Simplify pointer-type variable differentiation by @PetroZarytskyi in #1614
- Elide memory function reverse_forw by @PetroZarytskyi in #1615
- Add custom derivatives for thrust::sort_by_key by @a-elrawy in #1611
- Don't initialize loop counters with size_t zero by @PetroZarytskyi in #1622
- Add custom derivatives for Thrust segmented scans by @a-elrawy in #1620
- Remove unary operators that cancel each other by @PetroZarytskyi in #1618
- Add thread safe tape functions by @aditimjoshi in #1470
- Add custom derivatives for thrust::reduce_by_key by @a-elrawy in #1617
- Fix python 12 linking issue llvm 16 install osx 15 x86 by @mcbarton in #1627
- Turn on clad::restore_tracker for member function reverse_forw by @PetroZarytskyi in #1619
- Add BoW logistic regression Thrust demo by @a-elrawy in #1630
- Replace iterator-based std::move with clad::move by @PetroZarytskyi in #1626
- Support CompoundLiteralExpr in the reverse mode by @PetroZarytskyi in #1631
- [ci] Remove the unused coverage_config library by @vgvassilev in #1632
- Add support for OOP and enable respective numerical tests by @ovdiiuv in #1612
- Support loop checkpointing in the reverse mode by @PetroZarytskyi in #1616
- Add support for OpenMP in forward mode by @Errant404 in #1491
Full Changelog: v2.1...v2.2