| 
1 |  | -title: "Results from CERN Summer School 2025: Supporting Automatic   | 
2 |  | -Differentiation in CMS Combine profile likelihood scans"  | 
 | 1 | +---  | 
 | 2 | +title: "Results from CERN Summer School 2025: Supporting Automatic Differentiation in CMS Combine profile likelihood scans"  | 
3 | 3 | layout: post  | 
4 |  | -excerpt: "A CERN Summer Student 2025 project aiming at the support of   | 
5 |  | -automatic differentiation (AD) for likelihood scans in the CMS Combine   | 
6 |  | -tool to accelerate statistical inference by leveraging RooFit's AD   | 
7 |  | -support and LLVM-based gradient generation."  | 
 | 4 | +excerpt: >  | 
 | 5 | +  A CERN Summer Student 2025 project aiming at the support of  | 
 | 6 | +  automatic differentiation (AD) for likelihood scans in the CMS Combine  | 
 | 7 | +  tool to accelerate statistical inference by leveraging RooFit's AD  | 
 | 8 | +  support and LLVM-based gradient generation.  | 
8 | 9 | sitemap: false  | 
9 | 10 | author: Galin Bistrev  | 
10 | 11 | permalink: blogs/2025_galin_bistrev_results_blog/  | 
11 | 12 | banner_image: /images/blog/banner-cern.jpg  | 
12 | 13 | date: 2025-09-25  | 
13 |  | -tags: cern cms root combine c++ RooFit automatic-differentiation	  | 
 | 14 | +tags: cern cms root combine c++ RooFit automatic-differentiation  | 
14 | 15 | ---  | 
15 | 16 | 
 
  | 
16 | 17 | ### **Introduction**  | 
17 | 18 | Greetings! I’m Galin Bistrev, a fourth-year student specializing in  | 
18 |  | - Nuclear and Particle Physics at the University of Sofia "St. Kliment Ohridski."    | 
19 |  | -As part of the CERN Summer Student Programme 2025, I was working on a   | 
20 |  | -project that aimed to provide support for Automatic Differentiation   | 
 | 19 | + Nuclear and Particle Physics at the University of Sofia "St. Kliment Ohridski."  | 
 | 20 | +As part of the CERN Summer Student Programme 2025, I was working on a  | 
 | 21 | +project that aimed to provide support for Automatic Differentiation  | 
21 | 22 | (AD) in the CMS Combine tool's profile likelihood scans.  | 
22 | 23 | 
 
  | 
23 | 24 | Mentors: Jonas Rembser, Vassil Vassilev, David Lange  | 
24 | 25 | 
 
  | 
25 | 26 | ### **Description of the Project**  | 
26 | 27 | 
 
  | 
27 |  | -This project aims to enhance support for Automatic Differentiation (AD)   | 
28 |  | -in likelihood scans within the CMS Combine framework, the primary   | 
29 |  | -statistical analysis tool of the CMS experiment at CERN. Combine is   | 
30 |  | -built on top of RooFit, which has recently introduced AD to improve   | 
31 |  | -minimization techniques. By providing computationally efficient   | 
32 |  | -gradients through AD, RooFit achieves substantial performance   | 
33 |  | -improvements. In RooFit, Clad converts internal likelihood   | 
34 |  | -representations into standalone C++ code, from which gradient   | 
35 |  | -routines for AD are generated. This strategy not only speeds up the   | 
36 |  | -fitting process but also increases the portability and shareability   | 
37 |  | -of likelihood models, making them usable even by those without   | 
 | 28 | +This project aims to enhance support for Automatic Differentiation (AD)  | 
 | 29 | +in likelihood scans within the CMS Combine framework, the primary  | 
 | 30 | +statistical analysis tool of the CMS experiment at CERN. Combine is  | 
 | 31 | +built on top of RooFit, which has recently introduced AD to improve  | 
 | 32 | +minimization techniques. By providing computationally efficient  | 
 | 33 | +gradients through AD, RooFit achieves substantial performance  | 
 | 34 | +improvements. In RooFit, Clad converts internal likelihood  | 
 | 35 | +representations into standalone C++ code, from which gradient  | 
 | 36 | +routines for AD are generated. This strategy not only speeds up the  | 
 | 37 | +fitting process but also increases the portability and shareability  | 
 | 38 | +of likelihood models, making them usable even by those without  | 
38 | 39 | detailed knowledge of RooFit or Combine internals.  | 
39 | 40 | 
 
  | 
40 | 41 | ### **Brief overview of the CMS Combine engine**  | 
41 |  | -Combine is a statistical analysis framework that compares models of   | 
42 |  | -expected observations with real data. It is widely used for tasks such   | 
43 |  | -as searching for new particles or processes, setting limits on   | 
44 |  | -potential new physics, and measuring physical quantities like cross-sections.   | 
45 |  | -Although developed with High Energy Physics (HEP)   | 
46 |  | -applications in mind, Combine contains no intrinsic physics assumptions,   | 
47 |  | -making it fully general and independent of any specific analysis.   | 
48 |  | -This flexibility allows it to be applied across a broad range of   | 
 | 42 | +Combine is a statistical analysis framework that compares models of  | 
 | 43 | +expected observations with real data. It is widely used for tasks such  | 
 | 44 | +as searching for new particles or processes, setting limits on  | 
 | 45 | +potential new physics, and measuring physical quantities like cross-sections.  | 
 | 46 | +Although developed with High Energy Physics (HEP)  | 
 | 47 | +applications in mind, Combine contains no intrinsic physics assumptions,  | 
 | 48 | +making it fully general and independent of any specific analysis.  | 
 | 49 | +This flexibility allows it to be applied across a broad range of  | 
49 | 50 | statistical problems.  | 
50 | 51 | 
 
  | 
51 | 52 | Roughly, Combine performs three main functions:  | 
52 | 53 | 
 
  | 
53 | 54 | - Builds a statistical model of expected observations.  | 
54 | 55 | - Runs statistical tests comparing the model with observed data.  | 
55 |  | -- Provides tools for validating, inspecting, and understanding both the   | 
 | 56 | +- Provides tools for validating, inspecting, and understanding both the  | 
56 | 57 | model and the results of the statistical tests.  | 
57 | 58 | 
 
  | 
58 | 59 | ### **Project goals**  | 
59 | 60 | 
 
  | 
60 | 61 | In order for AD to be supported in Combine likelihood scans, a number of goals needed to be achieved:  | 
61 | 62 | 
 
  | 
62 |  | -- Refactoring some of Combine's logic into RooFit, so that Combine can   | 
63 |  | -reuse the AD-enabled minimization algorithm already present there.   | 
64 |  | -- Integrate gradient computation into likelihood scans, ensuring that   | 
65 |  | -derivatives are correctly propagated for efficient and accurate minimization.    | 
66 |  | -- Validate correctness and performance, confirming that the AD-based   | 
67 |  | -scans produce results consistent with traditional methods while   | 
 | 63 | +- Refactor some of Combine's logic into RooFit, so that Combine can  | 
 | 64 | +reuse the AD-enabled minimization algorithm already present there.  | 
 | 65 | +- Integrate gradient computation into likelihood scans, ensuring that  | 
 | 66 | +derivatives are correctly propagated for efficient and accurate minimization.  | 
 | 67 | +- Validate correctness and performance, confirming that the AD-based  | 
 | 68 | +scans produce results consistent with traditional methods while  | 
68 | 69 | offering improved performance.  | 
69 | 70 | 
 
  | 
70 | 71 | ## **Overview of Completed Work**  | 
71 | 72 | Over the course of the project, several major tasks were completed to achieve the stated objectives:  | 
72 | 73 | 
 
  | 
73 |  | -- Imported the `RooMultiPdf` class in RooFit from Combine, enabling   | 
74 |  | -switching between multiple PDF-s, applying statistical penalties,   | 
 | 74 | +- Imported the `RooMultiPdf` class from Combine into RooFit, enabling  | 
 | 75 | +switching between multiple PDFs, applying statistical penalties,  | 
75 | 76 | and supporting code generation for AD.  | 
76 |  | -   | 
77 |  | -- The implementation of the new class was made to be supported by   | 
78 |  | -`codegen` in RooFit by adding a new function in `MathFunc.h` and   | 
 | 77 | + | 
 | 78 | +- Support for the new class was added to RooFit's `codegen` by  | 
 | 79 | +adding a new function in `MathFunc.h` and  | 
79 | 80 | extending `CodegenImpl.cxx` to generate code for models making use of it.  | 
80 |  | -    | 
81 |  | -- Imported three pieces of code from Combine that handle the   | 
82 |  | -minimization procedures within the framework in RooFit's `RooMinimizer.cxx`.   | 
83 |  | -The first is a class imported by Jonas Rembser   | 
84 |  | -called `FreezeDisconnectedParametersRAII`, which automatically   | 
85 |  | -freezes and unfreezes parameters disconnected from the likelihood graph.   | 
86 |  | -The second is the function `generateOrthogonalCombinations`, which   | 
87 |  | -generates a list of index combinations by initializing a base   | 
88 |  | -configuration with all indices set to zero and then varying one category at a time.   | 
89 |  | -The third and final piece of code is a function called `reorderCombinations`,   | 
90 |  | -which takes the set of indices produced by `generateOrthogonalCombinations`   | 
91 |  | -and adjusts each combination by adding the corresponding base values   | 
92 |  | -modulo the maximum allowed index, effectively shifting the combinations   | 
 | 81 | + | 
 | 82 | +- Imported three pieces of code from Combine, which handle the framework's  | 
 | 83 | +minimization procedures, into RooFit's `RooMinimizer.cxx`.  | 
 | 84 | +The first is a class imported by Jonas Rembser  | 
 | 85 | +called `FreezeDisconnectedParametersRAII`, which automatically  | 
 | 86 | +freezes and unfreezes parameters disconnected from the likelihood graph.  | 
 | 87 | +The second is the function `generateOrthogonalCombinations`, which  | 
 | 88 | +generates a list of index combinations by initializing a base  | 
 | 89 | +configuration with all indices set to zero and then varying one category at a time.  | 
 | 90 | +The third and final piece of code is a function called `reorderCombinations`,  | 
 | 91 | +which takes the set of indices produced by `generateOrthogonalCombinations`  | 
 | 92 | +and adjusts each combination by adding the corresponding base values  | 
 | 93 | +modulo the maximum allowed index, effectively shifting the combinations  | 
93 | 94 | relative to the current best indices.  | 
94 | 95 | 
 
  | 
95 |  | -- Using the above-stated functions, the discrete profiling algorithm,   | 
96 |  | -which is the main minimization algorithm in Combine, was imported   | 
 | 96 | +- Using the above-stated functions, the discrete profiling algorithm,  | 
 | 97 | +which is the main minimization algorithm in Combine, was imported  | 
97 | 98 | into `RooMinimizer.cxx`.  | 
98 |  | -- A [tutorial](https://root.cern/doc/master/rf619__discrete__profiling_8py.html)   | 
99 |  | -was created along with a [benchmark](https://github.com/vgvassilev/clad/issues/1521),   | 
100 |  | -made by Jonas Rembser, demonstrating discrete profiling with RooMultiPdf objects   | 
101 |  | -and evaluating the performance of AD in the likelihood scans.    | 
 | 99 | +- A [tutorial](https://root.cern/doc/master/rf619__discrete__profiling_8py.html)  | 
 | 100 | +was created along with a [benchmark](https://github.com/vgvassilev/clad/issues/1521),  | 
 | 101 | +made by Jonas Rembser, demonstrating discrete profiling with RooMultiPdf objects  | 
 | 102 | +and evaluating the performance of AD in the likelihood scans.  | 
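
The index handling described above for `generateOrthogonalCombinations` and `reorderCombinations` can be illustrated with a minimal Python sketch. This is not the C++ code that lives in `RooMinimizer.cxx`: the snake_case function names, the argument names, and the reading of "maximum allowed index" as the number of states per category are assumptions made only for illustration.

```python
from typing import List


def generate_orthogonal_combinations(n_states: List[int]) -> List[List[int]]:
    """Start from the all-zero configuration, then vary one category at a time.

    n_states[i] is the (assumed) number of allowed states of category i.
    """
    base = [0] * len(n_states)
    combos = [base.copy()]
    for cat, n in enumerate(n_states):
        for idx in range(1, n):
            combo = base.copy()
            combo[cat] = idx  # only this category leaves the base value
            combos.append(combo)
    return combos


def reorder_combinations(
    combos: List[List[int]], n_states: List[int], best: List[int]
) -> List[List[int]]:
    """Shift every combination by the current best indices, wrapping around."""
    return [
        [(c + b) % n for c, b, n in zip(combo, best, n_states)]
        for combo in combos
    ]


# Two categories: the first with 3 states, the second with 2.
n_states = [3, 2]
combos = generate_orthogonal_combinations(n_states)
shifted = reorder_combinations(combos, n_states, best=[2, 1])
print(combos)
print(shifted)
```

With the shift applied, the first combination to be tried is the current best configuration, and every other combination differs from it in exactly one category.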
102 | 103 | 
 
  | 
103 | 104 | ## **Results**  | 
104 |  | -With those objectives accomplished, RooFit now provides AD support for   | 
105 |  | -discrete profiling. However, the developed benchmark indicates that AD   | 
106 |  | -does not currently improve efficiency, as the gradient code generated by   | 
107 |  | -Clad introduces overhead. Further optimization in Clad is needed to achieve   | 
108 |  | -the potential performance gains for RooFit likelihood scans. More information   | 
 | 105 | +With those objectives accomplished, RooFit now provides AD support for  | 
 | 106 | +discrete profiling. However, the developed benchmark indicates that AD  | 
 | 107 | +does not currently improve efficiency, as the gradient code generated by  | 
 | 108 | +Clad introduces overhead. Further optimization in Clad is needed to achieve  | 
 | 109 | +the potential performance gains for RooFit likelihood scans. More information  | 
109 | 110 | regarding the issue can be found at [#1521](https://github.com/vgvassilev/clad/issues/1521).  | 
110 | 111 | 
 
  | 
111 | 112 | ## **Conclusions**  | 
112 |  | -Thanks to this project, RooFit now enables AD support for discrete profiling in Combine,   | 
113 |  | -which, after addressing the current overhead in Clad, would allow for   | 
114 |  | -significantly faster and more efficient likelihood scans while maintaining   | 
115 |  | -accurate optimization of both discrete and continuous parameters.   | 
 | 113 | +Thanks to this project, RooFit now enables AD support for discrete profiling in Combine,  | 
 | 114 | +which, after addressing the current overhead in Clad, would allow for  | 
 | 115 | +significantly faster and more efficient likelihood scans while maintaining  | 
 | 116 | +accurate optimization of both discrete and continuous parameters.  | 
116 | 117 | 
 
  | 
117 | 118 | ## **Future work**  | 
118 |  | -- Further benchmarking is required to quantify the potential performance   | 
 | 119 | +- Further benchmarking is required to quantify the potential performance  | 
119 | 120 | gains from automatic differentiation.  | 
120 |  | -- Additional optimization of Clad is needed to eliminate unnecessary   | 
 | 121 | +- Additional optimization of Clad is needed to eliminate unnecessary  | 
121 | 122 | overhead in gradient generation.  | 
122 |  | -- The discrete profiling logic implemented in RooMinimizer should be   | 
123 |  | -tested across different models to evaluate the minimizer’s behavior and   | 
 | 123 | +- The discrete profiling logic implemented in RooMinimizer should be  | 
 | 124 | +tested across different models to evaluate the minimizer’s behavior and  | 
124 | 125 | robustness.  | 
125 |  | -- Extend doxygen documentation of RooMinimizer to describe treatment of discrete   | 
 | 126 | +- Extend doxygen documentation of RooMinimizer to describe treatment of discrete  | 
126 | 127 | parameters.  | 
127 |  | -- Test if the implementation of discrete profiling works also inside CMS Combine ,   | 
128 |  | -replacing their implementation in `CascadeMinimizer.cxx`.   | 
 | 128 | +- Test whether the discrete profiling implementation also works inside CMS Combine,  | 
 | 129 | +replacing the existing implementation in `CascadeMinimizer.cxx`.  | 
129 | 130 | 
 
  | 
130 | 131 | ## **Acknowledgements**  | 
131 |  | -I would like to express my sincere gratitude to the CERN Summer School   | 
132 |  | -for the opportunity to participate in such an inspiring project.   | 
133 |  | -I extend special thanks to Jonas Rembser, Vassil Vassilev, and David Lange for   | 
134 |  | -their invaluable guidance and for providing continuous learning opportunities throughout this journey.   | 
135 |  | -I am also grateful to the ROOT team for welcoming me and supporting me throughout my stay at CERN.   | 
 | 132 | +I would like to express my sincere gratitude to the CERN Summer School  | 
 | 133 | +for the opportunity to participate in such an inspiring project.  | 
 | 134 | +I extend special thanks to Jonas Rembser, Vassil Vassilev, and David Lange for  | 
 | 135 | +their invaluable guidance and for providing continuous learning opportunities throughout this journey.  | 
 | 136 | +I am also grateful to the ROOT team for welcoming me and supporting me throughout my stay at CERN.  | 
136 | 137 | 
 
  | 
137 | 138 | ## **Related Links**  | 
138 |  | -- [CMS Combine GitHub page](https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/)    | 
139 |  | -- [ROOT official repository](https://github.com/root-project/root)    | 
 | 139 | +- [CMS Combine GitHub page](https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/)  | 
 | 140 | +- [ROOT official repository](https://github.com/root-project/root)  | 
140 | 141 | - [My GitHub profile](https://github.com/GalinBistrev2)  | 
141 | 142 | - [Presentation](/assets/presentations/CaaS_Weekly_25_09_2025_Galin_Bistrev_AD_in_CMS_Combine.pdf)  | 
142 |  | - | 
143 |  | - | 
144 |  | - | 