Commit 1ddaafc

Author: Tommaso Giani
Merge branch 'hyperopt_penalty' of github.com:NNPDF/nnpdf into hyperopt_penalty
2 parents aa5b779 + dde5350 commit 1ddaafc

121 files changed: +19284 -18364 lines changed

.github/workflows/fitbot.yml

Lines changed: 19 additions & 18 deletions
@@ -8,10 +8,12 @@ on:
 
 # some general variables
 env:
-N3FIT_MAXNREP: 20 # total number of replicas to fit
-POSTFIT_NREP: 16 # requested replicas for postfit
-REFERENCE_SET: NNBOT-955eb2bcc-2025-06-17 # reference set for exact results
-STABLE_REFERENCE_SET: NNBOT-955eb2bcc-2025-06-17 # reference set for last tag
+N3FIT_MAXNREP: 30 # total number of replicas to fit
+POSTFIT_NREP: 15 # requested minimum replicas for postfit
+# IMPORTANT
+# WHEN CHANGING THE REFERENCE SET, THE NEW REFERENCE MUST BE MANUALLY UPLOADED TO THE SERVER
+REFERENCE_SET: NNBOT-99108504e-2025-11-22 # reference set for exact results
+STABLE_REFERENCE_SET: NNBOT-99108504e-2025-11-22 # reference set for last tag
 PYTHONHASHSEED: "0"
 
 jobs:
@@ -55,12 +57,12 @@ jobs:
 cd $RUNFOLDER
 cp developing.yml $RUNCARD.yml
 vp-setupfit $RUNCARD.yml
-# run n3fit replicas sequentially
+# try running the n3fit replicas in parallel
 - name: Running n3fit
 shell: bash -l {0}
 run: |
 cd $RUNFOLDER
-for ((i=1; i<=$N3FIT_MAXNREP; i+=1)); do n3fit $RUNCARD.yml $i ; done
+n3fit $RUNCARD.yml 1 -r $N3FIT_MAXNREP
 # performing DGLAP
 - name: Running dglap
 shell: bash -l {0}
@@ -79,17 +81,16 @@ jobs:
 run: |
 conda activate nnpdfenv
 cd $RUNFOLDER
-postfit $POSTFIT_NREP $RUNCARD
-res=$(vp-upload $RUNCARD 2>&1)
-echo ${res}
-while echo ${res} | grep ERROR >/dev/null
-do
-sleep 30s
-res=$(vp-upload $RUNCARD 2>&1)
-done
-url=$( echo "${res}" | grep https )
-echo "FIT_URL=$url" >> $GITHUB_ENV
-# running validphys report
+postfit $POSTFIT_NREP $RUNCARD --at-least-nrep
+ln -s ${PWD}/${RUNCARD} ${CONDA_PREFIX}/share/NNPDF/results
+tar -czf ${RUNCARD}.tar.gz ${RUNCARD}
+echo "PATH_TO_SAVE=${PWD}/${RUNCARD}.tar.gz" >> ${GITHUB_ENV}
+- name: Keep the fit as an artifact
+if: ${{ !cancelled() }}
+uses: actions/upload-artifact@v4
+with:
+name: ${{ env.RUNCARD }}.tar.gz
+path: ${{ env.PATH_TO_SAVE }}
 - name: Building and upload report
 shell: bash -l {0}
 run: |
@@ -121,6 +122,6 @@ jobs:
 - Fit Name: ${{ env.RUNCARD }}
 - Fit Report wrt master: ${{ env.REPORT_URL }}
 - Fit Report wrt latest stable reference: ${{ env.REPORT_URL_STABLE }}
-- Fit Data: ${{ env.FIT_URL }}
+- Fit Data: fit data is kept as an artifact. Please, remember to upload it to the server if the reference is changed.
 
 Check the report **carefully**, and please buy me a :coffee: , or better, a GPU :wink:!

doc/sphinx/source/n3fit/methodology.rst

Lines changed: 55 additions & 0 deletions
@@ -346,3 +346,58 @@ The figure above provides a schematic representation of this feature scaling met
 2. ``[number of points]`` points are kept (dark blue), while other points are discarded (light blue).
 3. A cubic spline function is used to do the interpolation between the points that have not been
 discarded.
+
+
+Diagonal basis
+--------------
+
+Performing the training and validation split without diagonalising the :math:`t_0` covariance matrix :math:`C_{0}` neglects
+any correlations that may be present between training and validation data. To remedy this,
+we rotate to a basis in which the correlation matrix is diagonal before performing any training/validation split.
+Starting from the definition of the :math:`\chi^2` function in the NNPDF methodology, we have
+
+.. math::
+
+   \chi^2 &= (D-T)^T C_0^{-1} (D-T) \\
+          &= (D-T)^T R^{-1} R C_0^{-1} R R^{-1} (D-T) \\
+          &= (D-T)^T R^{-1} \left( R^{-1} C_0 R^{-1} \right)^{-1} R^{-1} (D-T) \\
+          &\equiv \tilde{\epsilon}^T \rho^{-1} \tilde{\epsilon} \, ,
+
+where we have defined :math:`\tilde{\epsilon} \equiv R^{-1}(D-T)` and :math:`\rho = R^{-1} C_0 R^{-1}`.
+
+Choosing :math:`R` to be the diagonal matrix with entries :math:`R_{ii} = \sqrt{C_{0, ii}}`, we have that :math:`R^{-1} C_0 R^{-1}` coincides with the usual definition of the correlation matrix.
+
+Next, we move to the basis in which :math:`\rho` is diagonal. Writing :math:`\rho = \tilde{U}^T \tilde{\Lambda} \tilde{U}`, with :math:`\tilde{U}` orthogonal and :math:`\tilde{\Lambda}` diagonal, we find
+
+.. math::
+
+   \chi^2 &= \tilde{\epsilon}^T \rho^{-1} \tilde{\epsilon} \\
+          &= \tilde{\epsilon}^T (\tilde{U}^T \tilde{\Lambda} \tilde{U})^{-1} \tilde{\epsilon} \\
+          &= \tilde{\epsilon}^T \tilde{U}^T \tilde{\Lambda}^{-1} \tilde{U} \tilde{\epsilon} \\
+          &\equiv \dbtilde{\epsilon}^T \tilde{\Lambda}^{-1} \dbtilde{\epsilon} \, ,
+
+where on the last line we have defined
+
+.. math::
+
+   \dbtilde{\epsilon} \equiv \tilde{U}\tilde{\epsilon} = \tilde{U}R^{-1}(D-T).
+
+In index notation, this reads
+
+.. math::
+
+   \dbtilde{\epsilon}_i = \tilde{U}_{ij} \frac{(D-T)_j}{\sqrt{C_{0, jj}}} \, .
+
+The components of the transformed data :math:`\dbtilde{\epsilon}` are uncorrelated in the diagonal basis of the correlation matrix :math:`\rho`.
+Computing the covariance of :math:`\dbtilde{\epsilon}`, and using :math:`\mathbb{E}[(D-T)(D-T)^T] = C_0`,
+
+.. math::
+
+   \mathbb{E}[\dbtilde{\epsilon}\dbtilde{\epsilon}^T]
+   &= \mathbb{E} \big[ (\tilde{U} R^{-1}(D-T)) (\tilde{U} R^{-1}(D-T))^T \big] \\
+   &= \tilde{U} R^{-1} \mathbb{E}[(D-T)(D-T)^T] R^{-1} \tilde{U}^T \\
+   &= \tilde{U} \rho \tilde{U}^T \\
+   &= \tilde{U}\tilde{U}^T \tilde{\Lambda} \tilde{U}\tilde{U}^T \\
+   &= \tilde{\Lambda} \, ,
+
+we find that it is diagonal, which shows that training and validation data selected in this basis are uncorrelated (and statistically independent if the data are Gaussian distributed).
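
The "Diagonal basis" derivation added in this hunk can be checked numerically on a toy case. The following is a minimal sketch and not code from the NNPDF repository: the function names `chi2_direct` and `chi2_diagonal_basis`, the 2x2 covariance matrix and the residuals are all hypothetical, chosen only for illustration. It relies on the fact that a 2x2 correlation matrix with off-diagonal entry r has eigenvalues 1 + r and 1 - r, with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2), so no linear algebra library is needed.

```python
import math


def chi2_direct(diff, cov):
    """Return diff^T cov^{-1} diff for a 2x2 covariance, using the explicit inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))


def chi2_diagonal_basis(diff, cov):
    """Return (chi2, rotated residuals, eigenvalues) computed in the diagonal basis."""
    # R_ii = sqrt(C_ii): rescale the residuals, eps_tilde = R^{-1} (D - T)
    r = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
    eps = [diff[0] / r[0], diff[1] / r[1]]
    # Correlation matrix rho = R^{-1} C R^{-1} = [[1, rho12], [rho12, 1]]
    rho12 = cov[0][1] / (r[0] * r[1])
    # Rows of U are the eigenvectors of rho, so that rho = U^T Lambda U
    s = 1.0 / math.sqrt(2.0)
    u = [[s, s], [s, -s]]
    lam = [1.0 + rho12, 1.0 - rho12]
    # Rotate the rescaled residuals: eps_dbtilde = U eps_tilde
    rotated = [u[i][0] * eps[0] + u[i][1] * eps[1] for i in range(2)]
    # In this basis chi2 is a plain sum of squares weighted by 1 / eigenvalue
    chi2 = sum(x * x / l for x, l in zip(rotated, lam))
    return chi2, rotated, lam


# Hypothetical toy inputs: covariance C_0 and residuals D - T
cov = [[4.0, 1.2], [1.2, 1.0]]
diff = [1.0, -0.5]

direct = chi2_direct(diff, cov)
diagonal, rotated, lam = chi2_diagonal_basis(diff, cov)
print(direct, diagonal)
```

For these inputs both expressions give 1.25 up to floating-point rounding, as the derivation requires. Because the chi2 in the rotated basis is a sum of independent terms, each rotated point can be assigned to training or validation without splitting a correlated pair.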

extra_tests/regression_checks.py

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
 runcard_and_replicas = {
 "normal_fit": 72,
 "central": 16,
-"diagonal": 45,
+"no_diagonal": 45,
 "feature_scaling": 81,
 "flavour": 29,
 "no_msr": 92,
