Merge branch 'dev' into fix/error-propagation

sfluegel05 · web-flow · commit dcd0915ef67e · 2025-07-21T20:57:18.000+02:00
diff --git a/README.md b/README.md
@@ -107,17 +107,20 @@ The aggregation happens separately for all positive predictions (i.e., $p_c^{m_i
 ($p_c^{m_i} < 0.5$). If the aggregated value is larger for the positive predictions than for the negative predictions,
 the ensemble makes a positive prediction for class $c$:
 
+<img width="2297" height="114" alt="image" src="https://github.com/user-attachments/assets/2f0263ae-83ac-41ea-938a-c71b46082c22" />
+<!-- For some reason, this formula does not render in GitHub markdown. Therefore, I rendered it locally and added it as an image. The rendered formula is:
 $$
 \text{ensemble}(c) = \begin{cases} 
 1 & \text{if } \sum_{i: p_c^{m_i} \geq 0.5} [\text{confidence}_c^{m_i} \cdot \text{model_weight}_{m_i} \cdot \text{trust}_c^{m_i}] > \sum_{i: p_c^{m_i} < 0.5} [\text{confidence}_c^{m_i} \cdot \text{model_weight}_{m_i} \cdot \text{trust}_c^{m_i}] \\
 0 & \text{otherwise}
 \end{cases}
 $$
+-->
 
 Here, confidence is the model's (self-reported) confidence in its prediction, calculated as
-$$
+$
 \text{confidence}_c^{m_i} = 2|p_c^{m_i} - 0.5|
-$$
+$
 For example, if a model makes a positive prediction with $p_c^{m_i} = 0.55$, the confidence is $2|0.55 - 0.5| = 0.1$.
 One could say that the model is not very confident in its prediction and very close to switching to a negative prediction.
 If another model is very sure about its negative prediction with $p_c^{m_j} = 0.1$, the confidence is $2|0.1 - 0.5| = 0.8$.
@@ -131,7 +134,8 @@ Trust is based on the model's performance on a validation set. After training, w
 on a validation set for each class. If the `ensemble_type` is set to `wmv-f1`, the trust is calculated as 1 + the F1 score.
 If the `ensemble_type` is set to `mv` (the default), the trust is set to 1 for all models.
 
-3. After a decision has been made for each class independently, the consistency of the predictions with regard to the ChEBI hierarchy 
+### Inconsistency correction
+After a decision has been made for each class independently, the consistency of the predictions with regard to the ChEBI hierarchy 
 and disjointness axioms is checked. This is
 done in 3 steps:
 - (1) First, the hierarchy is corrected. For each pair of classes $A$ and $B$ where $A$ is a subclass of $B$ (following 
@@ -145,4 +149,4 @@ both, we select one with the higher class score and set the other to 0.
 - (3) Since the second step might have introduced new inconsistencies into the hierarchy, we repeat the first step, but 
 with a small change. For a pair of classes $A \subseteq B$ with predictions $1$ and $0$, instead of setting $B$ to $1$,
 we now set $A$ to $0$. This has the advantage that we cannot introduce new disjointness-inconsistencies and don't have
-to repeat step 2.
+to repeat step 2.
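
The per-class decision rule in the first hunk can be sketched in a few lines of Python. This is a minimal illustration of the formula as written in the README, not the repository's implementation: the function names `confidence` and `ensemble_vote` and the flat-list inputs are assumptions made for this example.

```python
from typing import Sequence


def confidence(p: float) -> float:
    """Self-reported confidence of a model for one class: 2 * |p - 0.5|."""
    return 2 * abs(p - 0.5)


def ensemble_vote(
    probs: Sequence[float],
    model_weights: Sequence[float],
    trusts: Sequence[float],
) -> int:
    """Return 1 if the weighted positive votes outweigh the negative ones, else 0.

    Hypothetical helper: each list holds one entry per model, all for the
    same class c. With ensemble_type `mv`, every trust value would be 1;
    with `wmv-f1`, it would be 1 + the model's F1 score for that class.
    """
    positive = 0.0
    negative = 0.0
    for p, weight, trust in zip(probs, model_weights, trusts):
        score = confidence(p) * weight * trust
        if p >= 0.5:
            positive += score  # model predicts the class
        else:
            negative += score  # model rejects the class
    # A tie falls into the "otherwise" branch of the formula.
    return 1 if positive > negative else 0
```

With the README's own numbers, `ensemble_vote([0.55, 0.1], [1.0, 1.0], [1.0, 1.0])` returns `0`: the hesitant positive vote (confidence about 0.1) is outweighed by the confident negative vote (confidence 0.8).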