Commit 6ef6105

More link fixes in glossary

1 parent a23b446 commit 6ef6105

6 files changed, +26 -24 lines changed


docs/getting-started/glossary.md

Lines changed: 14 additions & 14 deletions
@@ -13,7 +13,7 @@ Terms in data valuation and influence functions:
 
 The Arnoldi method approximately computes eigenvalue, eigenvector pairs of
 a symmetric matrix. For influence functions, it is used to approximate
-the [iHVP][glossary-inverse-hessian-vector-product].
+the [iHVP][glossary-iHVP].
 Introduced by [@schioppa_scaling_2022] in the context of influence functions.
 
 * [Implementation (torch)
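
For readers skimming the diff, a minimal sketch of the idea behind this glossary entry, not pyDVL's implementation: for a symmetric Hessian, Arnoldi reduces to Lanczos, so SciPy's `eigsh` can extract the dominant eigenpairs from Hessian-vector products alone and invert on that subspace. The names `hvp` and `arnoldi_ihvp` are illustrative assumptions.

```python
# Minimal sketch, assuming `hvp(x)` computes H @ x for a symmetric H
# and rank < dim. Not pyDVL's ArnoldiInfluence.
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def arnoldi_ihvp(hvp, dim: int, v: np.ndarray, rank: int = 10) -> np.ndarray:
    """Approximate H^{-1} v from the top-`rank` eigenpairs of H."""
    op = LinearOperator((dim, dim), matvec=hvp)
    eigvals, eigvecs = eigsh(op, k=rank)  # Lanczos: the symmetric case of Arnoldi
    coeffs = eigvecs.T @ v                # project v onto the Krylov eigenbasis
    return eigvecs @ (coeffs / eigvals)   # apply H^{-1} on that subspace
```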
@@ -24,7 +24,7 @@ Introduced by [@schioppa_scaling_2022] in the context of influence functions.
 
 A blocked version of [CG][glossary-conjugate-gradient], which solves several linear
 systems simultaneously. For Influence Functions, it is used to
-approximate the [iHVP][glossary-inverse-hessian-vector-product].
+approximate the [iHVP][glossary-iHVP].
 
 * [Implementation (torch)
   ][pydvl.influence.torch.influence_function_model.CgInfluence]
@@ -47,7 +47,7 @@ Introduced by [@schoch_csshapley_2022].
 
 CG is an algorithm for solving linear systems with a symmetric and
 positive-definite coefficient matrix. For Influence Functions, it is used to
-approximate the [iHVP][glossary-inverse-hessian-vector-product].
+approximate the [iHVP][glossary-iHVP].
 
 * [Implementation
   (torch)][pydvl.influence.torch.influence_function_model.CgInfluence]
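
A rough sketch of what such a solver does, under the assumption of a damped Hessian accessed only through Hessian-vector products (this is not the `CgInfluence` code; the blocked variant above applies the same recurrences to several right-hand sides at once):

```python
# Minimal sketch, assuming `hvp(x)` computes H @ x for a symmetric Hessian H;
# damping makes the system positive definite.
from scipy.sparse.linalg import LinearOperator, cg

def cg_ihvp(hvp, dim: int, v, damping: float = 1e-3):
    """Approximate (H + damping * I)^{-1} v with conjugate gradient."""
    op = LinearOperator((dim, dim), matvec=lambda x: hvp(x) + damping * x)
    x, info = cg(op, v, atol=1e-6)
    if info != 0:
        raise RuntimeError("CG did not converge")
    return x
```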
@@ -80,14 +80,14 @@ Introduced by [@wang_improving_2022].
 
 ### Eigenvalue-corrected Kronecker-Factored Approximate Curvature
 
-EKFAC builds on [K-FAC][glossary-kronecker-factored-approximate-curvature] by
-correcting for the approximation errors in the eigenvalues of the blocks of the
-Kronecker-factored approximate curvature matrix. This correction aims to refine
-the accuracy of natural gradient approximations, thus potentially offering
-better training efficiency and stability in neural networks.
+EKFAC builds on [K-FAC][glossary-k-fac] by correcting for the approximation
+errors in the eigenvalues of the blocks of the Kronecker-factored approximate
+curvature matrix. This correction aims to refine the accuracy of natural
+gradient approximations, thus potentially offering better training efficiency
+and stability in neural networks.
 
-* [Implementation (torch)
-  ][pydvl.influence.torch.influence_function_model.EkfacInfluence]
+* [Implementation
+  (torch)][pydvl.influence.torch.influence_function_model.EkfacInfluence]
 * [Documentation (torch)][eigenvalue-corrected-k-fac]
 
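To make the eigenvalue correction concrete, a hedged single-layer sketch in plain NumPy (illustrative names, not pyDVL's `EkfacInfluence`): keep K-FAC's Kronecker eigenbasis, but re-estimate the diagonal from per-sample gradients.

```python
import numpy as np

def ekfac_eigenvalues(A, B, grads):
    """A: (in, in) and B: (out, out) K-FAC factors (convention assumed);
    grads: (n, out, in) per-sample layer gradients. Returns the Kronecker
    eigenbases and the corrected diagonal E[(gradient in eigenbasis)^2]."""
    _, QA = np.linalg.eigh(A)  # eigenbasis of the input-covariance factor
    _, QB = np.linalg.eigh(B)  # eigenbasis of the output-covariance factor
    # Project each per-sample gradient into the Kronecker eigenbasis
    projected = np.einsum("ab,nbc,cd->nad", QB.T, grads, QA)
    return QA, QB, (projected**2).mean(axis=0)
```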

@@ -142,7 +142,7 @@ LiSSA is an efficient algorithm for approximating the inverse Hessian-vector
 product, enabling faster computations in large-scale machine learning
 problems, particularly for second-order optimization.
 For Influence Functions, it is used to
-approximate the [iHVP][glossary-inverse-hessian-vector-product].
+approximate the [iHVP][glossary-iHVP].
 Introduced by [@agarwal_secondorder_2017].
 
 * [Implementation (torch)
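
For intuition: the LiSSA recursion $x_{j+1} = v + (I - H)x_j$ converges to $H^{-1}v$ when $\|H\| < 1$. A hedged sketch with the usual scaling and damping tricks (illustrative names, not the pyDVL implementation):

```python
import numpy as np

def lissa_ihvp(hvp, v, n_steps=100, scale=10.0, damping=1e-3):
    """Approximate H^{-1} v via x <- v + (I - H') x, with
    H' = H/scale + damping * I. Assumes `scale` bounds the spectral
    norm of H so that the iteration converges."""
    x = np.copy(v)
    for _ in range(n_steps):
        x = v + x - (hvp(x) / scale + damping * x)
    return x / scale  # undo the scaling: (H/scale)^{-1} v = scale * H^{-1} v
```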
@@ -196,7 +196,7 @@ Introduced into data valuation by [@ghorbani_data_2019].
 
 The Nyström approximation computes a low-rank approximation to a symmetric
 positive-definite matrix via random projections. For influence functions,
-it is used to approximate the [iHVP][glossary-inverse-hessian-vector-product].
+it is used to approximate the [iHVP][glossary-iHVP].
 Introduced as sketch and solve algorithm in [@hataya_nystrom_2023], and as
 preconditioner for [PCG][glossary-preconditioned-conjugate-gradient] in
 [@frangella_randomized_2023].
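
A hedged sketch of the basic construction $H \approx (H\Omega)\,(\Omega^\top H \Omega)^{+}\,(H\Omega)^\top$ from random test vectors (illustrative code, not pyDVL's):

```python
import numpy as np

def nystrom_factors(hvp, dim, rank, seed=0):
    """Return C = H @ Omega and (Omega^T H Omega)^+ so that
    H ~= C @ W_pinv @ C.T, using only matrix-vector products with H."""
    rng = np.random.default_rng(seed)
    omega, _ = np.linalg.qr(rng.standard_normal((dim, rank)))  # random test matrix
    C = np.stack([hvp(omega[:, i]) for i in range(rank)], axis=1)  # H @ Omega
    W = omega.T @ C
    return C, np.linalg.pinv(W)
```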
@@ -220,7 +220,7 @@ performance, where the points are removed in order of their value. See
 
 A blocked version of [PCG][glossary-preconditioned-conjugate-gradient], which solves
 several linear systems simultaneously. For Influence Functions, it is used to
-approximate the [iHVP][glossary-inverse-hessian-vector-product].
+approximate the [iHVP][glossary-iHVP].
 
 * [Implementation CG (torch)
   ][pydvl.influence.torch.influence_function_model.CgInfluence]
@@ -233,7 +233,7 @@ approximate the [iHVP][glossary-inverse-hessian-vector-product].
 A preconditioned version of [CG][glossary-conjugate-gradient] for improved
 convergence, depending on the characteristics of the matrix and the
 preconditioner. For Influence Functions, it is used to
-approximate the [iHVP][glossary-inverse-hessian-vector-product].
+approximate the [iHVP][glossary-iHVP].
 
 * [Implementation CG (torch)
   ][pydvl.influence.torch.influence_function_model.CgInfluence]
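
A sketch of the preconditioning step: SciPy's `cg` takes the preconditioner as `M`, an approximation of $H^{-1}$. Here `precond_solve` is a hypothetical callable (e.g. built from a Nyström sketch, as the entry above notes), not pyDVL API.

```python
from scipy.sparse.linalg import LinearOperator, cg

def pcg_ihvp(hvp, dim, v, precond_solve):
    """Approximate H^{-1} v with CG, preconditioned by `precond_solve`,
    which should cheaply approximate x -> H^{-1} x."""
    op = LinearOperator((dim, dim), matvec=hvp)
    M = LinearOperator((dim, dim), matvec=precond_solve)
    x, info = cg(op, v, M=M, atol=1e-6)
    if info != 0:
        raise RuntimeError("PCG did not converge")
    return x
```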

notebooks/data_oob.ipynb

Lines changed: 3 additions & 3 deletions
@@ -271,8 +271,8 @@
 ]
 },
 {
-"metadata": {},
 "cell_type": "markdown",
+"metadata": {},
 "source": [
 "## Computing the OOB values\n",
 "\n",
@@ -284,10 +284,10 @@
 ]
 },
 {
-"metadata": {},
 "cell_type": "code",
-"outputs": [],
 "execution_count": null,
+"metadata": {},
+"outputs": [],
 "source": [
 "n_estimators = [50, 100, 200]\n",
 "oob_values = []\n",

notebooks/shapley_basic_spotify.ipynb

Lines changed: 3 additions & 3 deletions
@@ -442,19 +442,19 @@
 ]
 },
 {
-"metadata": {},
 "cell_type": "markdown",
+"metadata": {},
 "source": [
 "Now we configure the valuation method. Shapley values were popularized for data valuation in machine learning with _Truncated Monte Carlo Shapley_, which is a Monte Carlo approximation of the Shapley value that uses a permutation-based definition of Shapley values and truncates the iteration over a given permutation after the marginal utility drops below a certain threshold. For more information on the method, see [Ghorbani and Zou (2019)](https://proceedings.mlr.press/v97/ghorbani19c.html) or [pydvl's documentation][pydvl.valuation.methods.shapley.ShapleyValuation].\n",
 "\n",
 "Like every semi-value method, `ShapleyValuation` requires a sampler and a stopping criterion. For the former we use a [PermutationSampler][pydvl.valuation.samplers.permutation.PermutationSampler], which samples permutations of indices and computes marginal contributions incrementally. By using [RelativeTruncation][pydvl.valuation.samplers.truncation.RelativeTruncation], the processing of a permutation will stop once the utility of a subset is close to the total utility. Finally, the stopping condition for the whole algorithm is given as in the TMCS paper: we stop once the total change in the last 100 steps is below a threshold."
 ]
 },
 {
-"metadata": {},
 "cell_type": "code",
-"outputs": [],
 "execution_count": null,
+"metadata": {},
+"outputs": [],
 "source": [
 "from joblib import parallel_config\n",
 "\n",

notebooks/shapley_knn_flowers.ipynb

Lines changed: 3 additions & 1 deletion
@@ -81,7 +81,9 @@
 "cell_type": "markdown",
 "id": "75abb359",
 "metadata": {},
-"source": "The main interface is the class [KNNShapleyValuation][pydvl.valuation.methods.knn_shapley.KNNShapleyValuation]. In order to use it we need to construct two [Datasets][pydvl.valuation.dataset.Dataset] (one for training and one for evaluating), and a [KNNClassifierUtility][pydvl.valuation.utility.knn.KNNClassifierUtility]."
+"source": [
+"The main interface is the class [KNNShapleyValuation][pydvl.valuation.methods.knn_shapley.KNNShapleyValuation]. In order to use it we need to construct two [Datasets][pydvl.valuation.dataset.Dataset] (one for training and one for evaluating), and a [KNNClassifierUtility][pydvl.valuation.utility.knn.KNNClassifierUtility]."
+]
 },
 {
 "cell_type": "markdown",

src/pydvl/valuation/methods/data_oob.py

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ def __init__(
         self.score = score
 
     def fit(self, data: Dataset) -> Self:
-        """ Compute the Data-OOB values.
+        """Compute the Data-OOB values.
 
         This requires the bagging model passed upon construction to be fitted.
 
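The touched signature shows that `fit` expects a `Dataset`, returns `Self`, and that the bagging model passed at construction must already be fitted. A hedged usage sketch (the class name is assumed from the module path; the `Dataset` construction is also an assumption):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

from pydvl.valuation.dataset import Dataset
from pydvl.valuation.methods.data_oob import DataOOBValuation  # assumed name

X, y = make_classification(n_samples=200, random_state=0)
model = BaggingClassifier(n_estimators=100, random_state=0).fit(X, y)  # fitted first
valuation = DataOOBValuation(model).fit(Dataset(X, y))  # fit() returns Self
```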

src/pydvl/value/shapley/classwise.py

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@
 
 !!! tip "Analysis of Class-wise Shapley"
     For a detailed analysis of the method, with comparison to other valuation
-    techniques, please refer to the [main documentation][class-wise-shapley].
+    techniques, please refer to the [main documentation][intro-to-cw-shapley].
 
 In practice, the quantity above is estimated using Monte Carlo sampling of
 the powerset and the set of index permutations. This results in the estimator
@@ -269,7 +269,7 @@ def compute_classwise_shapley_values(
 
     where $\sigma_{:i}$ denotes the set of indices in permutation sigma before
     the position where $i$ appears and $S$ is a subset of the index set of all
-    other labels (see [the main documentation][class-wise-shapley] for
+    other labels (see [the main documentation][intro-to-cw-shapley] for
     details).
 
     Args:
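
Spelled out from the definitions visible in this hunk (a hedged reconstruction, not the elided formula from the surrounding docstring), the inner marginal contribution that the Monte Carlo estimator averages is:

```latex
% Marginal contribution of point i within its class, conditioned on a
% subset S of the indices with other labels (notation from the hunk above).
\[
  \Delta_i(\sigma, S)
  \;=\; u\big(\sigma_{:i} \cup \{i\} \mid S\big)
  \;-\; u\big(\sigma_{:i} \mid S\big)
\]
```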
