Commit 252e2df

More docs cleanup
1 parent 202c30d commit 252e2df

12 files changed

+49
-39
lines changed

src/pydvl/influence/general.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 This module contains influence calculation functions for general
 models, as introduced in (Koh and Liang, 2017)[^1].
 
-## References:
+## References
 
 [^1]: <a name="koh_liang_2017"></a>Koh, P.W., Liang, P., 2017.
 [Understanding Black-box Predictions via Influence Functions](https://proceedings.mlr.press/v70/koh17a.html).

src/pydvl/influence/torch/torch_differentiable.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 methods to invert the Hessian vector product. These are used to calculate the
 influence of a training point on the model.
 
-## References:
+## References
 
 [^1]: <a name="koh_liang_2017"></a>Koh, P.W., Liang, P., 2017.
 [Understanding Black-box Predictions via Influence Functions](https://proceedings.mlr.press/v70/koh17a.html).

src/pydvl/utils/utility.py

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@
 This module also contains Utility classes for toy games that are used
 for testing and for demonstration purposes.
 
-## References:
+## References
 
 [^1]: <a name="wang_improving_2022"></a>Wang, T., Yang, Y. and Jia, R., 2021.
 [Improving cooperative game theory-based data valuation via data utility learning](https://arxiv.org/abs/2107.06336).

src/pydvl/value/result.py

Lines changed: 11 additions & 5 deletions
@@ -207,7 +207,8 @@ class ValuationResult(
 extra_values: Additional values that can be passed as keyword arguments.
 This can contain, for example, the least core value.
 
-:raise ValueError: If input arrays have mismatching lengths.
+Raises:
+ValueError: If input arrays have mismatching lengths.
 """
 
 _indices: NDArray[IndexT]
@@ -611,7 +612,8 @@ def update(self, idx: int, new_value: float) -> "ValuationResult":
 Returns:
 A reference to the same, modified result.
 
-:raises IndexError: If the index is not found.
+Raises:
+IndexError: If the index is not found.
 """
 try:
 pos = self._positions[idx]
@@ -632,7 +634,9 @@ def update(self, idx: int, new_value: float) -> "ValuationResult":
 def get(self, idx: Integral) -> ValueItem:
 """Retrieves a ValueItem by data index, as opposed to sort index, like
 the indexing operator.
-:raises IndexError: If the index is not found.
+
+Raises:
+IndexError: If the index is not found.
 """
 try:
 pos = self._positions[idx]
@@ -662,7 +666,8 @@ def to_dataframe(
 A dataframe with two columns, one for the values, with name
 given as explained in `column`, and another with standard errors for
 approximate algorithms. The latter will be named `column+'_stderr'`.
-:raise ImportError: If pandas is not installed
+Raises:
+ImportError: If pandas is not installed
 """
 if not pandas:
 raise ImportError("Pandas required for DataFrame export")
@@ -700,7 +705,8 @@ def from_random(
 A valuation result with its status set to
 [Status.Converged][pydvl.utils.status.Status] by default.
 
-:raises ValueError: If `size` is less than 1.
+Raises:
+ValueError: If `size` is less than 1.
 
 !!! tip "Changed in version 0.6.0"
 Added parameter `total`. Check for zero size
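
The hunks in this file replace Sphinx-style `:raises X:` fields with Google-style `Raises:` sections. A minimal sketch of the target layout follows; the function `scale_values` and its arguments are hypothetical, chosen only for illustration, and are not part of pyDVL.

```python
def scale_values(values: list[float], factor: float) -> list[float]:
    """Multiplies every value by a constant factor.

    Hypothetical helper, used only to illustrate the docstring layout.

    Args:
        values: Values to scale.
        factor: Multiplier applied to each value.

    Returns:
        A new list with the scaled values.

    Raises:
        ValueError: If `values` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return [v * factor for v in values]
```

Each exception gets its own indented entry under `Raises:`, in the same way arguments are listed under `Args:`.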

src/pydvl/value/semivalues.py

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@
 instead.
 
 
-# References:
+## References
 
 [^1]: <a name="ghorbani_data_2019"></a>Ghorbani, A., Zou, J., 2019.
 [Data Shapley: Equitable Valuation of Data for Machine Learning](http://proceedings.mlr.press/v97/ghorbani19c.html).

src/pydvl/value/shapley/common.py

Lines changed: 10 additions & 9 deletions
@@ -82,19 +82,20 @@ def compute_shapley_values(
 Args:
 u: [Utility][pydvl.utils.utility.Utility] object with model, data, and
 scoring function.
-done: [StoppingCriterion][pydvl.value.stopping.StoppingCriterion] object, used to
-determine when to stop the computation for Monte Carlo methods. The
-default is to stop after 100 iterations. See the available criteria
-in [stopping][pydvl.value.stopping]. It is possible to combine several
-criteria using boolean operators. Some methods ignore this argument,
-others require specific subtypes.
+done: Object used to determine when to stop the computation for Monte
+Carlo methods. The default is to stop after 100 iterations. See the
+available criteria in [stopping][pydvl.value.stopping]. It is
+possible to combine several of them using boolean operators. Some
+methods ignore this argument, others require specific subtypes.
 n_jobs: Number of parallel jobs (available only to some methods)
-seed: Either an instance of a numpy random number generator or a seed for it.
+seed: Either an instance of a numpy random number generator or a seed
+for it.
 mode: Choose which shapley algorithm to use. See
-[ShapleyMode][pydvl.value.shapley.ShapleyMode] for a list of allowed value.
+[ShapleyMode][pydvl.value.shapley.ShapleyMode] for a list of allowed
+value.
 
 Returns:
-A [ValuationResult][pydvl.value.result.ValuationResult] object with the results.
+Object with the results.
 
 """
 progress: bool = kwargs.pop("progress", False)
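
The `done` docstring above says that stopping criteria can be combined with boolean operators. The sketch below shows one way such combinable criteria could be built with operator overloading. Every name in it (`State`, `Criterion`, `max_updates`, `stderr_below`) is hypothetical, and the actual classes in `pydvl.value.stopping` may be implemented differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class State:
    """Hypothetical snapshot of a running Monte Carlo computation."""

    n_updates: int
    stderr: float


class Criterion:
    """Hypothetical stopping criterion wrapping a predicate on the state."""

    def __init__(self, check: Callable[[State], bool]):
        self._check = check

    def __call__(self, state: State) -> bool:
        return self._check(state)

    def __and__(self, other: "Criterion") -> "Criterion":
        # Stop only when both criteria are satisfied.
        return Criterion(lambda s: self(s) and other(s))

    def __or__(self, other: "Criterion") -> "Criterion":
        # Stop as soon as either criterion is satisfied.
        return Criterion(lambda s: self(s) or other(s))

    def __invert__(self) -> "Criterion":
        # Stop exactly when the wrapped criterion is not satisfied.
        return Criterion(lambda s: not self(s))


def max_updates(limit: int) -> Criterion:
    """Satisfied once at least `limit` updates have been performed."""
    return Criterion(lambda s: s.n_updates >= limit)


def stderr_below(threshold: float) -> Criterion:
    """Satisfied once the standard error drops below `threshold`."""
    return Criterion(lambda s: s.stderr < threshold)


# Stop after 100 updates or once the standard error is small enough.
done = max_updates(100) | stderr_below(0.01)
```

Calling `done(State(n_updates=5, stderr=0.5))` returns `False`, since neither condition holds; the combined criterion becomes `True` as soon as one of them does.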

src/pydvl/value/shapley/gt.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 
 !!! tip "New in version 0.4.0"
 
-# References:
+## References
 
 [^1]: <a name="jia_efficient_2019"></a>Jia, R. et al., 2019.
 [Towards Efficient Data Valuation Based on the Shapley Value](http://proceedings.mlr.press/v89/jia19a.html).

src/pydvl/value/shapley/knn.py

Lines changed: 9 additions & 6 deletions
@@ -2,14 +2,15 @@
 This module contains Shapley computations for K-Nearest Neighbours.
 
 !!! Todo
-Implement approximate KNN computation for sublinear complexity)
+Implement approximate KNN computation for sublinear complexity
 
 
-# References:
+## References
 
-[^Y]: <a name="jia_efficient_2019a"></a>Jia, R. et al., 2019.
-[Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms](https://doi.org/10.14778/3342263.3342637).
-In: Proceedings of the VLDB Endowment, Vol. 12, No. 11, pp. 1610–1623.
+[^1]: <a name="jia_efficient_2019a"></a>Jia, R. et al., 2019. [Efficient
+Task-Specific Data Valuation for Nearest Neighbor
+Algorithms](https://doi.org/10.14778/3342263.3342637). In: Proceedings of
+the VLDB Endowment, Vol. 12, No. 11, pp. 1610–1623.
 
 """
 
@@ -43,7 +44,9 @@ def knn_shapley(u: Utility, *, progress: bool = True) -> ValuationResult:
 Returns:
 Object with the data values.
 
-:raises TypeError: If the model in the utility is not a [KNeighborsClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html)
+Raises:
+TypeError: If the model in the utility is not a
+[sklearn.neighbors.KNeighborsClassifier][].
 
 !!! tip "New in version 0.1.0"
 

src/pydvl/value/shapley/montecarlo.py

Lines changed: 11 additions & 11 deletions
@@ -21,19 +21,19 @@
 [truncated_montecarlo_shapley()][pydvl.value.shapley.truncated.truncated_montecarlo_shapley].
 
 !!! info "Also see"
-It is also possible to use [group_testing_shapley()][pydvl.value.shapley.gt.group_testing_shapley]
-to reduce the number of evaluations of the utility. The method is however
-typically outperformed by others in this module.
+It is also possible to use [group_testing_shapley()][pydvl.value.shapley.gt.group_testing_shapley]
+to reduce the number of evaluations of the utility. The method is however
+typically outperformed by others in this module.
 
 !!! info "Also see"
-Additionally, you can consider grouping your data points using
-[GroupedDataset][pydvl.utils.dataset.GroupedDataset] and computing the values of the
-groups instead. This is not to be confused with "group testing" as
-implemented in [group_testing_shapley()][pydvl.value.shapley.gt.group_testing_shapley]: any of
-the algorithms mentioned above, including Group Testing, can work to valuate
-groups of samples as units.
-
-# References:
+Additionally, you can consider grouping your data points using
+[GroupedDataset][pydvl.utils.dataset.GroupedDataset] and computing the values
+of the groups instead. This is not to be confused with "group testing" as
+implemented in [group_testing_shapley()][pydvl.value.shapley.gt.group_testing_shapley]: any of
+the algorithms mentioned above, including Group Testing, can work to valuate
+groups of samples as units.
+
+## References
 
 [^1]: <a name="ghorbani_data_2019"></a>Ghorbani, A., Zou, J., 2019.
 [Data Shapley: Equitable Valuation of Data for Machine Learning](http://proceedings.mlr.press/v97/ghorbani19c.html).

src/pydvl/value/shapley/owen.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 """
-# References:
+## References
 
 [^1]: <a name="okhrati_multilinear_2021"></a>Okhrati, R., Lipani, A., 2021.
 [A Multilinear Sampling Algorithm to Estimate Shapley Values](https://ieeexplore.ieee.org/abstract/document/9412511).
