notebooks/shapley_toy.py (+25 −71 lines)
@@ -1,34 +1,24 @@
 import marimo
 
-__generated_with = "0.3.2"
+__generated_with = "0.17.0"
 app = marimo.App()
 
 
 @app.cell
 def _(mo):
-    mo.md(
-        r"""
-        # LS-SPA Demonstration Notebook
-        """
-    )
+    mo.md(r"""# LS-SPA Demonstration Notebook""")
     return
 
 
 @app.cell
 def _(mo):
-    mo.md(
-        r"### In this notebook, we use the data from the toy example in Section 2.5 of the paper \"Efficient Shapley Performance Attribution for Least-Squares Regression\" to demonstrate how Shapley values can be computed directly for a linear, least-squares model. We then demonstrate how LS-SPA can be used to generate the same Shapley attribution. In this specific case, we have a very small number of features, so it is feasible to compute the exact Shapley attribution. When the number of features exceeds 15, this is no longer the case. LS-SPA is able to accurately approximate Shapley attributions for linear least-squares models even when the number of features exceeds 1000."
-    )
+    mo.md(r"""### In this notebook, we use the data from the toy example in Section 2.5 of the paper "Efficient Shapley Performance Attribution for Least-Squares Regression" to demonstrate how Shapley values can be computed directly for a linear, least-squares model. We then demonstrate how LS-SPA can be used to generate the same Shapley attribution. In this specific case, we have a very small number of features, so it is feasible to compute the exact Shapley attribution. When the number of features exceeds 15, this is no longer the case. LS-SPA is able to accurately approximate Shapley attributions for linear least-squares models even when the number of features exceeds 1000.""")
     return
 
 
 @app.cell
 def _(mo):
-    mo.md(
-        r"""
-        ## Imports
-        """
-    )
+    mo.md(r"""## Imports""")
     return
 
@@ -41,17 +31,12 @@ def _():
     import matplotlib.pyplot as plt
 
     from ls_spa import ls_spa
-
-    return itertools, ls_spa, math, mo, np, plt
+    return itertools, ls_spa, math, mo, np
 
 
 @app.cell
 def _(mo):
-    mo.md(
-        r"""
-        ## Data loading
-        """
-    )
+    mo.md(r"""## Data loading""")
     return
 
@@ -60,45 +45,34 @@ def _():
     N = 50
     M = 50
     p = 3
-    return M, N, p
+    return (p,)
 
 
 @app.cell
 def _(mo):
-    mo.md(
-        r"""
-        ### The rows of $X$ correspond to observations and the columns of $X$ correspond to features. We fit a least-squares model on the training data `X_train` and `y_train` and evaluate its performance on the test data `X_test` and `y_test`.
-        """
-    )
+    mo.md(r"""### The rows of $X$ correspond to observations and the columns of $X$ correspond to features. We fit a least-squares model on the training data `X_train` and `y_train` and evaluate its performance on the test data `X_test` and `y_test`.""")
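The train/fit/evaluate setup described in that cell can be sketched as follows. This is a minimal illustration with synthetic data of the same shapes as in the diff (`N = 50`, `M = 50`, `p = 3`); `theta_true` and the noise level are invented for the sketch and are not the notebook's actual toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, p = 50, 50, 3                      # sizes matching the notebook's diff
X_train = rng.standard_normal((N, p))    # rows = observations, cols = features
X_test = rng.standard_normal((M, p))
theta_true = np.array([1.0, -2.0, 0.5])  # hypothetical ground-truth coefficients
y_train = X_train @ theta_true + 0.1 * rng.standard_normal(N)
y_test = X_test @ theta_true + 0.1 * rng.standard_normal(M)

# Fit a least-squares model on the training split only.
theta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Evaluate performance as out-of-sample R^2 on the test split.
resid = y_test - X_test @ theta
r2 = 1 - resid @ resid / np.sum((y_test - y_test.mean()) ** 2)
```

With a strong signal and small noise as above, the out-of-sample R² is close to 1.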
-        ### For every ordering of our features, we remove one from our model and re-fit sequentially. For each feature, we consider the change in the $R^2$ of the model due to its addition/removal. For a single ordering, the vector of these performance differences due to each feature is a lift vector. The Shapley attribution of our model is the average of the lift vectors for every possible ordering of the features.
-        """
-    )
+    mo.md(r"""### For every ordering of our features, we remove one from our model and re-fit sequentially. For each feature, we consider the change in the $R^2$ of the model due to its addition/removal. For a single ordering, the vector of these performance differences due to each feature is a lift vector. The Shapley attribution of our model is the average of the lift vectors for every possible ordering of the features.""")
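The brute-force computation that cell describes can be sketched as below: for each of the $p!$ orderings, add features one at a time, record each feature's R² lift, and average the lift vectors. This is the exact enumeration (feasible only for small $p$), not the LS-SPA approximation itself; the data and helper names (`r2`, `exact_shapley`) are ours, not the notebook's.

```python
import itertools
import numpy as np

def r2(X_tr, y_tr, X_te, y_te, subset):
    """Out-of-sample R^2 of a least-squares fit on a feature subset.

    The empty model's R^2 is taken to be 0 by convention."""
    if not subset:
        return 0.0
    cols = list(subset)
    theta, *_ = np.linalg.lstsq(X_tr[:, cols], y_tr, rcond=None)
    resid = y_te - X_te[:, cols] @ theta
    return 1 - resid @ resid / np.sum((y_te - y_te.mean()) ** 2)

def exact_shapley(X_tr, y_tr, X_te, y_te):
    """Average the R^2 lift vectors over all p! feature orderings."""
    p = X_tr.shape[1]
    attr = np.zeros(p)
    perms = list(itertools.permutations(range(p)))
    for perm in perms:
        used, prev = [], 0.0
        for j in perm:
            used.append(j)
            cur = r2(X_tr, y_tr, X_te, y_te, used)
            attr[j] += cur - prev   # lift of feature j in this ordering
            prev = cur
    return attr / len(perms)

# Synthetic stand-in for the notebook's toy data.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
X_tr = rng.standard_normal((50, 3))
y_tr = X_tr @ theta_true + 0.1 * rng.standard_normal(50)
X_te = rng.standard_normal((50, 3))
y_te = X_te @ theta_true + 0.1 * rng.standard_normal(50)

attr = exact_shapley(X_tr, y_tr, X_te, y_te)
```

Because each ordering's lifts telescope, the attributions sum exactly to the full model's R², a defining property of the Shapley attribution.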
-        ### We display the $R^2$ for the model fitted with each subset of the features, and we also display the lift vectors corresponding to each permutation of the features.
-        """
-    )
+    mo.md(r"""### We display the $R^2$ for the model fitted with each subset of the features, and we also display the lift vectors corresponding to each permutation of the features.""")
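Tabulating the R² for every feature subset, as that cell displays, is a matter of enumerating the $2^p$ subsets. A sketch on synthetic data (again, `subset_r2` and the data are ours; with $p = 3$ there are 8 subsets):

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
p = 3
theta_true = np.array([1.0, -2.0, 0.5])   # hypothetical coefficients
X_tr = rng.standard_normal((50, p))
y_tr = X_tr @ theta_true + 0.1 * rng.standard_normal(50)
X_te = rng.standard_normal((50, p))
y_te = X_te @ theta_true + 0.1 * rng.standard_normal(50)

def subset_r2(cols):
    """Out-of-sample R^2 for a subset of feature columns (0 for the empty model)."""
    if not cols:
        return 0.0
    cols = list(cols)
    theta, *_ = np.linalg.lstsq(X_tr[:, cols], y_tr, rcond=None)
    resid = y_te - X_te[:, cols] @ theta
    return 1 - resid @ resid / np.sum((y_te - y_te.mean()) ** 2)

# One R^2 per subset, enumerated by subset size.
table = {s: subset_r2(s) for k in range(p + 1)
         for s in itertools.combinations(range(p), k)}
for subset, score in table.items():
    print(subset, round(score, 4))
```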