Commit e8ca9bd: Update docu
1 parent 27d92bf

File tree: 6 files changed (+91, -53 lines)

R/kernelshap.R

Lines changed: 8 additions & 19 deletions
@@ -8,17 +8,11 @@
 #' Otherwise, an almost exact hybrid algorithm combining exact calculations and
 #' iterative paired sampling is used, see Details.
 #'
-#' Note that (exact) Kernel SHAP is only an approximation of (exact) permutation SHAP.
-#' Thus, for up to eight features, we recommend [permshap()]. For more features,
-#' [permshap()] tends to be inefficient compared the optimized hybrid strategy
-#' of Kernel SHAP.
-#'
 #' @details
 #' The pure iterative Kernel SHAP sampling as in Covert and Lee (2021) works like this:
 #'
-#' 1. A binary "on-off" vector \eqn{z} is drawn from \eqn{\{0, 1\}^p}
-#'    such that its sum follows the SHAP Kernel weight distribution
-#'    (normalized to the range \eqn{\{1, \dots, p-1\}}).
+#' 1. A binary "on-off" vector \eqn{z} is drawn from \eqn{\{0, 1\}^p} according to
+#'    a special weighting logic.
 #' 2. For each \eqn{j} with \eqn{z_j = 1}, the \eqn{j}-th column of the
 #'    original background data is replaced by the corresponding feature value \eqn{x_j}
 #'    of the observation to be explained.
@@ -33,17 +27,14 @@
 #'
 #' This is repeated multiple times until convergence, see CL21 for details.
 #'
-#' A drawback of this strategy is that many (at least 75%) of the \eqn{z} vectors will
-#' have \eqn{\sum z \in \{1, p-1\}}, producing many duplicates. Similarly, at least 92%
-#' of the mass will be used for the \eqn{p(p+1)} possible vectors with
-#' \eqn{\sum z \in \{1, 2, p-2, p-1\}}.
+#' A drawback of this strategy is that many of the sampled \eqn{z} vectors are duplicates.
 #' This inefficiency can be fixed by a hybrid strategy, combining exact calculations
 #' with sampling.
 #'
 #' The hybrid algorithm has two steps:
 #' 1. Step 1 (exact part): There are \eqn{2p} different on-off vectors \eqn{z} with
-#'    \eqn{\sum z \in \{1, p-1\}}, covering a large proportion of the Kernel SHAP
-#'    distribution. The degree 1 hybrid will list those vectors and use them according
+#'    \eqn{\sum z \in \{1, p-1\}}.
+#'    The degree 1 hybrid will list those vectors and use them according
 #'    to their weights in the upcoming calculations. Depending on \eqn{p}, we can also go
 #'    a step further to a degree 2 hybrid by adding all \eqn{p(p-1)} vectors with
 #'    \eqn{\sum z \in \{2, p-2\}} to the process etc. The necessary predictions are
@@ -96,12 +87,10 @@
 #' worse than the hybrid strategy and should therefore only be used for
 #' studying properties of the Kernel SHAP algorithm.
 #' - `1`: Uses all \eqn{2p} on-off vectors \eqn{z} with \eqn{\sum z \in \{1, p-1\}}
-#'   for the exact part, which covers at least 75% of the mass of the Kernel weight
-#'   distribution. The remaining mass is covered by random sampling.
+#'   for the exact part. The remaining mass is covered by random sampling.
 #' - `2`: Uses all \eqn{p(p+1)} on-off vectors \eqn{z} with
-#'   \eqn{\sum z \in \{1, 2, p-2, p-1\}}. This covers at least 92% of the mass of the
-#'   Kernel weight distribution. The remaining mass is covered by sampling.
-#'   Convergence usually happens in the minimal possible number of iterations of two.
+#'   \eqn{\sum z \in \{1, 2, p-2, p-1\}}. The remaining mass is covered by sampling.
+#'   Convergence usually happens very fast.
 #' - `k>2`: Uses all on-off vectors with
 #'   \eqn{\sum z \in \{1, \dots, k, p-k, \dots, p-1\}}.
 #' @param m Even number of on-off vectors sampled during one iteration.
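The "special weighting logic" and the degree 1 exact part described in this docstring can be illustrated in a few lines of base R. This is a sketch, not the package internals: `size_mass()` is a made-up helper based on the standard Shapley kernel weight, under which the total mass of a coalition size \eqn{s = \sum z} is proportional to \eqn{(p-1) / (s(p-s))}.

```r
# Illustration only (not kernelshap internals): aggregate Kernel SHAP mass
# per coalition size s = sum(z), using the standard Shapley kernel weight.
size_mass <- function(p) {
  s <- seq_len(p - 1)
  m <- (p - 1) / (s * (p - s))  # choose(p, s) vectors times their per-vector weight
  m / sum(m)
}
round(size_mass(6), 3)  # mass peaks at s = 1 and s = p - 1

# Degree 1 exact part: the 2p on-off vectors z with sum(z) in {1, p - 1}
p <- 6
z_deg1 <- rbind(diag(p), 1 - diag(p))
stopifnot(nrow(z_deg1) == 2 * p, all(rowSums(z_deg1) %in% c(1, p - 1)))
```

The degree 2 hybrid would add the \eqn{p(p-1)} vectors with \eqn{\sum z \in \{2, p-2\}} in the same spirit.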

R/permshap.R

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 #' Furthermore, the 2p on-off vectors with sum <=1 or >=p-1 are evaluated only once,
 #' similar to the degree 1 hybrid in [kernelshap()] (but covering less weight).
 #'
-#' @param exact If `TRUE`, the algorithm will produce exact SHAP values
+#' @param exact If `TRUE`, the algorithm produces exact SHAP values
 #' with respect to the background data.
 #' The default is `TRUE` for up to eight features, and `FALSE` otherwise.
 #' @param low_memory If `FALSE` (default up to p = 15), the algorithm evaluates p
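A minimal usage sketch of the `exact` flag documented above, assuming the {kernelshap} package is installed. The model and data are invented for illustration; for `lm` objects the package's default prediction function applies.

```r
library(kernelshap)

# Invented toy setup for illustration
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
X <- mtcars[1:5, c("wt", "hp", "disp")]   # observations to explain
bg_X <- mtcars[, c("wt", "hp", "disp")]   # background data for marginal means

# exact = TRUE is the default for up to eight features
ps <- permshap(fit, X, bg_X = bg_X, exact = TRUE)
ps$S  # matrix of SHAP values, one row per explained observation
```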

README.md

Lines changed: 10 additions & 13 deletions
@@ -15,20 +15,18 @@

 The package contains three functions to crunch SHAP values:

-- **`permshap()`**: Permutation SHAP algorithm of [1]. Recommended for models with up to 8 features, or if you don't trust Kernel SHAP. Both exact and sampling versions are available.
-- **`kernelshap()`**: Kernel SHAP algorithm of [2] and [3]. Recommended for models with more than 8 features. Both exact and (pseudo-exact) sampling versions are available.
+- **`permshap()`**: Permutation SHAP algorithm of [1]. Both exact and sampling versions are available.
+- **`kernelshap()`**: Kernel SHAP algorithm of [2] and [3]. Both exact and (pseudo-exact) sampling versions are available.
 - **`additive_shap()`**: For *additive models* fitted via `lm()`, `glm()`, `mgcv::gam()`, `mgcv::bam()`, `gam::gam()`, `survival::coxph()`, or `survival::survreg()`. Exponentially faster than the model-agnostic options above, and recommended if possible.

-To explain your model, select an explanation dataset `X` (up to 1000 rows from the training data, feature columns only) and apply the recommended function. Use {shapviz} to visualize the resulting SHAP values.
+To explain your model, select an explanation dataset `X` (up to 1000 rows from the training data, feature columns only). Use {shapviz} to visualize the resulting SHAP values.

 **Remarks to `permshap()` and `kernelshap()`**

 - Both algorithms need a representative background data `bg_X` to calculate marginal means (up to 500 rows from the training data). In cases with a natural "off" value (like MNIST digits), this can also be a single row with all values set to the off value. If unspecified, 200 rows are randomly sampled from `X`.
-- Exact Kernel SHAP is an approximation to exact permutation SHAP. Since exact calculations are usually sufficiently fast for up to eight features, we recommend `permshap()` in this case. With more features, `kernelshap()` switches to a comparably fast, almost exact algorithm with faster convergence than the sampling version of permutation SHAP.
-  That is why we recommend `kernelshap()` in this case.
-- For models with interactions of order up to two, SHAP values of permutation SHAP and Kernel SHAP agree,
-  and the implemented sampling versions provide the same results as the exact versions.
-  In the presence of interactions of order three or higher, this is no longer the case.
+- Exact Kernel SHAP gives identical results to exact permutation SHAP. Both algorithms are fast for up to 8 features.
+  With more features, `kernelshap()` switches to an almost exact algorithm with faster convergence than the sampling version of permutation SHAP.
+- For models with interactions of order up to two, the sampling versions provide the same results as the exact versions.
 - For additive models, `permshap()` and `kernelshap()` give the same results as `additive_shap`
   as long as the full training data would be used as background data.

@@ -89,13 +87,12 @@ ps
 [1,] 1.1913247 0.09005467 -0.13430720 0.000682593
 [2,] -0.4931989 -0.11724773 0.09868921 0.028563613

-# Kernel SHAP gives very slightly different values because the model contains
-# interations of order > 2:
+# Indeed, Kernel SHAP gives the same:
 ks <- kernelshap(fit, X, bg_X = bg_X)
 ks
-# log_carat clarity color cut
-# [1,] 1.1911791 0.0900462 -0.13531648 0.001845958
-# [2,] -0.4927482 -0.1168517 0.09815062 0.028255442
+log_carat clarity color cut
+[1,] 1.1913247 0.09005467 -0.13430720 0.000682593
+[2,] -0.4931989 -0.11724773 0.09868921 0.028563613

 # 4) Analyze with {shapviz}
 ps <- shapviz(ps)
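The README remark about agreement of `permshap()` and `kernelshap()` with `additive_shap()` for additive models (given the full training data as background) can be checked with a sketch like the following. It assumes the {kernelshap} package; the model and data are invented for illustration.

```r
library(kernelshap)

fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)  # additive model
X <- iris[, c("Sepal.Width", "Petal.Length")]

ps <- permshap(fit, X, bg_X = X)  # full training data as background
ad <- additive_shap(fit, X)       # much faster for additive fits

max(abs(ps$S - ad$S))  # expected to be numerically close to 0
```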

backlog/compare_with_python2.R

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+library(kernelshap)
+
+n <- 100
+
+X <- data.frame(
+  x1 = (1:n) / 100,
+  x2 = log(1:n),
+  x3 = sqrt(1:n),
+  x4 = sin(1:n),
+  x5 = ((1:n) / 100)^2,
+  x6 = cos(1:n)
+)
+head(X)
+
+pf <- function(model, newdata) {
+  x <- newdata
+  x[, 1] * x[, 2] * x[, 3] * x[, 4] + x[, 5] + x[, 6]
+}
+ks <- kernelshap(pf, head(X), bg_X = X, pred_fun = pf)
+ks  # -1.196216 -1.241848 -0.9567848 3.879420 -0.33825 0.5456252
+es <- permshap(pf, head(X), bg_X = X, pred_fun = pf)
+es  # -1.196216 -1.241848 -0.9567848 3.879420 -0.33825 0.5456252
+
+set.seed(10)
+kss <- kernelshap(
+  pf,
+  head(X, 1),
+  bg_X = X,
+  pred_fun = pf,
+  hybrid_degree = 0,
+  exact = FALSE,
+  m = 9000,
+  max_iter = 100,
+  tol = 0.0005
+)
+kss  # -1.198078 -1.246508 -0.9580638 3.877532 -0.3241824 0.541247
+
+set.seed(2)
+ksh <- kernelshap(
+  pf,
+  head(X, 1),
+  bg_X = X,
+  pred_fun = pf,
+  hybrid_degree = 1,
+  exact = FALSE,
+  max_iter = 10000,
+  tol = 0.0005
+)
+ksh  # -1.191981 -1.240656 -0.9516264 3.86776 -0.3342143 0.5426642
+
+set.seed(1)
+ksh2 <- kernelshap(
+  pf,
+  head(X, 1),
+  bg_X = X,
+  pred_fun = pf,
+  hybrid_degree = 2,
+  exact = FALSE,
+  m = 10000,
+  max_iter = 10000,
+  tol = 0.0001
+)
+ksh2  # 1.195976 -1.241107 -0.9565121 3.878891 -0.3384621 0.5451118

man/kernelshap.Rd

Lines changed: 8 additions & 19 deletions
Some generated files are not rendered by default.

man/permshap.Rd

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.
