|
8 | 8 | #' Otherwise, an almost exact hybrid algorithm combining exact calculations and |
9 | 9 | #' iterative paired sampling is used, see Details. |
10 | 10 | #' |
11 | | -#' Note that (exact) Kernel SHAP is only an approximation of (exact) permutation SHAP. |
12 | | -#' Thus, for up to eight features, we recommend [permshap()]. For more features, |
13 | | -#' [permshap()] tends to be inefficient compared the optimized hybrid strategy |
14 | | -#' of Kernel SHAP. |
15 | | -#' |
16 | 11 | #' @details |
17 | 12 | #' The pure iterative Kernel SHAP sampling as in Covert and Lee (2021) works like this: |
18 | 13 | #' |
19 | | -#' 1. A binary "on-off" vector \eqn{z} is drawn from \eqn{\{0, 1\}^p} |
20 | | -#' such that its sum follows the SHAP Kernel weight distribution |
21 | | -#' (normalized to the range \eqn{\{1, \dots, p-1\}}). |
| 14 | +#' 1. A binary "on-off" vector \eqn{z} is drawn from \eqn{\{0, 1\}^p} according to |
| 15 | +#' a special weighting logic that favors small and large values of \eqn{\sum z}.
22 | 16 | #' 2. For each \eqn{j} with \eqn{z_j = 1}, the \eqn{j}-th column of the |
23 | 17 | #' original background data is replaced by the corresponding feature value \eqn{x_j} |
24 | 18 | #' of the observation to be explained. |
|
33 | 27 | #' |
34 | 28 | #' This is repeated multiple times until convergence, see CL21 for details. |
35 | 29 | #' |
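The sampling steps above can be sketched in R. This is illustrative only, not the package internals: the variable names and the tiny example data are made up, and the subset-size weights follow the standard Kernel SHAP kernel \eqn{w(s) \propto (p-1) / (\binom{p}{s} s (p-s))}.

```r
p <- 5L                              # number of features
sizes <- 1:(p - 1)                   # possible values of sum(z)

# Kernel SHAP weights over subset sizes, normalized to a probability
# distribution on {1, ..., p-1}
w <- (p - 1) / (choose(p, sizes) * sizes * (p - sizes))
probs <- w / sum(w)

# Step 1: draw sum(z), then place that many "on" positions at random
set.seed(1)
s <- sample(sizes, size = 1L, prob = probs)
z <- numeric(p)
z[sample.int(p, s)] <- 1

# Step 2: where z_j = 1, replace background values by those of the
# observation x to be explained
x  <- c(a = 1, b = 2, c = 3, d = 4, e = 5)   # observation (made up)
bg <- c(a = 9, b = 9, c = 9, d = 9, e = 9)   # one background row (made up)
masked <- ifelse(z == 1, x, bg)
```

In the actual algorithm this masking is applied to every background row, and the model is then evaluated on the masked data.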
36 | | -#' A drawback of this strategy is that many (at least 75%) of the \eqn{z} vectors will |
37 | | -#' have \eqn{\sum z \in \{1, p-1\}}, producing many duplicates. Similarly, at least 92% |
38 | | -#' of the mass will be used for the \eqn{p(p+1)} possible vectors with |
39 | | -#' \eqn{\sum z \in \{1, 2, p-2, p-1\}}. |
| 30 | +#' A drawback of this strategy is that many of the sampled \eqn{z} vectors are duplicates.
40 | 31 | #' This inefficiency can be fixed by a hybrid strategy, combining exact calculations |
41 | 32 | #' with sampling. |
42 | 33 | #' |
43 | 34 | #' The hybrid algorithm has two steps: |
44 | 35 | #' 1. Step 1 (exact part): There are \eqn{2p} different on-off vectors \eqn{z} with |
45 | | -#' \eqn{\sum z \in \{1, p-1\}}, covering a large proportion of the Kernel SHAP |
46 | | -#' distribution. The degree 1 hybrid will list those vectors and use them according |
| 36 | +#' \eqn{\sum z \in \{1, p-1\}}. |
| 37 | +#' The degree 1 hybrid will list those vectors and use them according |
47 | 38 | #' to their weights in the upcoming calculations. Depending on \eqn{p}, we can also go |
48 | 39 | #' a step further to a degree 2 hybrid by adding all \eqn{p(p-1)} vectors with |
49 | 40 | #' \eqn{\sum z \in \{2, p-2\}} to the process etc. The necessary predictions are |
|
96 | 87 | #' worse than the hybrid strategy and should therefore only be used for |
97 | 88 | #' studying properties of the Kernel SHAP algorithm. |
98 | 89 | #' - `1`: Uses all \eqn{2p} on-off vectors \eqn{z} with \eqn{\sum z \in \{1, p-1\}} |
99 | | -#' for the exact part, which covers at least 75% of the mass of the Kernel weight |
100 | | -#' distribution. The remaining mass is covered by random sampling. |
| 90 | +#' for the exact part. The remaining mass is covered by random sampling. |
101 | 91 | #' - `2`: Uses all \eqn{p(p+1)} on-off vectors \eqn{z} with |
102 | | -#' \eqn{\sum z \in \{1, 2, p-2, p-1\}}. This covers at least 92% of the mass of the |
103 | | -#' Kernel weight distribution. The remaining mass is covered by sampling. |
104 | | -#' Convergence usually happens in the minimal possible number of iterations of two. |
| 92 | +#' \eqn{\sum z \in \{1, 2, p-2, p-1\}}. The remaining mass is covered by sampling. |
| 93 | +#' Convergence usually happens very fast. |
105 | 94 | #' - `k>2`: Uses all on-off vectors with |
106 | 95 | #' \eqn{\sum z \in \{1, \dots, k, p-k, \dots, p-1\}}. |
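As a quick check of the counts stated above (\eqn{2p} exact vectors for degree 1, \eqn{p(p+1)} for degree 2), the number of on-off vectors a degree-\eqn{k} hybrid treats exactly can be computed by summing binomial coefficients. A small R sketch; the helper name is made up:

```r
# Number of on-off vectors z handled exactly by a degree-k hybrid:
# all z with sum(z) in {1, ..., k, p-k, ..., p-1}
n_exact <- function(p, k) {
  sizes <- unique(c(1:k, (p - k):(p - 1)))
  sum(choose(p, sizes))
}

p <- 10
n_exact(p, 1)  # 2p = 20
n_exact(p, 2)  # p(p+1) = 110
```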
107 | 96 | #' @param m Even number of on-off vectors sampled during one iteration. |