statement:

``` {stan, output.var = "", eval = FALSE}
n_redcards ~ binomial_logit(n_games, beta[1] + beta[2] * rating);
```

can be rewritten as:

``` {stan, output.var = "", eval = FALSE}
for(n in 1:N) {
  target += binomial_logit_lupmf(n_redcards[n] | n_games[n], beta[1] + beta[2] * rating[n]);
}
```

Now it is clear that the calculation is the sum (up to a
proportionality constant) of a number of conditionally independent
binomial log probability statements. So whenever we need to calculate
a large sum where each term is independent of all the others and
associativity holds, `reduce_sum` is useful.

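This key property, that a sum of independent log probability terms can be
accumulated in any grouping, is easy to check numerically. Below is a minimal
Python sketch with made-up data and coefficients (plain `math`, not Stan):

```python
import math

def binomial_logit_lpmf(k, n, alpha):
    # log Binomial(k | n, inv_logit(alpha)), including the normalizing constant
    p = 1.0 / (1.0 + math.exp(-alpha))
    return (math.log(math.comb(n, k))
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def partial_sum(n_redcards, n_games, rating, beta, start, end):
    # sum of the log probability terms for indices start..end (0-based, inclusive)
    return sum(binomial_logit_lpmf(n_redcards[i], n_games[i],
                                   beta[0] + beta[1] * rating[i])
               for i in range(start, end + 1))

# made-up data: 5 players
n_redcards = [0, 1, 2, 0, 3]
n_games = [10, 20, 30, 10, 40]
rating = [-1.0, 0.0, 0.5, 1.0, 2.0]
beta = [-2.0, 0.3]

N = len(n_redcards)
total = partial_sum(n_redcards, n_games, rating, beta, 0, N - 1)
split = (partial_sum(n_redcards, n_games, rating, beta, 0, 1)
         + partial_sum(n_redcards, n_games, rating, beta, 2, N - 1))
assert abs(total - split) < 1e-9
```

Any partition of the index range gives the same total, which is exactly what
lets `reduce_sum` pick the partition for us.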
To use `reduce_sum`, a function must be written that can be used to compute
arbitrary sections of this sum.

Note we used `binomial_logit_lupmf` instead of `binomial_logit_lpmf`.
This is because we only need this likelihood term up to a proportionality
constant for MCMC to work, and for some distributions this can make code
run noticeably faster. Because of the way that `_lupmf` features work,
Stan only allows them in the model block or in user-defined probability
distribution functions, and so the function we write for `reduce_sum`
will need to be a probability distribution function (suffixed with
`_lpdf` or `_lpmf`) for us to use `binomial_logit_lupmf` on the inside.
If the difference between the normalized and unnormalized functions is not
relevant for your application, you can call your `reduce_sum` function
whatever you like.

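Dropping the normalizing constant is harmless for MCMC because the normalized
and unnormalized log densities differ only by a term that does not depend on
the parameters, so it cancels in acceptance ratios. A minimal Python sketch of
this (made-up numbers, not Stan's implementation):

```python
import math

def binomial_logit_lpmf(k, n, alpha):
    # normalized: includes the log binomial coefficient
    p = 1.0 / (1.0 + math.exp(-alpha))
    return (math.log(math.comb(n, k))
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def binomial_logit_lupmf(k, n, alpha):
    # unnormalized: drops the log binomial coefficient, which depends
    # only on the data (k, n), not on the parameter alpha
    p = 1.0 / (1.0 + math.exp(-alpha))
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

# the gap between the two versions is the same for every parameter value
gaps = [binomial_logit_lpmf(3, 10, a) - binomial_logit_lupmf(3, 10, a)
        for a in (-1.5, 0.0, 2.5)]
assert max(gaps) - min(gaps) < 1e-9
assert abs(gaps[0] - math.log(math.comb(10, 3))) < 1e-9
```

Since the gap is constant in the parameters, ratios of densities (and hence
the sampler's behavior) are unchanged; skipping the constant just saves work.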
Using the reducer interface defined in
[Reduce-Sum](https://mc-stan.org/docs/2_23/functions-reference/functions-reduce.html):

``` {stan, output.var = "", eval = FALSE}
functions {
  real partial_sum_lpmf(int[] slice_n_redcards,
                        int start, int end,
                        int[] n_games,
                        vector rating,
                        vector beta) {
    return binomial_logit_lupmf(slice_n_redcards |
                                n_games[start:end],
                                beta[1] + beta[2] * rating[start:end]);
  }
}
```

The likelihood statement in the model can now be written:

``` {stan, output.var = "", eval = FALSE}
target += partial_sum_lupmf(n_redcards | 1, N, n_games, rating, beta); // Sum terms 1 to N in the likelihood
```

Note that we're calling `partial_sum_lupmf` even though we defined the
function `partial_sum_lpmf`. `partial_sum_lupmf` is implicitly defined when
we write `partial_sum_lpmf` and is a special version of the function that
signals to all the `_lupmf` calls inside it that it is okay to drop
constants. If we call `partial_sum_lpmf`, the `binomial_logit_lupmf` function
call will not drop constants (and hence will be slower).

Equivalently this partial sum could be broken into two pieces and written like:

``` {stan, output.var = "", eval = FALSE}
int M = N / 2;
target += partial_sum_lupmf(n_redcards[1:M] | 1, M, n_games, rating, beta); // Sum terms 1 to M
target += partial_sum_lupmf(n_redcards[(M + 1):N] | M + 1, N, n_games, rating, beta); // Sum terms M + 1 to N
```

By passing `partial_sum_lupmf` to `reduce_sum`, we allow Stan to
automatically break up these calculations and do them in parallel.

Notice the difference in how `n_redcards` is split in half (to reflect
likelihood:

``` {stan, output.var = "", eval = FALSE}
int grainsize = 1;
target += reduce_sum(partial_sum_lupmf, n_redcards, grainsize,
                     n_games, rating, beta);
```

be estimated automatically (`grainsize` should be left at 1 unless specific tests
are done to
[pick a different one](https://mc-stan.org/docs/2_23/stan-users-guide/reduce-sum.html#reduce-sum-grainsize)).

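The role of `grainsize` can be sketched outside of Stan: it caps how many
terms go into each partial sum handed to a worker. The Python sketch below is
illustrative only (Stan's scheduler, built on Intel TBB, partitions the work
adaptively rather than into fixed chunks like this):

```python
import math
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(terms, grainsize):
    # split the index range into chunks of at most `grainsize` terms,
    # sum each chunk on a worker thread, then add up the partial sums
    pieces = [(i, min(i + grainsize, len(terms)))
              for i in range(0, len(terms), grainsize)]
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda span: sum(terms[span[0]:span[1]]), pieces)
        return sum(partials)

terms = [math.log(n) for n in range(1, 101)]
# the chunked, threaded sum matches the plain serial sum
assert abs(parallel_sum(terms, 7) - sum(terms)) < 1e-9
```

A small `grainsize` gives the scheduler many small pieces to balance across
cores; a large one reduces scheduling overhead but limits parallelism.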
Again, if we passed `partial_sum_lpmf` to `reduce_sum` instead of
`partial_sum_lupmf` we would not take advantage of the performance benefits
of using `binomial_logit_lupmf`.

Making `grainsize` data (this makes it convenient to experiment with), the final
model is:
``` {stan, output.var = "", eval = FALSE}
functions {
  real partial_sum_lpmf(int[] slice_n_redcards,
                        int start, int end,
                        int[] n_games,
                        vector rating,
                        vector beta) {
    return binomial_logit_lupmf(slice_n_redcards |
                                n_games[start:end],
                                beta[1] + beta[2] * rating[start:end]);
  }
model {
  beta[1] ~ normal(0, 10);
  beta[2] ~ normal(0, 1);

  target += reduce_sum(partial_sum_lupmf, n_redcards, grainsize,
                       n_games, rating, beta);
}
```
to check diagnostics. `reduce_sum` is a tool for speeding up single chain
calculations, which can be useful for model development and on computers with
large numbers of cores.

We can do a quick check that these two methods are mixing with the `posterior`
package (https://github.com/stan-dev/posterior).
When parallelizing a model, this is a good check to make sure nothing is
breaking:
``` {r}
library(posterior)
summarise_draws(bind_draws(fit0$draws(), fit1$draws(), along = "chain"))
```