Commit 3b855e8

Responded to review comments for reduce_sum changes (mostly use slice instead of subset, rearranging a couple sentences, etc.) (design-doc pull request #17)
1 parent 3fcff92 commit 3b855e8

File tree

2 files changed

+14
-14
lines changed

src/stan-users-guide/_bookdown.yml

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ rmd_files: [
2929
"problematic-posteriors.Rmd",
3030
"reparameterization.Rmd",
3131
"efficiency-tuning.Rmd",
32-
"parallel-computing.Rmd",
32+
"parallelization.Rmd",
3333

3434
"part-appendices.Rmd",
3535
"style-guide.Rmd",

src/stan-users-guide/parallel-computing.Rmd renamed to src/stan-users-guide/parallelization.Rmd

Lines changed: 13 additions & 13 deletions
@@ -1,4 +1,4 @@
1-
# Parallel Computing {#parallel-computing.chapter}
1+
# Parallelization {#parallelization.chapter}
22

33
Stan has two mechanisms for parallelizing calculations used in a model: `reduce_sum` and `map_rect`.
44

@@ -8,30 +8,30 @@ The main advantages to `reduce_sum` are:
88
2. `reduce_sum` partitions the data for parallelization automatically (this is done manually in `map_rect`).
99
3. `reduce_sum` is easier to use.
1010

11-
while the advantages of `map_rect` are:
11+
The advantages of `map_rect` are:
1212

1313
1. `map_rect` returns a list of vectors, while `reduce_sum` returns only a real.
1414
2. `map_rect` can be parallelized across multiple computers, while `reduce_sum` can only be parallelized across multiple cores.
1515

1616
## Reduce-Sum { #reduce-sum }
1717

18-
```reduce_sum``` is a tool for parallelizing operations that can be represented as a sum of functions, `g: U -> real`.
18+
```reduce_sum``` parallelizes operations that can be represented as a sum of functions, `g: U -> real`.
1919

2020
For instance, for a sequence of ```x``` values of type ```U```, ```{ x1, x2, ... }```, we might compute the sum:
2121

2222
```g(x1) + g(x2) + ...```
2323

2424
In probabilistic modeling this comes up when there are N conditionally independent terms in a likelihood. Because of the conditional independence, these terms can be computed in parallel. If dependencies exist between the terms, then this isn't possible. For instance, in evaluating the log density of a Gaussian process ```reduce_sum``` would not be very useful.
2525

26-
```reduce_sum``` doesn't actually take ```g: U -> real``` as an input argument. Instead it takes ```f: U[] -> real```, where ```f``` computes the partial sum corresponding to the slice of the sequence ```x``` passed in. For instance:
26+
```reduce_sum``` takes a function ```f: U[] -> real```, where ```f``` computes the partial sum corresponding to the slice of the sequence ```x``` passed in. For instance:
2727

2828
```
2929
f({ x1, x2, x3 }) = g(x1) + g(x2) + g(x3)
3030
f({ x1 }) = g(x1)
3131
f({ x1, x2, x3 }) = f({ x1, x2 }) + f({ x3 })
3232
```
3333

34-
If the user can write a function ```f: U[] -> real``` to compute the necessary partial sums in the calculation, then we can provide a function to automatically parallelize the calculations (and this is what ```reduce_sum``` is).
34+
If the user can write a function ```f: U[] -> real``` to compute the necessary partial sums in the calculation, then ```reduce_sum``` can automatically parallelize the calculations.
3535

3636
If the set of work is represented as an array ```{ x1, x2, x3, ... }```, then mathematically it is possible to rewrite this sum with any combination of partial sums.
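
The claim that any combination of partial sums gives the same total can be checked with a small sketch. This is a Python illustration, not Stan code and not Stan's implementation: `g`, `f`, and `reduce_sum_sketch` are hypothetical names, and the slicing here is serial and fixed-size, whereas the real `reduce_sum` chooses slices automatically and evaluates them in parallel.

```python
def g(x):
    """Hypothetical per-term function g: U -> real (here, the square of x)."""
    return x * x


def f(x_slice):
    """Partial sum over one slice: f({x1, x2, ...}) = g(x1) + g(x2) + ..."""
    return sum(g(x) for x in x_slice)


def reduce_sum_sketch(func, x, grainsize):
    """Serial stand-in: cut x into consecutive slices of at most
    `grainsize` terms and add up the partial sums."""
    total = 0.0
    for start in range(0, len(x), grainsize):
        total += func(x[start:start + grainsize])
    return total


x = [1.0, 2.0, 3.0, 4.0, 5.0]

full = f(x)                             # one slice covering everything
by_twos = reduce_sum_sketch(f, x, 2)    # slices {x1,x2}, {x3,x4}, {x5}
by_ones = reduce_sum_sketch(f, x, 1)    # one term per slice
```

Because the terms are independent, `full`, `by_twos`, and `by_ones` all agree; only the grouping of the work changes.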
3737

@@ -73,16 +73,16 @@ real reduce_sum(F func, T[] x, int grainsize, T1 s1, T2 s2, ...)
7373
The user-defined partial sum functions have the signature:
7474

7575
```
76-
real func(int start, int end, T[] x_subset, T1 arg1, T2 arg2, ...)
76+
real func(int start, int end, T[] x_slice, T1 arg1, T2 arg2, ...)
7777
```
7878

7979
and take the arguments:
8080
1. ```start``` - An integer specifying the first term in the partial sum
8181
2. ```end``` - An integer specifying the last term in the partial sum (inclusive)
82-
3. ```x_subset``` - The subset of ```x``` (from ```reduce_sum```) for which this partial sum is responsible (```x[start:end]```)
82+
3. ```x_slice``` - The subset of ```x``` (from ```reduce_sum```) for which this partial sum is responsible (```x[start:end]```)
8383
4. ```arg1, arg2, ...``` - Arguments shared in every term (passed on without modification from the reduce_sum call)
8484

85-
The user-provided function ```func``` is expected to compute the ```start``` through ```end``` terms of the overall sum, accumulate them, and return that value. The user function is passed the subset ```x[start:end]``` as ```x_subset```. ```start``` and ```end``` are passed so that ```func``` can index any of the trailing ```sM``` arguments as necessary. The trailing ```sM``` arguments are passed without modification to every call of ```func```.
85+
The user-provided function ```func``` is expected to compute the ```start``` through ```end``` terms of the overall sum, accumulate them, and return that value. The user function is passed the subset ```x[start:end]``` as ```x_slice```. ```start``` and ```end``` are passed so that ```func``` can index any of the trailing ```sM``` arguments as necessary. The trailing ```sM``` arguments are passed without modification to every call of ```func```.
8686

8787
The ```reduce_sum``` call:
8888

@@ -158,10 +158,10 @@ can be written like:
158158
```
159159
functions {
160160
real partial_sum(int start, int end,
161-
int[] y_subset,
161+
int[] y_slice,
162162
vector x,
163163
vector beta) {
164-
return bernoulli_logit_lpmf(y_subset | beta[1] + beta[2] * x[start:end]);
164+
return bernoulli_logit_lpmf(y_slice | beta[1] + beta[2] * x[start:end]);
165165
}
166166
}
167167
```
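
The role of `start` and `end` in the function above can be sketched in Python. This is an illustration under stated assumptions, not Stan's implementation: `bernoulli_logit_lpmf` and `partial_sum` are hand-written Python stand-ins for the Stan functions of the same names, and `start` and `end` are treated as 1-based and inclusive, as in Stan. The sliced argument arrives already cut down, while the shared argument `x` arrives whole and must be indexed by the function itself.

```python
import math


def bernoulli_logit_lpmf(y, alpha):
    """Stand-in for Stan's function: summed log-probability of 0/1
    outcomes y given log-odds alpha."""
    total = 0.0
    for yi, a in zip(y, alpha):
        # log(inv_logit(a)) = -log1p(exp(-a)); log(1 - inv_logit(a)) = -log1p(exp(a))
        total += -math.log1p(math.exp(-a)) if yi == 1 else -math.log1p(math.exp(a))
    return total


def partial_sum(start, end, y_slice, x, beta):
    """y_slice is already y[start:end]; x is shared and passed whole,
    so it is indexed here with the 1-based inclusive start and end."""
    x_part = x[start - 1:end]
    return bernoulli_logit_lpmf(y_slice, [beta[0] + beta[1] * xi for xi in x_part])


y = [0, 1, 1, 0, 1]
x = [-1.0, 0.5, 2.0, -0.3, 1.2]
beta = [0.5, -1.0]

full = partial_sum(1, 5, y, x, beta)
split = partial_sum(1, 2, y[0:2], x, beta) + partial_sum(3, 5, y[2:5], x, beta)
```

Up to floating-point rounding, `full` and `split` are equal, which is the property that lets the slices be evaluated independently.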
@@ -195,10 +195,10 @@ be estimated automatically. The final model looks like:
195195
```
196196
functions {
197197
real partial_sum(int start, int end,
198-
int[] y_subset,
198+
int[] y_slice,
199199
vector x,
200200
vector beta) {
201-
return bernoulli_logit_lpmf(y_subset | beta[1] + beta[2] * x[start:end]);
201+
return bernoulli_logit_lpmf(y_slice | beta[1] + beta[2] * x[start:end]);
202202
}
203203
}
204204
data {
@@ -221,7 +221,7 @@ model {
221221
### Picking the Grainsize
222222

223223
The `grainsize` is a recommendation on how large each piece of parallel work is
224-
(how many terms it contains). If zero, it will be chosen automatically, but it
224+
(how many terms it contains). If one, it will be chosen automatically, but it
225225
is probably best to choose this manually for each model.
226226

227227
To figure out an appropriate grainsize, think about how many terms are in the summation
