
Commit e55a78d

Merge pull request #157 from bstatcomp/new_glms_and_broadcasting
New glms and broadcasting
2 parents 3bb2cf5 + 27c8b2a commit e55a78d

4 files changed: 380 additions, 60 deletions


src/functions-reference/binary_distributions.Rmd

Lines changed: 58 additions & 13 deletions
@@ -7,7 +7,7 @@ represents the value true and 0 the value false.
 if (knitr::is_html_output()) {
 cat(' * <a href="bernoulli-distribution.html">Bernoulli Distribution</a>\n')
 cat(' * <a href="bernoulli-logit-distribution.html">Bernoulli Distribution, Logit Parameterization</a>\n')
-cat(' * <a href="bernoulli-logit-glm.html">Bernoulli-Logit Generalised Linear Model (Logistic Regression)</a>\n')
+cat(' * <a href="bernoulli-logit-glm.html">Bernoulli-Logit Generalized Linear Model (Logistic Regression)</a>\n')
 }
 ```

@@ -110,11 +110,11 @@ $\text{logit}^{-1}(\alpha)$; may only be used in transformed data and generated
 quantities blocks. For a description of argument and return types, see section
 [vectorized PRNG functions](#prng-vectorization).
 
-## Bernoulli-Logit Generalised Linear Model (Logistic Regression) {#bernoulli-logit-glm}
+## Bernoulli-Logit Generalized Linear Model (Logistic Regression) {#bernoulli-logit-glm}
 
-Stan also supplies a single primitive for a Generalised Linear Model
-with Bernoulli likelihood and logit link function, i.e. a primitive
-for a logistic regression. This should provide a more efficient
+Stan also supplies a single function for a generalized linear model
+with Bernoulli likelihood and logit link function, i.e. a function
+for a logistic regression. This provides a more efficient
 implementation of logistic regression than a manually written
 regression in terms of a Bernoulli likelihood and matrix
 multiplication.
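
To make the comparison concrete, here is a minimal sketch of the two equivalent model formulations; the data block and the names `N`, `K`, `x`, `y`, `alpha`, and `beta` are assumptions for this example rather than part of the change above.

```stan
data {
  int<lower=0> N;                  // number of observations
  int<lower=0> K;                  // number of predictors
  matrix[N, K] x;                  // design matrix
  int<lower=0, upper=1> y[N];      // binary outcomes
}
parameters {
  real alpha;                      // intercept
  vector[K] beta;                  // regression coefficients
}
model {
  // manually written logistic regression:
  // y ~ bernoulli_logit(alpha + x * beta);

  // equivalent, using the single GLM function:
  y ~ bernoulli_logit_glm(x, alpha, beta);
}
```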
@@ -142,24 +142,69 @@ dropping constant additive terms.
 
 ### Stan Functions
 
+<!-- real; bernoulli_logit_glm_lpmf; (int y | matrix x, real alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, real alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int y | matrix x, real alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha + x * beta)`, where the same intercept `alpha` and dependent variable value `y` are used
+for all observations. The number of columns of `x` needs to match the size of the
+coefficient vector `beta`.
+If `x` and `y` are data (not parameters), this function can be executed on a GPU.
+
+<!-- real; bernoulli_logit_glm_lpmf; (int y | matrix x, vector alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, vector alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int y | matrix x, vector alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha + x * beta)`, where an intercept `alpha` is used that is
+allowed to vary by observation. The dependent variable
+value `y` is used for all observations.
+The number of rows of `x` must match the size of `alpha` and
+the number of columns of `x` needs to match the size of the coefficient vector `beta`.
+If `x` and `y` are data (not parameters), this function can be executed on a GPU.
+
+<!-- real; bernoulli_logit_glm_lpmf; (int[] y | row_vector x, real alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, real alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int[] y | row_vector x, real alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha + x * beta)`, where the same intercept `alpha` and
+the same values of the independent variables `x` are used for all observations.
+The number of columns of `x` needs to match the size of the coefficient vector `beta`.
+
+<!-- real; bernoulli_logit_glm_lpmf; (int[] y | row_vector x, vector alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, vector alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int[] y | row_vector x, vector alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha + x * beta)`, where an intercept `alpha` is used that is
+allowed to vary by observation.
+The same values of the independent variables `x` are used for all observations.
+The size of `y` must match the size of `alpha` and
+the number of columns of `x` needs to match the size of the coefficient vector `beta`.
+
+
 <!-- real; bernoulli_logit_glm_lpmf; (int[] y | matrix x, real alpha, vector beta); -->
 \index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ matrix x, real alpha, vector beta): real}|hyperpage}
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int[] y | matrix x, real alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where a constant intercept `alpha` is used
+`inv_logit(alpha + x * beta)`, where the same intercept `alpha` is used
 for all observations. The number of rows of the independent variable
-matrix `x` needs to match the length of the dependent variable vector
-`y` and the number of columns of `x` needs to match the length of the
-weight vector `beta`.
+matrix `x` needs to match the size of the dependent variable vector
+`y` and the number of columns of `x` needs to match the size of the
+coefficient vector `beta`.
+If `x` and `y` are data (not parameters), this function can be executed on a GPU.
 
 <!-- real; bernoulli_logit_glm_lpmf; (int[] y | matrix x, vector alpha, vector beta); -->
 \index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ matrix x, vector alpha, vector beta): real}|hyperpage}
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int[] y | matrix x, vector alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where an intercept `alpha` is used that is
-allowed to vary with the different observations. The number of rows of
-the independent variable matrix `x` needs to match the length of the
+`inv_logit(alpha + x * beta)`, where an intercept `alpha` is used that is
+allowed to vary by observation. The number of rows of
+the independent variable matrix `x` needs to match the size of the
 dependent variable vector `y` and `alpha` and the number of columns of
-`x` needs to match the length of the weight vector `beta`.
+`x` needs to match the size of the coefficient vector `beta`.
+If `x` and `y` are data (not parameters), this function can be executed on a GPU.
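
The scalar-`y` and row-vector-`x` signatures added above broadcast a single outcome or a single predictor row across all observations. A hedged fragment, reusing the `data` and `parameters` blocks from the earlier sketch (the quantity names are assumptions for this example):

```stan
generated quantities {
  // scalar y broadcast over all rows of x: joint log probability mass
  // of outcome 1 for every observation
  real lp_if_all_ones = bernoulli_logit_glm_lpmf(1 | x, alpha, beta);

  // row vector x broadcast over all elements of y: joint log probability
  // mass of the observed outcomes if every observation had the predictors
  // in the first row of x
  real lp_shared_row = bernoulli_logit_glm_lpmf(y | x[1], alpha, beta);
}
```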

src/functions-reference/bounded_discrete_distributions.Rmd

Lines changed: 151 additions & 5 deletions
@@ -10,7 +10,9 @@ cat(' * <a href="binomial-distribution-logit-parameterization.html">Binomial Dis
 cat(' * <a href="beta-binomial-distribution.html">Beta-Binomial Distribution</a>\n')
 cat(' * <a href="hypergeometric-distribution.html">Hypergeometric Distribution</a>\n')
 cat(' * <a href="categorical-distribution.html">Categorical Distribution</a>\n')
+cat(' * <a href="categorical-logit-glm.html">Categorical Logit Generalized Linear Model (Softmax Regression)</a>\n')
 cat(' * <a href="ordered-logistic-distribution.html">Ordered Logistic Distribution</a>\n')
+cat(' * <a href="ordered-logistic-glm.html">Ordered Logistic Generalized Linear Model (Ordinal Regression)</a>\n')
 cat(' * <a href="ordered-probit-distribution.html">Ordered Probit Distribution</a>\n')
 }
 ```
@@ -238,8 +240,8 @@ an $N$-simplex (i.e., has nonnegative entries summing to one), then
 for $y \in \{1,\ldots,N\}$, \[ \text{Categorical}(y~|~\theta) =
 \theta_y. \] In addition, Stan provides a log-odds scaled categorical
 distribution, \[ \text{CategoricalLogit}(y~|~\beta) =
-\text{Categorical}(y~|~\text{softmax}(\beta)). \] See section
-[softmax](#softmax) for the definition of the softmax function.
+\text{Categorical}(y~|~\text{softmax}(\beta)). \]
+See the [definition of the softmax function](#softmax).
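
As a small illustration of the identity just above, the two sampling statements in this sketch are interchangeable (the names `N`, `y`, and `beta` are assumptions for this example):

```stan
data {
  int<lower=2> N;                  // number of categories
  int<lower=1, upper=N> y;         // a single categorical outcome
}
parameters {
  vector[N] beta;                  // log odds for each category
}
model {
  // log-odds scaled form:
  y ~ categorical_logit(beta);
  // equivalent formulation applying softmax explicitly:
  // y ~ categorical(softmax(beta));
}
```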
 
 ### Sampling Statement
 
@@ -296,6 +298,83 @@ Generate a categorical variate with outcome in range $1:N$ from
 log-odds vector beta; may only be used in transformed data and generated
 quantities blocks
 
+## Categorical Logit Generalized Linear Model (Softmax Regression) {#categorical-logit-glm}
+
+Stan also supplies a single function for a generalized linear model
+with categorical likelihood and logit link function, i.e. a function
+for a softmax regression. This provides a more efficient
+implementation of softmax regression than a manually written
+regression in terms of a categorical likelihood and matrix
+multiplication.
+
+Note that the implementation does not put any restrictions on the coefficient matrix $\beta$. It is up to the user to use a reference category, a suitable prior, or some other means of identifiability. See the Multi-logit section in the [Stan User's Guide](https://mc-stan.org/users/documentation/).
+
+### Probability Mass Function
+
+If $N,M,K \in \mathbb{N}$, $N,M,K > 0$, and if $x\in \mathbb{R}^{M\times K}, \alpha \in \mathbb{R}^N, \beta\in \mathbb{R}^{K\times N}$, then for $y \in \{1,\ldots,N\}^M$,
+\[ \text{CategoricalLogitGLM}(y~|~x,\alpha,\beta) = \\[5pt]
+\prod_{1\leq i \leq M}\text{CategoricalLogit}(y_i~|~\alpha+x_i\cdot\beta) = \\[15pt]
+\prod_{1\leq i \leq M}\text{Categorical}(y_i~|~\text{softmax}(\alpha+x_i\cdot\beta)). \]
+See the [definition of the softmax function](#softmax).
+
+### Sampling Statement
+
+`y ~ ` **`categorical_logit_glm`**`(x, alpha, beta)`
+
+Increment target log probability density with `categorical_logit_glm_lpmf(y | x, alpha, beta)`
+dropping constant additive terms.
+<!-- real; categorical_logit_glm ~; -->
+\index{{\tt \bfseries categorical\_logit\_glm }!sampling statement|hyperpage}
+
+
+### Stan Functions
+
+<!-- real; categorical_logit_glm_lpmf; (int y | row_vector x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int y \textbar\ row\_vector x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int y | row_vector x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcome `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The size of the independent variable row vector `x` needs to match the number of rows of the
+coefficient matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the coefficient matrix `beta`.
+
+<!-- real; categorical_logit_glm_lpmf; (int y | matrix x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int y | matrix x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcome `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The same vector of intercepts `alpha` and the same dependent variable value `y` are used for all instances.
+The number of columns of the independent variable `x` needs to match the number of rows of the
+coefficient matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the coefficient matrix `beta`. If `x` and `y` are data (not parameters), this function can be executed on a GPU.
+
+<!-- real; categorical_logit_glm_lpmf; (int[] y | row_vector x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int[] y | row_vector x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcomes `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The same vector of intercepts `alpha` and the same row vector of independent variables `x` are used for all instances.
+The size of the independent variable row vector `x` needs to match the number of rows of the
+coefficient matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the coefficient matrix `beta`.
+
+<!-- real; categorical_logit_glm_lpmf; (int[] y | matrix x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ matrix x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int[] y | matrix x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcomes `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The same vector of intercepts `alpha` is used for all instances.
+The number of rows of the independent variable
+matrix `x` needs to match the size of the dependent variable vector
+`y`. The number of columns of the independent variable `x` needs to match the number of rows of the
+coefficient matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the coefficient matrix `beta`. If `x` and `y` are data (not parameters), this function can be executed on a GPU.
+
+
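
Following the identifiability note above, one possible sketch puts proper priors on the otherwise unrestricted coefficients; the block contents and the names `N`, `M`, `K`, `x`, and `y` are assumptions for this example, and pinning a reference category is an equally valid alternative.

```stan
data {
  int<lower=2> N;                  // number of outcome categories
  int<lower=0> M;                  // number of observations
  int<lower=1> K;                  // number of predictors
  matrix[M, K] x;                  // design matrix
  int<lower=1, upper=N> y[M];      // observed categories
}
parameters {
  vector[N] alpha;                 // per-category intercepts
  matrix[K, N] beta;               // per-category coefficients
}
model {
  // proper priors keep the otherwise non-identified coefficients well behaved
  alpha ~ normal(0, 5);
  to_vector(beta) ~ normal(0, 5);

  y ~ categorical_logit_glm(x, alpha, beta);
}
```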
 ## Ordered Logistic Distribution
 
 ### Probability Mass Function
@@ -330,14 +409,81 @@ dropping constant additive terms.
 
 `real` **`ordered_logistic_lpmf`**`(ints k | vector eta, vectors c)`<br>\newline
 The log ordered logistic probability mass of k given linear predictors
-eta, and cutpoints c.
+`eta`, and cutpoints `c`.
 
 <!-- int; ordered_logistic_rng; (real eta, vector c); -->
 \index{{\tt \bfseries ordered\_logistic\_rng }!{\tt (real eta, vector c): int}|hyperpage}
 
 `int` **`ordered_logistic_rng`**`(real eta, vector c)`<br>\newline
-Generate an ordered logistic variate with linear predictor eta and
-cutpoints c; may only be used in transformed data and generated quantities blocks
+Generate an ordered logistic variate with linear predictor `eta` and
+cutpoints `c`; may only be used in transformed data and generated quantities blocks
+
+## Ordered Logistic Generalized Linear Model (Ordinal Regression) {#ordered-logistic-glm}
+
+### Probability Mass Function
+
+If $N,M,K \in \mathbb{N}$ with $N, M > 0$, $K > 2$, $c \in \mathbb{R}^{K-1}$ such that
+$c_k < c_{k+1}$ for $k \in \{1,\ldots,K-2\}$, and $x\in \mathbb{R}^{N\times M}, \beta\in \mathbb{R}^M$, then for $y \in \{1,\ldots,K\}^N$,
+\[\text{OrderedLogisticGLM}(y~|~x,\beta,c) = \\[4pt]
+\prod_{1\leq i \leq N}\text{OrderedLogistic}(y_i~|~x_i\cdot \beta,c) = \\[17pt]
+\prod_{1\leq i \leq N}\left\{ \begin{array}{ll}
+1 - \text{logit}^{-1}(x_i\cdot \beta - c_1) & \text{if } y_i = 1, \\[4pt]
+\text{logit}^{-1}(x_i\cdot \beta - c_{y_i-1}) - \text{logit}^{-1}(x_i\cdot \beta - c_{y_i}) & \text{if } 1 < y_i < K, \text{and} \\[4pt]
+\text{logit}^{-1}(x_i\cdot \beta - c_{K-1}) - 0 & \text{if } y_i = K.
+\end{array} \right. \] The $y_i=K$
+case is written with the redundant subtraction of zero to illustrate
+the parallelism of the cases; the $y_i=1$ and $y_i=K$ edge cases can be
+subsumed into the general definition by setting $c_0 = -\infty$ and
+$c_K = +\infty$ with $\text{logit}^{-1}(-\infty) = 0$ and
+$\text{logit}^{-1}(\infty) = 1$.
+
+### Sampling Statement
+
+`y ~ ` **`ordered_logistic_glm`**`(x, beta, c)`
+
+Increment target log probability density with `ordered_logistic_glm_lpmf(y | x, beta, c)`
+dropping constant additive terms.
+<!-- real; ordered_logistic_glm ~; -->
+\index{{\tt \bfseries ordered\_logistic\_glm }!sampling statement|hyperpage}
+
+### Stan Functions
+
+<!-- real; ordered_logistic_glm_lpmf; (int y | row_vector x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int y \textbar\ row\_vector x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int y | row_vector x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of `y` given linear predictor `x * beta` and cutpoints `c`.
+The size of the independent variable row vector `x` needs to match the size of the coefficient vector `beta`.
+The cutpoints `c` must be ordered.
+
+<!-- real; ordered_logistic_glm_lpmf; (int y | matrix x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int y | matrix x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of `y` given linear predictors `x * beta` and cutpoints `c`.
+The same value of the dependent variable `y` is used for all instances.
+The number of columns of the independent variable matrix `x` needs to match the size of the coefficient vector `beta`.
+The cutpoints `c` must be ordered. If `x` and `y` are data (not parameters), this function can be executed on a GPU.
+
+<!-- real; ordered_logistic_glm_lpmf; (int[] y | row_vector x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int[] y | row_vector x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of `y` given linear predictor `x * beta` and cutpoints `c`.
+The same row vector of independent variables `x` is used for all instances.
+The size of the independent variable row vector `x` needs to match the size of the coefficient vector `beta`.
+The cutpoints `c` must be ordered.
+
+<!-- real; ordered_logistic_glm_lpmf; (int[] y | matrix x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int[] y \textbar\ matrix x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int[] y | matrix x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of `y` given linear predictors
+`x * beta` and cutpoints `c`.
+The number of rows of the independent variable matrix `x` needs to match the size of the dependent variable vector `y`.
+The number of columns of the independent variable matrix `x` needs to match the size of the coefficient vector `beta`.
+The cutpoints `c` must be ordered. If `x` and `y` are data (not parameters), this function can be executed on a GPU.
+
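
A minimal sketch of an ordinal regression using the GLM function documented above; the data block and the names `N`, `M`, `K`, `x`, and `y` are assumptions for this example.

```stan
data {
  int<lower=0> N;                  // number of observations
  int<lower=1> M;                  // number of predictors
  int<lower=3> K;                  // number of ordered categories
  matrix[N, M] x;                  // design matrix
  int<lower=1, upper=K> y[N];      // ordinal outcomes
}
parameters {
  vector[M] beta;                  // regression coefficients
  ordered[K - 1] c;                // cutpoints, declared ordered as required
}
model {
  // manually written ordinal regression:
  // y ~ ordered_logistic(x * beta, c);

  // equivalent, using the GLM function:
  y ~ ordered_logistic_glm(x, beta, c);
}
```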
 
 ## Ordered Probit Distribution
 