Added docs for two new GLMs and broadcasting in GLMs

t4c1 · t4c1 · commit a20c2ce7fa0e · 2020-03-05T14:06:37.000+01:00
diff --git a/src/functions-reference/binary_distributions.Rmd b/src/functions-reference/binary_distributions.Rmd
@@ -142,6 +142,49 @@ dropping constant additive terms.
 
 ### Stan Functions
 
+<!-- real; bernoulli_logit_glm_lpmf; (int y | matrix x, real alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int y \textbar\ matrix x, real alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int y | matrix x, real alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha+x*beta)`, where a constant intercept `alpha` and dependent variable value `y` are used
+for all observations. The number of columns of `x` needs to match the length of the
+weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
+<!-- real; bernoulli_logit_glm_lpmf; (int y | matrix x, vector alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int y \textbar\ matrix x, vector alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int y | matrix x, vector alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha+x*beta)`, where an intercept `alpha` is used that is
+allowed to vary with the different observations. The dependent variable
+value `y` is used for all observations.
+The number of rows of `x` must match the length of `alpha` and 
+the number of columns of `x` needs to match the length of the weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
+<!-- real; bernoulli_logit_glm_lpmf; (int[] y | row_vector x, real alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int[] y \textbar\ row\_vector x, real alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int[] y | row_vector x, real alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha+x*beta)`, where a constant intercept `alpha` and
+the same independent variable values `x` are used for all observations.
+The number of columns of `x` needs to match the length of the weight vector `beta`.
+
+<!-- real; bernoulli_logit_glm_lpmf; (int[] y | row_vector x, vector alpha, vector beta); -->
+\index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int[] y \textbar\ row\_vector x, vector alpha, vector beta): real}|hyperpage}
+
+`real` **`bernoulli_logit_glm_lpmf`**`(int[] y | row_vector x, vector alpha, vector beta)`<br>\newline
+The log Bernoulli probability mass of y given chance of success
+`inv_logit(alpha+x*beta)`, where an intercept `alpha` is used that is
+allowed to vary with the different observations.
+The same independent variable values `x` are used for all observations.
+The length of `y` must match the length of `alpha` and 
+the number of columns of `x` needs to match the length of the weight vector `beta`.
+
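+As a minimal sketch of the broadcasting signatures above (an illustration, not
+taken from the Stan sources), the model below passes a single `row_vector x`
+that is shared by all observations. The names `N` and `K` and the priors are
+assumptions made for the example, and the array declaration follows the
+`int[]`-era syntax used in this document; newer Stan releases write it as
+`array[N] int y`.
+
+```stan
+data {
+  int<lower=1> N;                  // number of observations (assumed name)
+  int<lower=1> K;                  // number of predictors (assumed name)
+  int<lower=0, upper=1> y[N];      // binary outcomes
+  row_vector[K] x;                 // one predictor row shared by all observations
+}
+parameters {
+  real alpha;                      // constant intercept
+  vector[K] beta;                  // weights; length matches columns of x
+}
+model {
+  alpha ~ normal(0, 1);            // illustrative priors only
+  beta ~ normal(0, 1);
+  // x is broadcast: every y[n] uses the linear predictor alpha + x * beta
+  y ~ bernoulli_logit_glm(x, alpha, beta);
+  // same density written without the GLM primitive:
+  // y ~ bernoulli_logit(alpha + x * beta);
+}
+```
+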
+
 <!-- real; bernoulli_logit_glm_lpmf; (int[] y | matrix x, real alpha, vector beta); -->
 \index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int[] y \textbar\ matrix x, real alpha, vector beta): real}|hyperpage}
 
@@ -152,6 +195,7 @@ for all observations. The number of rows of the independent variable
 matrix `x` needs to match the length of the dependent variable vector
 `y` and the number of columns of `x` needs to match the length of the
 weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 <!-- real; bernoulli_logit_glm_lpmf; (int[] y | matrix x, vector alpha, vector beta); -->
 \index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int[] y \textbar\ matrix x, vector alpha, vector beta): real}|hyperpage}
@@ -163,3 +207,4 @@ allowed to vary with the different observations. The number of rows of
 the independent variable matrix `x` needs to match the length of the
 dependent variable vector `y` and `alpha` and the number of columns of
 `x` needs to match the length of the weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
diff --git a/src/functions-reference/bounded_discrete_distributions.Rmd b/src/functions-reference/bounded_discrete_distributions.Rmd
@@ -10,7 +10,9 @@ cat(' * <a href="binomial-distribution-logit-parameterization.html">Binomial Dis
 cat(' * <a href="beta-binomial-distribution.html">Beta-Binomial Distribution</a>\n')
 cat(' * <a href="hypergeometric-distribution.html">Hypergeometric Distribution</a>\n')
 cat(' * <a href="categorical-distribution.html">Categorical Distribution</a>\n')
+cat(' * <a href="categorical-logit-glm.html">Categorical Logit Generalised Linear Model (Softmax Regression)</a>\n')
 cat(' * <a href="ordered-logistic-distribution.html">Ordered Logistic Distribution</a>\n')
+cat(' * <a href="ordered-logistic-glm.html">Ordered Logistic Generalised Linear Model (Ordinal Regression)</a>\n')
 cat(' * <a href="ordered-probit-distribution.html">Ordered Probit Distribution</a>\n')
 }
 ```
@@ -296,6 +298,81 @@ Generate a categorical variate with outcome in range $1:N$ from
 log-odds vector beta; may only be used in transformed data and generated
 quantities blocks
 
+## Categorical Logit Generalised Linear Model (Softmax Regression) {#categorical-logit-glm}
+
+Stan also supplies a single primitive for a Generalised Linear Model
+with Categorical likelihood and logit link function, i.e. a primitive
+for a softmax regression. This should provide a more efficient
+implementation of softmax regression than a manually written
+regression in terms of a Categorical likelihood and matrix
+multiplication.
+
+### Probability Mass Functions
+
+If $N,M,K \in \mathbb{N}$, $N,M,K > 0$, and if $x\in \mathbb{R}^{M\cdot K}, \alpha \in \mathbb{R}^N, \beta\in \mathbb{R}^{K\cdot N}$, then for $y \in \{1,\ldots,N\}^M$, 
+\[ \text{CategoricalLogitGLM}(y~|~x,\alpha,\beta) = \\[5pt]
+\prod_{1\leq i \leq M}\text{CategoricalLogit}(y_i~|~\alpha+x_i\cdot\beta) = \\[15pt]
+\prod_{1\leq i \leq M}\text{Categorical}(y_i~|~\text{softmax}(\alpha+x_i\cdot\beta)). \]
+Here $x_i$ denotes the $i$-th row of $x$.
+See section [softmax](#softmax) for the definition of the softmax function.
+
+### Sampling Statement
+
+`y ~ ` **`categorical_logit_glm`**`(x, alpha, beta)`
+
+Increment target log probability density with `categorical_logit_glm_lpmf(y | x, alpha, beta)`
+dropping constant additive terms.
+<!-- real; categorical_logit_glm ~; -->
+\index{{\tt \bfseries categorical\_logit\_glm }!sampling statement|hyperpage}
+
+
+### Stan Functions
+
+<!-- real; categorical_logit_glm_lpmf; (int y | row_vector x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int y \textbar\ row\_vector x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int y | row_vector x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcome `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The size of the independent variable row vector `x` needs to match the number of rows of the
+weight matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the weight matrix `beta`.
+
+<!-- real; categorical_logit_glm_lpmf; (int y | matrix x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int y | matrix x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcome `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The same vector of intercepts `alpha` and the same dependent variable value `y` are used for all instances.
+The number of columns of the independent variable matrix `x` needs to match the number of rows of the
+weight matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the weight matrix `beta`. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
+<!-- real; categorical_logit_glm_lpmf; (int[] y | row_vector x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int[] y | row_vector x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcomes `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
+The same vector of intercepts `alpha` and the same row vector of independent variables `x` are used for all instances.
+The size of the independent variable row vector `x` needs to match the number of rows of the
+weight matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the weight matrix `beta`.
+
+<!-- real; categorical_logit_glm_lpmf; (int[] y | matrix x, vector alpha, matrix beta); -->
+\index{{\tt \bfseries categorical\_logit\_glm\_lpmf }!{\tt (int[] y \textbar\ matrix x, vector alpha, matrix beta): real}|hyperpage}
+
+`real` **`categorical_logit_glm_lpmf`**`(int[] y | matrix x, vector alpha, matrix beta)`<br>\newline
+The log categorical probability mass function with outcomes `y` in
+$1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`. 
+The same vector of intercepts `alpha` is used for all instances.
+The number of rows of the independent variable
+matrix `x` needs to match the length of the dependent variable vector
+`y`. The number of columns of the independent variable matrix `x` needs to match the number of rows of the
+weight matrix `beta`. The size of the intercept vector `alpha` must match the number
+of columns of the weight matrix `beta`. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
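+The signatures above can be sketched in a complete softmax regression as
+follows. This is an illustration under stated assumptions rather than a
+canonical model: the dimension names follow the probability mass function
+section ($M$ observations, $K$ predictors, $N$ categories), the priors are
+placeholders, and the array declaration follows the `int[]`-era syntax used
+in this document (newer Stan releases write `array[M] int y`).
+
+```stan
+data {
+  int<lower=1> M;                  // number of observations
+  int<lower=1> K;                  // number of predictors
+  int<lower=2> N;                  // number of outcome categories
+  int<lower=1, upper=N> y[M];      // observed categories
+  matrix[M, K] x;                  // one row of predictors per observation
+}
+parameters {
+  vector[N] alpha;                 // intercepts; size matches columns of beta
+  matrix[K, N] beta;               // weights; rows match columns of x
+}
+model {
+  alpha ~ normal(0, 5);            // illustrative priors only
+  to_vector(beta) ~ normal(0, 5);
+  y ~ categorical_logit_glm(x, alpha, beta);
+  // same density written with the plain categorical likelihood:
+  // for (m in 1:M)
+  //   y[m] ~ categorical_logit(alpha + (x[m] * beta)');
+}
+```
+
+The commented-out loop is the manually written regression that the section
+introduction compares against; it computes the same log density one
+observation at a time.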
+
 ## Ordered Logistic Distribution
 
 ### Probability Mass Function
@@ -339,6 +416,73 @@ eta, and cutpoints c.
 Generate an ordered logistic variate with linear predictor eta and
 cutpoints c; may only be used in transformed data and generated quantities blocks
 
+## Ordered Logistic Generalised Linear Model (Ordinal Regression) {#ordered-logistic-glm}
+
+### Probability Mass Function
+
+If $N,M,K \in \mathbb{N}$ with $N, M > 0$, $K > 2$, $c \in \mathbb{R}^{K-1}$ such that
+$c_k < c_{k+1}$ for $k \in \{1,\ldots,K-2\}$, and $x\in \mathbb{R}^{N\cdot M}, \beta\in \mathbb{R}^M$, then for $y \in \{1,\ldots,K\}^N$, 
+\[\text{OrderedLogisticGLM}(y~|~x,\beta,c) = \\[4pt]
+\prod_{1\leq i \leq N}\text{OrderedLogistic}(y_i~|~x_i\cdot \beta,c) = \\[17pt]
+\prod_{1\leq i \leq N}\left\{ \begin{array}{ll}
+1 - \text{logit}^{-1}(x_i\cdot \beta - c_1)  &  \text{if } y_i = 1, \\[4pt]
+\text{logit}^{-1}(x_i\cdot \beta - c_{y_i-1}) - \text{logit}^{-1}(x_i\cdot \beta - c_{y_i}) & \text{if } 1 < y_i < K, \text{and} \\[4pt]
+\text{logit}^{-1}(x_i\cdot \beta - c_{K-1}) - 0  &  \text{if } y_i = K.
+\end{array} \right. \] The $y_i=K$
+case is written with the redundant subtraction of zero to illustrate
+the parallelism of the cases; the $y_i=1$ and $y_i=K$ edge cases can be
+subsumed into the general definition by setting $c_0 = -\infty$ and
+$c_K = +\infty$ with $\text{logit}^{-1}(-\infty) = 0$ and
+$\text{logit}^{-1}(\infty) = 1$.
+
+### Sampling Statement
+
+`y ~ ` **`ordered_logistic_glm`**`(x, beta, c)`
+
+Increment target log probability density with `ordered_logistic_glm_lpmf(y | x, beta, c)`
+dropping constant additive terms.
+<!-- real; ordered_logistic_glm ~; -->
+\index{{\tt \bfseries ordered\_logistic\_glm }!sampling statement|hyperpage}
+
+### Stan Functions
+
+<!-- real; ordered_logistic_glm_lpmf; (int y | row_vector x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int y \textbar\ row\_vector x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int y | row_vector x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of y, given linear predictors `x * beta`, and cutpoints c. 
+The size of the independent variable row vector `x` needs to match the size of the weight vector `beta`.
+Cutpoints `c` must be ordered.
+
+<!-- real; ordered_logistic_glm_lpmf; (int y | matrix x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int y | matrix x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of y, given linear predictors `x * beta`, and cutpoints c. 
+The same value of the dependent variable `y` is used for all instances.
+The number of columns of the independent variable matrix `x` needs to match the size of the weight vector `beta`.
+Cutpoints `c` must be ordered. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
+<!-- real; ordered_logistic_glm_lpmf; (int[] y | row_vector x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int[] y | row_vector x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of y, given linear predictors `x * beta`, and cutpoints c. 
+The same row vector of independent variables `x` is used for all instances.
+The size of the independent variable row vector `x` needs to match the size of the weight vector `beta`.
+Cutpoints `c` must be ordered.
+
+<!-- real; ordered_logistic_glm_lpmf; (int[] y | matrix x, vector beta, vector c); -->
+\index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int[] y \textbar\ matrix x, vector beta, vector c): real}|hyperpage}
+
+`real` **`ordered_logistic_glm_lpmf`**`(int[] y | matrix x, vector beta, vector c)`<br>\newline
+The log ordered logistic probability mass of y, given linear predictors
+`x * beta`, and cutpoints c. 
+The number of rows of the independent variable matrix `x` needs to match the length of the dependent variable vector `y`.
+The number of columns of the independent variable matrix `x` needs to match the size of the weight vector `beta`.
+Cutpoints `c` must be ordered. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
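+A minimal ordinal regression using the matrix signature above might look like
+the following sketch. The dimension names follow the probability mass function
+section ($N$ observations, $M$ predictors, $K$ categories); the prior is a
+placeholder, and the array declaration follows the `int[]`-era syntax used in
+this document (newer Stan releases write `array[N] int y`). Declaring `c` as
+`ordered` is one way to satisfy the requirement that the cutpoints be ordered.
+
+```stan
+data {
+  int<lower=1> N;                  // number of observations
+  int<lower=1> M;                  // number of predictors
+  int<lower=3> K;                  // number of ordinal categories (K > 2)
+  int<lower=1, upper=K> y[N];      // observed categories
+  matrix[N, M] x;                  // one row of predictors per observation
+}
+parameters {
+  vector[M] beta;                  // weights; size matches columns of x
+  ordered[K - 1] c;                // cutpoints, constrained to be increasing
+}
+model {
+  beta ~ normal(0, 2);             // illustrative prior only
+  y ~ ordered_logistic_glm(x, beta, c);
+  // same density written with the plain ordered logistic likelihood:
+  // y ~ ordered_logistic(x * beta, c);
+}
+```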
+
 ## Ordered Probit Distribution
 
 ### Probability Mass Function
diff --git a/src/functions-reference/unbounded_continuous_distributions.Rmd b/src/functions-reference/unbounded_continuous_distributions.Rmd
@@ -176,6 +176,49 @@ dropping constant additive terms.
 
 ### Stan Functions
 
+<!-- real; normal_id_glm_lpdf; (real y | matrix x, real alpha, vector beta, real sigma); -->
+\index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (real y \textbar\ matrix x, real alpha, vector beta, real sigma): real}|hyperpage}
+
+`real` **`normal_id_glm_lpdf`**`(real y | matrix x, real alpha, vector beta, real sigma)`<br>\newline
+The log normal probability density of y given location `alpha+x*beta`
+and scale `sigma`, where a constant intercept `alpha`, `sigma` and dependent variable value `y` are
+used for all observations. The number of columns of `x` needs to match
+the length of the weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
+<!-- real; normal_id_glm_lpdf; (real y | matrix x, vector alpha, vector beta, real sigma); -->
+\index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (real y \textbar\ matrix x, vector alpha, vector beta, real sigma): real}|hyperpage}
+
+`real` **`normal_id_glm_lpdf`**`(real y | matrix x, vector alpha, vector beta, real sigma)`<br>\newline
+The log normal probability density of y given location `alpha+x*beta`
+and scale `sigma`, where a constant `sigma` and dependent variable value `y` are used for all
+observations and an intercept `alpha` is used that is allowed to vary
+with the different observations. The number of rows of the independent
+variable matrix `x` needs to match the length of the intercept
+vector `alpha` and the number of columns of `x` needs to match
+the length of the weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
+
+<!-- real; normal_id_glm_lpdf; (vector y | row_vector x, real alpha, vector beta, real sigma); -->
+\index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ row\_vector x, real alpha, vector beta, real sigma): real}|hyperpage}
+
+`real` **`normal_id_glm_lpdf`**`(vector y | row_vector x, real alpha, vector beta, real sigma)`<br>\newline
+The log normal probability density of y given location `alpha+x*beta`
+and scale `sigma`, where a constant intercept `alpha`, `sigma` and independent variable values `x` are
+used for all observations. The number of columns of `x` needs to match
+the length of the weight vector `beta`.
+
+<!-- real; normal_id_glm_lpdf; (vector y | row_vector x, vector alpha, vector beta, real sigma); -->
+\index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ row\_vector x, vector alpha, vector beta, real sigma): real}|hyperpage}
+
+`real` **`normal_id_glm_lpdf`**`(vector y | row_vector x, vector alpha, vector beta, real sigma)`<br>\newline
+The log normal probability density of y given location `alpha+x*beta`
+and scale `sigma`, where a constant `sigma` and independent variable values `x` are used for all
+observations and an intercept `alpha` is used that is allowed to vary
+with the different observations. The length of the dependent
+variable vector `y` needs to match the length of the intercept vector `alpha` and the number of columns of `x` needs to match
+the length of the weight vector `beta`.
+
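+The broadcasting signatures above can be sketched as follows for the case of a
+single `row_vector x` shared by every element of `y`, for example repeated
+measurements taken at one predictor setting. The names `N` and `K` and the
+priors are assumptions made for the example, not part of the function
+reference.
+
+```stan
+data {
+  int<lower=1> N;                  // number of observations (assumed name)
+  int<lower=1> K;                  // number of predictors (assumed name)
+  vector[N] y;                     // observed outcomes
+  row_vector[K] x;                 // one predictor row shared by all observations
+}
+parameters {
+  real alpha;                      // constant intercept
+  vector[K] beta;                  // weights; length matches columns of x
+  real<lower=0> sigma;             // constant scale
+}
+model {
+  alpha ~ normal(0, 1);            // illustrative priors only
+  beta ~ normal(0, 1);
+  sigma ~ normal(0, 1);
+  // x is broadcast: every y[n] has location alpha + x * beta
+  y ~ normal_id_glm(x, alpha, beta, sigma);
+  // same density written without the GLM primitive:
+  // y ~ normal(alpha + x * beta, sigma);
+}
+```
+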
 <!-- real; normal_id_glm_lpdf; (vector y | matrix x, real alpha, vector beta, real sigma); -->
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ matrix x, real alpha, vector beta, real sigma): real}|hyperpage}
 
@@ -186,6 +229,7 @@ used for all observations. The number of rows of the independent
 variable matrix `x` needs to match the length of the dependent
 variable vector `y` and the number of columns of `x` needs to match
 the length of the weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 <!-- real; normal_id_glm_lpdf; (vector y | matrix x, vector alpha, vector beta, real sigma); -->
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ matrix x, vector alpha, vector beta, real sigma): real}|hyperpage}
@@ -196,8 +240,9 @@ and scale `sigma`, where a constant `sigma` is used for all
 observations and an intercept `alpha` is used that is allowed to vary
 with the different observations. The number of rows of the independent
 variable matrix `x` needs to match the length of the dependent
-variable vector `y` and the number of columns of `x` needs to match
+variable vector `y` and the length of the intercept vector `alpha`. The number of columns of `x` needs to match
 the length of the weight vector `beta`.
+If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 ## Exponentially Modified Normal Distribution
 
diff --git a/src/functions-reference/unbounded_discrete_distributions.Rmd b/src/functions-reference/unbounded_discrete_distributions.Rmd