some fixes

t4c1 · t4c1 · commit 33f3b01fcebe · 2020-03-06T09:22:42.000+01:00
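
Note on the `CategoricalLogitGLM` formula change below: the corrected form `alpha + x_i * beta` multiplies a row of `x` (length K) by the K-by-N weight matrix `beta`, which yields the N-vector of log-odds the text describes. A minimal pure-Python sketch of that computation follows, as a shape check only. The function names and the list-of-lists layout are hypothetical and chosen for illustration; this is not Stan's implementation, and it keeps all terms rather than dropping constants.

```python
import math

def softmax(v):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_logit_glm_lpmf(y, x, alpha, beta):
    """Illustrative log mass (hypothetical helper, not the Stan function).

    y:     list of M outcomes, each in 1..N
    x:     M rows, each a list of K covariates
    alpha: list of N intercepts
    beta:  K rows, each a list of N weights
    """
    n_cat = len(alpha)
    n_cov = len(beta)
    lp = 0.0
    for y_i, x_i in zip(y, x):
        # Log-odds for observation i: alpha + x_i * beta (row vector times K x N matrix).
        eta = [alpha[n] + sum(x_i[k] * beta[k][n] for k in range(n_cov))
               for n in range(n_cat)]
        lp += math.log(softmax(eta)[y_i - 1])
    return lp
```

With all-zero `alpha` and `beta`, every category has probability 1/N, so each observation contributes `log(1/N)` regardless of `x`.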
diff --git a/src/functions-reference/binary_distributions.Rmd b/src/functions-reference/binary_distributions.Rmd
@@ -147,7 +147,7 @@ dropping constant additive terms.
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int y | matrix x, real alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where a constant intercept `alpha` and dependant variable value `y` are used
+`inv_logit(alpha + x * beta)`, where a constant intercept `alpha` and dependent variable value `y` are used
 for all observations. The number of columns of `x` needs to match the length of the
 weight vector `beta`.
 If `x` and `y` are data (not parameters) this function can be executed on a GPU.
@@ -157,7 +157,7 @@ If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int y | matrix x, vector alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where an intercept `alpha` is used that is
+`inv_logit(alpha + x * beta)`, where an intercept `alpha` is used that is
 allowed to vary with the different observations. The dependant variable 
 value `y` is used for all observations. 
 The number of rows of `x` must match the length of `alpha` and 
@@ -169,18 +169,18 @@ If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int[] y | row_vector x, real alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where a constant intercept `alpha` and 
-same indepependent variables values `x` are used for all observations.
+`inv_logit(alpha + x * beta)`, where a constant intercept `alpha` and 
+the same independent variable values `x` are used for all observations.
 The number of columns of `x` needs to match the length of the weight vector `beta`.
 
 <!-- real; bernoulli_logit_glm_lpmf; (int[] y | row_vector x, vector alpha, vector beta); -->
 \index{{\tt \bfseries bernoulli\_logit\_glm\_lpmf  }!{\tt (int[] y \textbar\ row\_vector x, vector alpha, vector beta): real}|hyperpage}
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int[] y | row_vector x, vector alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where an intercept `alpha` is used that is
+`inv_logit(alpha + x * beta)`, where an intercept `alpha` is used that is
 allowed to vary with the different observations.
-Same indepependent variables values `x` are used for all observations. 
+The same independent variable values `x` are used for all observations. 
 The length of `y` must match the length of `alpha` and 
 the number of columns of `x` needs to match the length of the weight vector `beta`.
 
@@ -190,7 +190,7 @@ the number of columns of `x` needs to match the length of the weight vector `bet
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int[] y | matrix x, real alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where a constant intercept `alpha` is used
+`inv_logit(alpha + x * beta)`, where a constant intercept `alpha` is used
 for all observations. The number of rows of the independent variable
 matrix `x` needs to match the length of the dependent variable vector
 `y` and the number of columns of `x` needs to match the length of the
@@ -202,7 +202,7 @@ If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 `real` **`bernoulli_logit_glm_lpmf`**`(int[] y | matrix x, vector alpha, vector beta)`<br>\newline
 The log Bernoulli probability mass of y given chance of success
-`inv_logit(alpha+x*beta)`, where an intercept `alpha` is used that is
+`inv_logit(alpha + x * beta)`, where an intercept `alpha` is used that is
 allowed to vary with the different observations. The number of rows of
 the independent variable matrix `x` needs to match the length of the
 dependent variable vector `y` and `alpha` and the number of columns of
diff --git a/src/functions-reference/bounded_discrete_distributions.Rmd b/src/functions-reference/bounded_discrete_distributions.Rmd
@@ -307,12 +307,14 @@ implementation of softmax regression than a manually written
 regression in terms of a Categorical likelihood and matrix
 multiplication.
 
+Note that the implementation does not put any restrictions on the coefficient matrix $\beta$. It is up to the user to use a reference category, a suitable prior or some other means of avoiding non-identifiability. See Multi-logit in the [Stan User's Guide](https://mc-stan.org/docs/2_21/stan-users-guide/multi-logit-section.html).
+
 ### Probability Mass Functions
 
 If $N,M,K \in \mathbb{N}$, $N,M,K > 0$, and if $x\in \mathbb{R}^{M\cdot K}, \alpha \in \mathbb{R}^N, \beta\in \mathbb{R}^{K\cdot N}$, then for $y \in \{1,\ldots,N\}^M$, 
 \[ \text{CategoricalLogitGLM}(y~|~x,\alpha,\beta) = \\[5pt]
-\prod_{1\leq i \leq M}\text{CategoricalLogit}(y_i~|~\alpha+\beta_i\cdot x_i) = \\[15pt] 
-\prod_{1\leq i \leq M}\text{Categorical}(y_i~|~softmax(\alpha+\beta_i\cdot x_i)). \] 
+\prod_{1\leq i \leq M}\text{CategoricalLogit}(y_i~|~\alpha+x_i\cdot\beta) = \\[15pt] 
+\prod_{1\leq i \leq M}\text{Categorical}(y_i~|~\text{softmax}(\alpha+x_i\cdot\beta)). \] 
 See section [softmax](#softmax) for the definition of the softmax function.
 
 ### Sampling Statement
@@ -333,7 +335,7 @@ dropping constant additive terms.
 `real` **`categorical_logit_glm_lpmf`**`(int y | row_vector x, vector alpha, matrix beta)`<br>\newline
 The log categorical probability mass function with outcome `y` in
 $1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
-The size of independant variable row vector `x` needs to match the number of rows of the
+The size of independent variable row vector `x` needs to match the number of rows of the
 weight matrix `beta`. The size of intercept vector `alpha` must match number 
 of columns of the weight matrix `beta`.
 
@@ -344,7 +346,7 @@ of columns of the weight matrix `beta`.
 The log categorical probability mass function with outcomes `y` in
 $1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
 Same vector of intercepts `alpha` and same dependant variable value `y` are used for all instances.
-The number of columns of independant variable `x` needs to match the number of rows of the
+The number of columns of independent variable `x` needs to match the number of rows of the
 weight matrix `beta`. The size of intercept vector `alpha` must match number 
 of columns of the weight matrix `beta`. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
@@ -354,8 +356,8 @@ of columns of the weight matrix `beta`. If `x` and `y` are data (not parameters)
 `real` **`categorical_logit_glm_lpmf`**`(int[] y | row_vector x, vector alpha, matrix beta)`<br>\newline
 The log categorical probability mass function with outcomes `y` in
 $1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
-Same vector of intercepts `alpha` and same row vector of independant variables `x` are used for all instances.
-The size of independant variable matrix `x` needs to match the number of rows of the
+Same vector of intercepts `alpha` and same row vector of independent variables `x` are used for all instances.
+The size of independent variable row vector `x` needs to match the number of rows of the
 weight vector `beta`. The size of intercept vector `alpha` must match number 
 of columns of the weight vector `beta`.
 
@@ -368,7 +370,7 @@ $1:N$ given $N$-vector of log-odds of outcomes `alpha + x * beta`.
 Same vector of intercepts `alpha` is used for all instances.
 The number of rows of the independent variable
 matrix `x` needs to match the length of the dependent variable vector
-`y`. The number of columns of independant variable `x` needs to match the number of rows of the
+`y`. The number of columns of independent variable `x` needs to match the number of rows of the
 weight matrix `beta`. The size of intercept vector `alpha` must match number 
 of columns of the weight matrix `beta`. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
@@ -451,25 +453,25 @@ dropping constant additive terms.
 
 `real` **`ordered_logistic_glm_lpmf`**`(int y | row_vector x, vector beta, vector c)`<br>\newline
 The log ordered logistic probability mass of y, given linear predictors `x * beta`, and cutpoints c. 
-The size of independant variable row vector `x` needs to match the size of the weight vector `beta`. 
+The size of independent variable row vector `x` needs to match the size of the weight vector `beta`. 
 Cutpoints `c` must be ordered.
 
 <!-- real; ordered_logistic_glm_lpmf; (int y | matrix x, vector beta, vector c); -->
 \index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int y \textbar\ matrix x, vector beta, vector c): real}|hyperpage}
 
 `real` **`ordered_logistic_glm_lpmf`**`(int y | matrix x, vector beta, vector c)`<br>\newline
 The log ordered logistic probability mass of y, given linear predictors `x * beta`, and cutpoints c. 
-Same value of independant variable `y` is used for all instances.
-The number of columns of independant variable row vector `x` needs to match the size of the weight vector `beta`.
+Same value of dependent variable `y` is used for all instances.
+The number of columns of independent variable matrix `x` needs to match the size of the weight vector `beta`.
 Cutpoints `c` must be ordered. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 <!-- real; ordered_logistic_glm_lpmf; (int[] y | row_vector x, vector beta, vector c); -->
 \index{{\tt \bfseries ordered\_logistic\_glm\_lpmf }!{\tt (int[] y \textbar\ row\_vector x, vector beta, vector c): real}|hyperpage}
 
 `real` **`ordered_logistic_glm_lpmf`**`(int[] y | row_vector x, vector beta, vector c)`<br>\newline
 The log ordered logistic probability mass of y, given linear predictors `x * beta`, and cutpoints c. 
-Same row vector of independant variables `x` is used for all instances.
-The size of independant variable row vector `x` needs to match the size of the weight vector `beta`. 
+Same row vector of independent variables `x` is used for all instances.
+The size of independent variable row vector `x` needs to match the size of the weight vector `beta`. 
 Cutpoints `c` must be ordered.
 
 <!-- real; ordered_logistic_glm_lpmf; (int[] y | matrix x, vector beta, vector c); -->
@@ -479,7 +481,7 @@ Cutpoints `c` must be ordered.
 The log ordered logistic probability mass of y, given linear predictors
 `x * beta`, and cutpoints c. 
 The number of rows of the independent variable matrix `x` needs to match the length of the dependent variable vector `y`.
-The number of columns of independant variable row vector `x` needs to match the size of the weight vector `beta`.
+The number of columns of independent variable matrix `x` needs to match the size of the weight vector `beta`.
 Cutpoints `c` must be ordered. If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 
 
diff --git a/src/functions-reference/unbounded_continuous_distributions.Rmd b/src/functions-reference/unbounded_continuous_distributions.Rmd
@@ -180,7 +180,7 @@ dropping constant additive terms.
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (real y \textbar\ matrix x, real alpha, vector beta, real sigma): real}|hyperpage}
 
 `real` **`normal_id_glm_lpdf`**`(real y | matrix x, real alpha, vector beta, real sigma)`<br>\newline
-The log normal probability density of y given location `alpha+x*beta`
+The log normal probability density of y given location `alpha + x * beta`
 and scale `sigma`, where a constant intercept `alpha`, `sigma` and dependent variable value `y` are
 used for all observations. The number of columns of `x` needs to match
 the length of the weight vector `beta`.
@@ -190,7 +190,7 @@ If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (real y \textbar\ matrix x, vector alpha, vector beta, real sigma): real}|hyperpage}
 
 `real` **`normal_id_glm_lpdf`**`(real y | matrix x, vector alpha, vector beta, real sigma)`<br>\newline
-The log normal probability density of y given location `alpha+x*beta`
+The log normal probability density of y given location `alpha + x * beta`
 and scale `sigma`, where a constant `sigma` and dependent variable value`y` are used for all
 observations and an intercept `alpha` is used that is allowed to vary
 with the different observations. The number of rows of the independent
@@ -203,7 +203,7 @@ If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ row\_vector x, real alpha, vector beta, real sigma): real}|hyperpage}
 
 `real` **`normal_id_glm_lpdf`**`(vector y | row_vector x, real alpha, vector beta, real sigma)`<br>\newline
-The log normal probability density of y given location `alpha+x*beta`
+The log normal probability density of y given location `alpha + x * beta`
 and scale `sigma`, where a constant intercept `alpha`, `sigma` and independent variable values `x` are
 used for all observations. The number of columns of `x` needs to match
 the length of the weight vector `beta`.
@@ -212,7 +212,7 @@ the length of the weight vector `beta`.
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ row\_vector x, vector alpha, vector beta, real sigma): real}|hyperpage}
 
 `real` **`normal_id_glm_lpdf`**`(vector y | row_vector x, vector alpha, vector beta, real sigma)`<br>\newline
-The log normal probability density of y given location `alpha+x*beta`
+The log normal probability density of y given location `alpha + x * beta`
 and scale `sigma`, where a constant `sigma` and independent variable values `x` are used for all
 observations and an intercept `alpha` is used that is allowed to vary
 with the different observations. The length of the dependent
@@ -223,7 +223,7 @@ the length of the weight vector `beta`.
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ matrix x, real alpha, vector beta, real sigma): real}|hyperpage}
 
 `real` **`normal_id_glm_lpdf`**`(vector y | matrix x, real alpha, vector beta, real sigma)`<br>\newline
-The log normal probability density of y given location `alpha+x*beta`
+The log normal probability density of y given location `alpha + x * beta`
 and scale `sigma`, where a constant intercept `alpha` and `sigma` is
 used for all observations. The number of rows of the independent
 variable matrix `x` needs to match the length of the dependent
@@ -235,7 +235,7 @@ If `x` and `y` are data (not parameters) this function can be executed on a GPU.
 \index{{\tt \bfseries normal\_id\_glm\_lpdf }!{\tt (vector y \textbar\ matrix x, vector alpha, vector beta, real sigma): real}|hyperpage}
 
 `real` **`normal_id_glm_lpdf`**`(vector y | matrix x, vector alpha, vector beta, real sigma)`<br>\newline
-The log normal probability density of y given location `alpha+x*beta`
+The log normal probability density of y given location `alpha + x * beta`
 and scale `sigma`, where a constant `sigma` is used for all
 observations and an intercept `alpha` is used that is allowed to vary
 with the different observations. The number of rows of the independent
diff --git a/src/functions-reference/unbounded_discrete_distributions.Rmd b/src/functions-reference/unbounded_discrete_distributions.Rmd