Update examples

Alexander März · Alexander März · commit 186a34203a6e · 2023-08-11T15:18:33.000+02:00
diff --git a/docs/dgbm.md b/docs/dgbm.md
@@ -18,7 +18,7 @@ Probabilistic forecasts are predictions in the form of a probability distributio
 
 ### Univariate Targets
 
-In its original formulation, GAMLSS assume a univariate response to follow a distribution $\mathcal{D}$ that depends on up to four parameters, i.e., $y_{i} \stackrel{ind}{\sim} \mathcal{D}(\mu_{i}, \sigma^{2}_{i}, \nu_{i}, \tau_{i}), i=1,\ldots,n$, where $\mu_{i}$ and $\sigma^{2}_{i}$ are often location and scale parameters, respectively, while $\nu_{i}$ and $\tau_{i}$ correspond to shape parameters such as skewness and kurtosis. Hence, the framework allows to model not only the mean (or location) but all parameters as functions of explanatory variables. It is important to note that distributional modelling implies that observations are independent, but not necessarily identical realizations $y \stackrel{ind}{\sim} \mathcal{D}\big(\mathbf{\theta}(\mathbf{x})\big)$, since all distributional parameters $\mathbf{\theta}(\mathbf{x})$ are related to and allowed to change with covariates. In contrast to Generalized Linear (GLM) and Generalized Additive Models (GAM), the assumption of the response distribution belonging to an exponential family is relaxed in GAMLSS and replaced by a more general class of distributions, including highly skewed and/or kurtotic continuous, discrete and mixed discrete, as well as zero-inflated distributions. While the original formulation of GAMLSS in Rigby and Stasinopoulos (2005) suggests that any distribution can be described by location, scale and shape parameters, it is not necessarily true that the observed data distribution can actually be characterized by all of these parameters. Hence, we follow Klein et al. (2015) and use the term distributional modelling and GAMLSS interchangeably.
+In its original formulation, GAMLSS assume a univariate response to follow a distribution $\mathcal{D}$ that depends on up to four parameters, i.e., $y_{i} \stackrel{ind}{\sim} \mathcal{D}(\mu_{i}, \sigma^{2}_{i}, \nu_{i}, \tau_{i}), i=1,\ldots,N$, where $\mu_{i}$ and $\sigma^{2}_{i}$ are often location and scale parameters, respectively, while $\nu_{i}$ and $\tau_{i}$ correspond to shape parameters such as skewness and kurtosis. Hence, the framework allows modelling not only the mean (or location) but all parameters as functions of explanatory variables. It is important to note that distributional modelling implies that observations are independent, but not necessarily identically distributed, realizations $y \stackrel{ind}{\sim} \mathcal{D}\big(\mathbf{\theta}(\mathbf{x})\big)$, since all distributional parameters $\mathbf{\theta}(\mathbf{x})$ are related to and allowed to change with covariates. In contrast to Generalized Linear (GLM) and Generalized Additive Models (GAM), the assumption of the response distribution belonging to an exponential family is relaxed in GAMLSS and replaced by a more general class of distributions, including highly skewed and/or kurtotic continuous, discrete and mixed discrete, as well as zero-inflated distributions. While the original formulation of GAMLSS in Rigby and Stasinopoulos (2005) suggests that any distribution can be described by location, scale and shape parameters, it is not necessarily true that the observed data distribution can actually be characterized by all of these parameters. Hence, we follow Klein et al. (2015) and use the terms distributional modelling and GAMLSS interchangeably.
 
 From a frequentist point of view, distributional modelling can be formulated as follows
 
diff --git a/docs/examples/Gaussian_Regression.ipynb b/docs/examples/Gaussian_Regression.ipynb
@@ -28,7 +28,7 @@
     "\t\\vdots \\\\                        \n",
     "\th_{K}\\bigl(\\theta_{iK}(x_{i})\\bigr) = \\eta_{iK} \n",
     "\\end{pmatrix}\n",
-    "\\quad  i=1, \\ldots, N. \n",
+    ",\\quad i=1, \\ldots, N. \n",
     "\\end{equation}\n",
     "\n",
     "where $h_{k}(\\cdot)$ transforms each distributional parameter to the corresponding parameter scale. For the univariate Normal case, we can specify the above as $y_{i} \\stackrel{ind}{\\sim} \\mathcal{N}\\bigl(\\mu_{i}(x_{i}), \\sigma_{i}(x_{i})\\bigr)$. Since $\\mu_{i}(\\cdot) \\in \\mathbb{R}$ and since the standard-deviation cannot be negative, $h_{k}(\\cdot)$ is applied to $\\sigma_{i}(\\cdot)$ only. Typical choices are the exponential or the softplus function."