Commit 302493c

Updated driver.rst, overview.rst, and covariance.rst files

1 parent a8f1e76

3 files changed: +49 −49 lines

doc/OnlineDocs/explanation/analysis/parmest/covariance.rst

Lines changed: 16 additions & 13 deletions
@@ -12,17 +12,17 @@ following methods which have been implemented in parmest.
 
 1. Reduced Hessian Method
 
-   When the objective function is the sum of squared errors (SSE) defined as
-   :math:`\text{SSE} = \sum_{i = 1}^{n}
-   \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)^2`,
-   the covariance matrix is:
+   When the objective function is the sum of squared errors (SSE) for homogeneous data, defined as
+   :math:`\text{SSE} = \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};
+   \boldsymbol{\theta})\right)^\text{T} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};
+   \boldsymbol{\theta})\right)`, the covariance matrix is:
 
    .. math::
       \boldsymbol{V}_{\boldsymbol{\theta}} = 2 \sigma^2 \left(\frac{\partial^2 \text{SSE}}
      {\partial \boldsymbol{\theta}^2}\right)^{-1}_{\boldsymbol{\theta}
      = \hat{\boldsymbol{\theta}}}
 
-   Similarly, when the objective function is the weighted SSE (WSSE) defined as
+   Similarly, when the objective function is the weighted SSE (WSSE) for heterogeneous data, defined as
    :math:`\text{WSSE} = \frac{1}{2} \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} -
    \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)^\text{T} \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1}
    \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)`,
@@ -34,12 +34,15 @@ following methods which have been implemented in parmest.
      = \hat{\boldsymbol{\theta}}}
 
    Where :math:`\boldsymbol{V}_{\boldsymbol{\theta}}` is the covariance matrix of the estimated
-   parameters :math:`\hat{\boldsymbol{\theta}}`, :math:`\boldsymbol{y}` are observations of the measured variables,
-   :math:`n` is the number of experiments, :math:`\boldsymbol{\Sigma}_{\boldsymbol{y}}` is the measurement error
-   covariance matrix, and :math:`\sigma^2` is the variance of the measurement error. When the standard deviation of
-   the measurement error is not supplied by the user, parmest approximates :math:`\sigma^2` as:
-   :math:`\hat{\sigma}^2 = \frac{1}{n-l} \sum_{i=1}^{n} e_i^2`, where :math:`l` is the number of fitted parameters,
-   and :math:`e_i` is the residual between the data and model for experiment :math:`i`.
+   parameters :math:`\hat{\boldsymbol{\theta}} \in \mathbb{R}^p`, :math:`\boldsymbol{y}_{i} \in \mathbb{R}^m` are
+   observations of the measured output variables, :math:`\boldsymbol{f}` is the model function,
+   :math:`\boldsymbol{x}_{i} \in \mathbb{R}^{q}` are the input variables, :math:`n` is the number of experiments,
+   :math:`\boldsymbol{\Sigma}_{\boldsymbol{y}}` is the measurement error covariance matrix, and :math:`\sigma^2`
+   is the variance of the measurement error. When the standard deviation of the measurement error is not supplied
+   by the user, parmest approximates :math:`\sigma^2` as
+   :math:`\hat{\sigma}^2 = \frac{1}{n-p} \sum_{i=1}^{n} \boldsymbol{\varepsilon}_{i}(\boldsymbol{\theta})^{\text{T}}
+   \boldsymbol{\varepsilon}_{i}(\boldsymbol{\theta})`, where :math:`\boldsymbol{\varepsilon}_{i} \in \mathbb{R}^m`
+   are the residuals between the data and model for experiment :math:`i`.
 
    In parmest, this method computes the inverse of the Hessian by scaling the
    objective function (SSE or WSSE) with a constant probability factor, :math:`\frac{1}{n}`.
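Note: the approximation above is easy to reproduce outside parmest. The following is a minimal NumPy sketch, not parmest code; the residual array and the Hessian are assumed to come from an already-fitted model, and the dimensions are placeholders:

    import numpy as np

    n, m, p = 10, 2, 3              # experiments, measured outputs, parameters (placeholders)
    eps = np.random.randn(n, m)     # residuals y_i - f(x_i; theta_hat), assumed given
    H = np.eye(p)                   # Hessian of SSE at theta_hat, assumed given

    # sigma^2 approximation used when no measurement standard deviation is supplied:
    # sigma_hat^2 = (1 / (n - p)) * sum_i eps_i^T eps_i
    sigma2_hat = (eps * eps).sum() / (n - p)

    # Reduced Hessian covariance: V_theta = 2 * sigma^2 * H^{-1}
    V_theta = 2.0 * sigma2_hat * np.linalg.inv(H)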
@@ -48,9 +51,9 @@ following methods which have been implemented in parmest.
 
    In this method, the covariance matrix, :math:`\boldsymbol{V}_{\boldsymbol{\theta}}`, is
    computed by differentiating the Hessian,
-   :math:`\frac{\partial^2 \text{SSE}}{\partial \boldsymbol{\theta} \partial \boldsymbol{\theta}}`
+   :math:`\frac{\partial^2 \text{SSE}}{\partial \boldsymbol{\theta}^2}`
    or
-   :math:`\frac{\partial^2 \text{WSSE}}{\partial \boldsymbol{\theta} \partial \boldsymbol{\theta}}`, and
+   :math:`\frac{\partial^2 \text{WSSE}}{\partial \boldsymbol{\theta}^2}`, and
    applying the Gauss-Newton approximation, which results in:
 
    .. math::
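The body of this math directive falls outside the hunk. For orientation only: with :math:`\boldsymbol{J}` the Jacobian of the stacked residuals, the standard Gauss-Newton approximation of the SSE Hessian is :math:`2\boldsymbol{J}^\text{T}\boldsymbol{J}`, under which the covariance above reduces to :math:`\sigma^2 (\boldsymbol{J}^\text{T}\boldsymbol{J})^{-1}`. A sketch under that assumption (not the documentation's exact formula):

    import numpy as np

    def gauss_newton_covariance(J, sigma2_hat):
        # Gauss-Newton: H ~= 2 J^T J, so V_theta = 2 sigma^2 H^{-1} = sigma^2 (J^T J)^{-1}.
        # J is the (n*m) x p Jacobian of the stacked residuals at theta_hat (assumed given).
        return sigma2_hat * np.linalg.inv(J.T @ J)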

doc/OnlineDocs/explanation/analysis/parmest/driver.rst

Lines changed: 5 additions & 4 deletions
@@ -40,9 +40,10 @@ Where :math:`y` is the observation of the measured variable, :math:`t` is the time,
 is the asymptote, and :math:`\theta_2` is the rate constant.
 
 The experimental data is given in the table below:
-.. list-table:: Experimental Data
+
+.. list-table:: Data
    :header-rows: 1
-   :widths: 20 20
+   :widths: 30 30
 
    * - hour
      - y
@@ -59,8 +60,8 @@ The experimental data is given in the table below:
    * - 7
      - 19.8
 
-To use parmest to estimate :math:`\theta_1` and :math:`\theta_2` from the data, the following
-detailed steps should be followed:
+To use parmest to estimate :math:`\theta_1` and :math:`\theta_2` from the data, we provide the following
+detailed steps:
 
 Step 0: Import Pyomo, parmest, Experiment Class, and Pandas
 -----------------------------------------------------------
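The detailed steps themselves are outside this diff. For orientation, here is a compact, hypothetical sketch of the kind of script those steps build, assuming parmest's Experiment-based interface and an NLP solver such as Ipopt on the path. The class name DecayExperiment, the initial guesses, and the single data row are illustrative, not the documentation's own code:

    import pandas as pd
    import pyomo.environ as pyo
    import pyomo.contrib.parmest.parmest as parmest
    from pyomo.contrib.parmest.experiment import Experiment

    # Only the last table row (7, 19.8) is visible in this hunk; fill in the
    # remaining (hour, y) rows from the table in driver.rst.
    data = pd.DataFrame([[7, 19.8]], columns=["hour", "y"])

    class DecayExperiment(Experiment):
        """One experiment per (hour, y) row: y = theta1 * (1 - exp(-theta2 * t))."""

        def __init__(self, hour, y):
            self.hour, self.y = hour, y

        def get_labeled_model(self):
            m = pyo.ConcreteModel()
            m.theta1 = pyo.Var(initialize=15.0)  # asymptote (illustrative guess)
            m.theta2 = pyo.Var(initialize=0.5)   # rate constant (illustrative guess)
            m.response = pyo.Var(initialize=self.y)
            m.model_eq = pyo.Constraint(
                expr=m.response == m.theta1 * (1 - pyo.exp(-m.theta2 * self.hour))
            )
            # Labels parmest uses to build the objective and pick the unknowns
            m.experiment_outputs = pyo.Suffix(direction=pyo.Suffix.LOCAL)
            m.experiment_outputs[m.response] = self.y
            m.unknown_parameters = pyo.Suffix(direction=pyo.Suffix.LOCAL)
            m.unknown_parameters.update(
                (v, pyo.value(v)) for v in [m.theta1, m.theta2]
            )
            return m

    exp_list = [DecayExperiment(row.hour, row.y) for row in data.itertuples()]
    pest = parmest.Estimator(exp_list, obj_function="SSE")
    obj_value, theta_hat = pest.theta_est()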

doc/OnlineDocs/explanation/analysis/parmest/overview.rst

Lines changed: 28 additions & 32 deletions
@@ -10,7 +10,8 @@ for design optimization.
 
 Functionality in parmest includes:
 
-* Model based parameter estimation using experimental data
+* Model-based parameter estimation using experimental data
+* Covariance matrix estimation
 * Bootstrap resampling for parameter estimation
 * Confidence regions based on single or multi-variate distributions
 * Likelihood ratio
* Likelihood ratio
@@ -21,61 +22,56 @@ Background
 ----------
 
 The goal of parameter estimation is to estimate values for
-a vector, :math:`{\theta}`, to use in the functional form
+a vector, :math:`\boldsymbol{\theta}`, to use in the functional form
 
 .. math::
 
-   y = g(x; \theta)
-
-where :math:`x` is a vector containing measured data, typically in high
-dimension, :math:`{\theta}` is a vector of values to estimate, in much
-lower dimension, and the response vectors are given as :math:`y_{i},
-i=1,\ldots,m` with :math:`m` also much smaller than the dimension of
-:math:`x`. This is done by collecting :math:`S` data points, which are
-:math:`{\tilde{x}},{\tilde{y}}` pairs and then finding :math:`{\theta}`
-values that minimize some function of the deviation between the values
-of :math:`{\tilde{y}}` that are measured and the values of
-:math:`g({\tilde{x}};{\theta})` for each corresponding
-:math:`{\tilde{x}}`, which is a subvector of the vector :math:`x`. Note
-that for most experiments, only small parts of :math:`x` will change
-from one experiment to the next.
+   \boldsymbol{y}_i = \boldsymbol{f}\left(\boldsymbol{x}_{i}, \boldsymbol{\theta}\right) +
+   \boldsymbol{\varepsilon}_i \quad \forall \; i \in \left\{1, \ldots, n\right\}
+
+where :math:`\boldsymbol{y}_{i} \in \mathbb{R}^m` are observations of the measured or output variables,
+:math:`\boldsymbol{f}` is the model function, :math:`\boldsymbol{x}_{i} \in \mathbb{R}^{q}` are the decision
+or input variables, :math:`\boldsymbol{\theta} \in \mathbb{R}^p` are the model parameters,
+:math:`\boldsymbol{\varepsilon}_{i} \in \mathbb{R}^m` are measurement errors, and :math:`n` is the number of
+experiments.
 
 The following least squares objective can be used to estimate parameter
 values assuming Gaussian independent and identically distributed measurement
-errors, where data points are indexed by :math:`s=1,\ldots,S`
+errors:
 
 .. math::
 
-   \min_{{\theta}} Q({\theta};{\tilde{x}}, {\tilde{y}}) \equiv \sum_{s=1}^{S}q_{s}({\theta};{\tilde{x}}_{s}, {\tilde{y}}_{s}) \;\;
+   \min_{\boldsymbol{\theta}} \, g(\boldsymbol{x}, \boldsymbol{y};\boldsymbol{\theta}) \;\;
 
-where :math:`q_{s}({\theta};{\tilde{x}}_{s}, {\tilde{y}}_{s})` can be:
+where :math:`g(\boldsymbol{x}, \boldsymbol{y};\boldsymbol{\theta})` can be:
 
 1. Sum of squared errors
 
    .. math::
 
-      q_{s}({\theta};{\tilde{x}}_{s}, {\tilde{y}}_{s}) =
-      \sum_{i=1}^{m}\left({\tilde{y}}_{s,i} - g_{i}({\tilde{x}}_{s};{\theta})\right)^{2}
+      g(\boldsymbol{x}, \boldsymbol{y};\boldsymbol{\theta}) =
+      \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})
+      \right)^\text{T} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)
 
 2. Weighted sum of squared errors
 
    .. math::
 
-      q_{s}({\theta};{\tilde{x}}_{s}, {\tilde{y}}_{s}) =
-      \sum_{i=1}^{m}\left(\frac{{\tilde{y}}_{s,i} - g_{i}({\tilde{x}}_{s};{\theta})}{w_i}\right)^{2}
+      g(\boldsymbol{x}, \boldsymbol{y};\boldsymbol{\theta}) =
+      \frac{1}{2} \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})
+      \right)^\text{T} \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1} \left(\boldsymbol{y}_{i} -
+      \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)
 
-i.e., the contribution of sample :math:`s` to :math:`Q`, where :math:`w
-\in \Re^{m}` is a vector containing the standard deviation of the measurement
-errors of :math:`y`. Custom objectives can also be defined for parameter estimation.
+where :math:`\boldsymbol{\Sigma}_{\boldsymbol{y}}` is the measurement error covariance matrix containing the
+variances of the measurement errors of :math:`\boldsymbol{y}`. Custom objectives can also be defined
+for parameter estimation.
 
 In the applications of interest to us, the function :math:`g(\cdot)` is
 usually defined as an optimization problem with a large number of
 (perhaps constrained) optimization variables, a subset of which are
-fixed at values :math:`{\tilde{x}}` when the optimization is performed.
-In other applications, the values of :math:`{\theta}` are fixed
+fixed at values :math:`\boldsymbol{x}` when the optimization is performed.
+In other applications, the values of :math:`\boldsymbol{\theta}` are fixed
 parameter values, but for the problem formulation above, the values of
-:math:`{\theta}` are the primary optimization variables. Note that in
+:math:`\boldsymbol{\theta}` are the primary optimization variables. Note that in
 general, the function :math:`g(\cdot)` will have a large set of
-parameters that are not included in :math:`{\theta}`. Often, the
-:math:`y_{is}` will be vectors themselves, perhaps indexed by time with
-index sets that vary with :math:`s`.
+parameters that are not included in :math:`\boldsymbol{\theta}`.
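A minimal NumPy illustration of the two objectives above, assuming the residuals y_i - f(x_i; theta) have already been stacked into an (n, m) array. The helper names sse and wsse and the example numbers are illustrative, not parmest functions:

    import numpy as np

    def sse(resid):
        # Sum of squared errors: sum_i r_i^T r_i for residuals r_i = y_i - f(x_i; theta)
        return float((resid * resid).sum())

    def wsse(resid, sigma_y):
        # Weighted SSE: (1/2) * sum_i r_i^T Sigma_y^{-1} r_i
        s_inv = np.linalg.inv(sigma_y)
        return 0.5 * float(np.einsum("im,mk,ik->", resid, s_inv, resid))

    # Example: n = 4 experiments, m = 2 measured outputs
    resid = np.array([[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1], [0.2, 0.0]])
    sigma_y = np.diag([0.04, 0.09])  # measurement error covariance (illustrative)
    print(sse(resid), wsse(resid, sigma_y))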
