
Commit 93f28b3

Updated covariance.rst and Xinhong comment on parmest.py

1 parent b8df9e1

File tree

2 files changed (+134, -40 lines)
covariance.rst

Lines changed: 85 additions & 13 deletions
@@ -1,16 +1,88 @@
 Covariance Matrix Estimation
 =================================
 
-If the optional argument ``calc_cov=True`` is specified for :meth:`~pyomo.contrib.parmest.parmest.Estimator.theta_est`,
-parmest will calculate the covariance matrix :math:`V_{\theta}` as follows:
-
-.. math::
-   V_{\theta} = 2 \sigma^2 H^{-1}
-
-This formula assumes all measurement errors are independent and identically distributed with
-variance :math:`\sigma^2`. :math:`H^{-1}` is the inverse of the Hessian matrix for an unweighted
-sum of least squares problem. Currently, the covariance approximation is only valid if the
-objective given to parmest is the sum of squared error. Moreover, parmest approximates the
-variance of the measurement errors as :math:`\sigma^2 = \frac{1}{n-l} \sum e_i^2` where :math:`n` is
-the number of data points, :math:`l` is the number of fitted parameters, and :math:`e_i` is the
-residual for experiment :math:`i`.
+The uncertainty in the estimated parameters is quantified using the covariance matrix.
+The diagonal of the covariance matrix contains the variances of the estimated parameters.
+Assuming Gaussian, independent, and identically distributed measurement errors, the
+covariance matrix of the estimated parameters can be computed using the following
+methods, which have been implemented in parmest.
+
+1. Reduced Hessian Method
+
+.. math::
+   V_{\boldsymbol{\theta}} = 2 \sigma^2 \left(\frac{\partial^2 \text{SSE}}
+   {\partial \boldsymbol{\theta} \, \partial \boldsymbol{\theta}^{\mathrm{T}}}\right)^{-1}_{\boldsymbol{\theta}
+   = \boldsymbol{\theta}^*}
+
+Here SSE is the sum of squared errors, :math:`\boldsymbol{\theta}` are the unknown
+parameters, :math:`\boldsymbol{\theta}^*` is the estimate of the unknown parameters,
+and :math:`\sigma^2` is the variance of the measurement error. When the standard
+deviation of the measurement error is not supplied by the user, parmest approximates
+the variance of the measurement error as :math:`\sigma^2 = \frac{1}{n-l} \sum e_i^2`,
+where :math:`n` is the number of data points, :math:`l` is the number of fitted
+parameters, and :math:`e_i` is the residual for experiment :math:`i`.
+
+2. Finite Difference Method
+
+.. math::
+   V_{\boldsymbol{\theta}} = \left( \sum_{r = 1}^n \mathbf{G}_{r}^{\mathrm{T}} \mathbf{W}
+   \mathbf{G}_{r} \right)^{-1}
+
+This method uses central finite differences to compute the Jacobian matrix,
+:math:`\mathbf{G}_{r}`, which is the sensitivity of the measured variables with
+respect to the parameters, :math:`\boldsymbol{\theta}`. :math:`\mathbf{W}` is a
+diagonal matrix containing the inverses of the variances of the measurement
+errors, :math:`\sigma^2`.
+
+3. Automatic Differentiation Method
+
+.. math::
+   V_{\boldsymbol{\theta}} = \left( \sum_{r = 1}^n \mathbf{G}_{\text{kaug},\, r}^{\mathrm{T}}
+   \mathbf{W} \mathbf{G}_{\text{kaug},\, r} \right)^{-1}
+
+This method uses the model optimality (KKT) condition to compute the Jacobian matrix,
+:math:`\mathbf{G}_{\text{kaug},\, r}`.
+
+In parmest, the covariance matrix can be calculated after defining the
+:class:`~pyomo.contrib.parmest.parmest.Estimator` object and estimating the unknown
+parameters using :meth:`~pyomo.contrib.parmest.parmest.Estimator.theta_est`. To
+estimate the covariance matrix, call
+:meth:`~pyomo.contrib.parmest.parmest.Estimator.cov_est` and pass it the number
+of data points, e.g.,
+
+.. testsetup:: *
+   :skipif: not __import__('pyomo.contrib.parmest.parmest').contrib.parmest.parmest.parmest_available
+
+   # Data
+   import pandas as pd
+   data = pd.DataFrame(
+       data=[[1, 8.3], [2, 10.3], [3, 19.0],
+             [4, 16.0], [5, 15.6], [7, 19.8]],
+       columns=['hour', 'y'],
+   )
+   num_data = len(data)
+
+   # Create an experiment list
+   from pyomo.contrib.parmest.examples.rooney_biegler.rooney_biegler import RooneyBieglerExperiment
+   exp_list = []
+   for i in range(data.shape[0]):
+       exp_list.append(RooneyBieglerExperiment(data.loc[i, :]))
+
+.. doctest::
+   :skipif: not __import__('pyomo.contrib.parmest.parmest').contrib.parmest.parmest.parmest_available
+
+   >>> import pyomo.contrib.parmest.parmest as parmest
+   >>> pest = parmest.Estimator(exp_list, obj_function="SSE")
+   >>> obj_val, theta_val = pest.theta_est()
+   >>> cov = pest.cov_est(cov_n=num_data)
+
+Optionally, one of the three methods, "reduced_hessian", "finite_difference",
+or "automatic_differentiation_kaug", can be supplied for the covariance
+calculation, e.g.,
+
+.. doctest::
+   :skipif: not __import__('pyomo.contrib.parmest.parmest').contrib.parmest.parmest.parmest_available
+
+   >>> cov_method = "reduced_hessian"
+   >>> cov = pest.cov_est(cov_n=num_data, method=cov_method)

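The reduced Hessian formula above can be sanity-checked against the textbook linear least-squares case, where the Hessian of :math:`\text{SSE} = \|y - X\theta\|^2` is :math:`2X^{\mathrm{T}}X` and the formula collapses to the classical OLS covariance :math:`\sigma^2 (X^{\mathrm{T}}X)^{-1}`. A minimal NumPy sketch with synthetic data (not the parmest API):

```python
import numpy as np

# Synthetic linear model y ~ X @ theta (illustration data only).
rng = np.random.default_rng(0)
n, l = 50, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
theta_true = np.array([2.0, 1.5])
y = X @ theta_true + rng.normal(0, 0.5, n)

# Least-squares estimate and residuals
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ theta_hat

# sigma^2 = SSE / (n - l), as in the parmest approximation
sigma2 = residuals @ residuals / (n - l)

# Reduced-Hessian covariance: V = 2 sigma^2 H^{-1} with H = 2 X^T X,
# which equals the classical OLS covariance sigma^2 (X^T X)^{-1}
H = 2 * X.T @ X
V = 2 * sigma2 * np.linalg.inv(H)
assert np.allclose(V, sigma2 * np.linalg.inv(X.T @ X))
```

The diagonal of `V` gives the variances of the intercept and slope estimates, matching the description at the top of the section.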
pyomo/contrib/parmest/parmest.py

Lines changed: 49 additions & 27 deletions
@@ -41,6 +41,7 @@
 import re
 import importlib as im
 import logging
+import warnings
 import types
 import json
 from collections.abc import Callable
@@ -259,9 +260,7 @@ def SSE_weighted(model):
     _check_model_labels_helper(model, logging_level=logging.ERROR)
 
     # Check that measurement errors exist
-    if hasattr(model, "measurement_error"):
-        pass
-    else:
+    if not hasattr(model, "measurement_error"):
         raise AttributeError(
             'Experiment model does not have suffix "measurement_error". '
             '"measurement_error" is a required suffix for the "SSE_weighted" '
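The change above is a standard guard-clause refactor: instead of an empty `pass` branch followed by `else`, the condition is negated and the error raised immediately. A minimal sketch with a hypothetical stand-in for the labeled experiment model:

```python
# Hypothetical stand-in for a labeled experiment model; in parmest this
# would be a Pyomo model with a "measurement_error" suffix.
class Model:
    pass

model = Model()
model.measurement_error = {"y": 0.5}

# Guard clause: fail fast when the required suffix is missing,
# with no empty "pass" branch.
if not hasattr(model, "measurement_error"):
    raise AttributeError(
        'Experiment model does not have suffix "measurement_error".'
    )
```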
@@ -270,16 +269,24 @@ def SSE_weighted(model):
 
     # check if all the values of the measurement error standard deviation
     # have been supplied
-    if all(
+    all_known_errors = all(
         model.measurement_error[y_hat] is not None for y_hat in model.experiment_outputs
-    ):
-        # calculate the weighted SSE between the prediction and observation of the
-        # measured variables
-        expr = (1 / 2) * sum(
-            ((y - y_hat) / model.measurement_error[y_hat]) ** 2
-            for y_hat, y in model.experiment_outputs.items()
-        )
-        return expr
+    )
+
+    if all_known_errors:
+        # calculate the weighted SSE between the prediction
+        # and observation of the measured variables
+        try:
+            expr = (1 / 2) * sum(
+                ((y - y_hat) / model.measurement_error[y_hat]) ** 2
+                for y_hat, y in model.experiment_outputs.items()
+            )
+            return expr
+        except ZeroDivisionError:
+            raise ValueError(
+                'Division by zero encountered in the "SSE_weighted" objective. '
+                'One or more values of the measurement error are zero.'
+            )
     else:
         raise ValueError(
             'One or more values are missing from "measurement_error". All values of '
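As a standalone illustration of the objective being refactored, the weighted SSE is :math:`\frac{1}{2} \sum ((y - \hat{y})/\sigma)^2`. The sketch below uses hypothetical observation data rather than the parmest API, and mirrors the ZeroDivisionError-to-ValueError translation added in this hunk:

```python
# Hypothetical observations, predictions, and per-output measurement-error
# standard deviations (illustration data only).
observed = [8.3, 10.3, 19.0]
predicted = [8.0, 10.9, 18.4]
sigma = [0.5, 0.5, 0.5]

try:
    # WSSE = 1/2 * sum(((y - y_hat) / sigma)^2)
    wsse = 0.5 * sum(
        ((y - y_hat) / s) ** 2
        for y, y_hat, s in zip(observed, predicted, sigma)
    )
except ZeroDivisionError:
    # Same translation as in SSE_weighted: surface a clearer error
    raise ValueError("One or more values of the measurement error are zero.")

# wsse is approximately 1.62 for the data above
```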
@@ -582,36 +589,43 @@ def _finite_difference_FIM(
     # computing the condition number of the Jacobian matrix
     cond_number_jac = np.linalg.cond(J)
     if logging_level == logging.INFO:
-        logger.info(
-            f"The condition number of the Jacobian matrix " f"is {cond_number_jac}"
-        )
+        logger.info(f"The condition number of the Jacobian matrix is {cond_number_jac}")
 
     # grab the model
     model = _get_labeled_model_helper(experiment)
 
     # extract the measured variables and measurement errors
     y_hat_list = [y_hat for y_hat, y in model.experiment_outputs.items()]
 
-    # check if the model has a 'measurement_error' attribute and the measurement
-    # error standard deviation has been supplied
-    if hasattr(model, "measurement_error") and all(
+    # check if the model has a 'measurement_error' attribute and
+    # the measurement error standard deviation has been supplied
+    # (check hasattr first so the suffix is never accessed when missing)
+    all_known_errors = hasattr(model, "measurement_error") and all(
         model.measurement_error[y_hat] is not None for y_hat in model.experiment_outputs
-    ):
+    )
+
+    if all_known_errors:
         error_list = [
             model.measurement_error[y_hat] for y_hat in model.experiment_outputs
         ]
 
-        # compute the matrix of the inverse of the measurement variance
-        # the following assumes independent measurement errors
-        W = np.diag([1 / (err**2) for err in error_list])
+        # compute the matrix of the inverse of the measurement error variance
+        # the following assumes independent and identically distributed
+        # measurement errors
+        try:
+            W = np.diag([1 / (err**2) for err in error_list])
+        except ZeroDivisionError:
+            raise ValueError(
+                'Division by zero encountered in computing the covariance matrix. '
+                'One or more values of the measurement error are zero.'
+            )
 
-        # check if error list is consistent
+        # check if the error list is consistent
         if len(error_list) == 0 or len(y_hat_list) == 0:
             raise ValueError(
                 "Experiment outputs and measurement errors cannot be empty."
             )
 
-        # check if the dimension of error_list is same with that of y_hat_list
+        # check if the dimension of error_list is the same as that of y_hat_list
         if len(error_list) != len(y_hat_list):
             raise ValueError(
                 "Experiment outputs and measurement errors are not the same length."
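The assembly this hunk modifies, :math:`V_{\theta} = (\sum_r G_r^{\mathrm{T}} W G_r)^{-1}` with :math:`W = \mathrm{diag}(1/\sigma_i^2)`, can be sketched in a few lines of NumPy. The Jacobians and standard deviations below are hypothetical illustration data, not output of parmest:

```python
import numpy as np

# Per-output measurement-error standard deviations (hypothetical).
sigmas = [0.5, 0.2]

# W = diag(1 / sigma_i^2); with plain Python floats a zero sigma
# raises ZeroDivisionError, as guarded against in the hunk above.
W = np.diag([1 / (s**2) for s in sigmas])

# One Jacobian G_r per experiment: sensitivity of the measured
# outputs w.r.t. the parameters (hypothetical 2x2 values).
jacobians = [
    np.array([[1.0, 0.1], [0.0, 1.0]]),
    np.array([[1.0, 0.2], [0.0, 1.0]]),
]

# FIM = sum_r G_r^T W G_r; covariance is its inverse
FIM = sum(G.T @ W @ G for G in jacobians)
V = np.linalg.inv(FIM)
```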
@@ -717,16 +731,24 @@ def _kaug_FIM(experiment, theta_vals, solver, tee, estimated_var=None):
     kaug_jac = np.array(jac).T
 
     # compute FIM
-    # compute matrix of the inverse of the measurement variance
-    # The following assumes independent measurement error.
+    # compute the matrix of the inverse of the measurement error variance
+    # the following assumes independent and identically distributed
+    # measurement errors
     W = np.zeros((len(model.measurement_error), len(model.measurement_error)))
     all_known_errors = all(
         model.measurement_error[y_hat] is not None for y_hat in model.experiment_outputs
     )
+
     count = 0
     for k, v in model.measurement_error.items():
         if all_known_errors:
-            W[count, count] = 1 / (v**2)
+            try:
+                W[count, count] = 1 / (v**2)
+            except ZeroDivisionError:
+                raise ValueError(
+                    'Division by zero encountered in computing the covariance matrix. '
+                    'One or more values of the measurement error are zero.'
+                )
         else:
             W[count, count] = 1 / estimated_var
         count += 1
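One caveat about the try/except guard added in this hunk: ZeroDivisionError is raised only for native Python numbers. A NumPy scalar divides to inf with a RuntimeWarning instead, so the guard implicitly assumes the suffix values are plain floats. A minimal illustration of the difference:

```python
import warnings
import numpy as np

# Plain Python floats raise ZeroDivisionError on division by zero,
# which is what the try/except in _kaug_FIM relies on.
try:
    1 / (0.0**2)
    raised = False
except ZeroDivisionError:
    raised = True

# NumPy scalars instead return inf and emit a RuntimeWarning, so a
# ZeroDivisionError guard would silently miss a zero stored this way.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = np.float64(1.0) / np.float64(0.0) ** 2
```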
