
Commit f48337f

SanggyuChong authored and agoscinski committed
add local and component-wise prediction rigidities
* implemented the LPR and (L)CPR in the metrics module, along with documentation and tests.
1 parent 5dc222f commit f48337f

File tree

5 files changed: +311 −21 lines

docs/src/getting-started.rst

Lines changed: 2 additions & 2 deletions
@@ -25,8 +25,8 @@ Notebook Examples
 .. _getting_started-reconstruction:

-Reconstruction Measures
------------------------
+Metrics
+-------

 .. automodule:: skmatter.metrics
    :noindex:

docs/src/references/metrics.rst

Lines changed: 16 additions & 2 deletions
@@ -1,5 +1,5 @@
-Reconstruction Measures
-=======================
+Metrics
+=======

 .. automodule:: skmatter.metrics

@@ -26,3 +26,17 @@ Local Reconstruction Error
 .. autofunction:: skmatter.metrics.pointwise_local_reconstruction_error
 .. autofunction:: skmatter.metrics.local_reconstruction_error
+
+.. _LPR-api:
+
+Local Prediction Rigidity
+-------------------------
+
+.. autofunction:: skmatter.metrics.local_prediction_rigidity
+
+.. _CPR-api:
+
+Component-wise Prediction Rigidity
+----------------------------------
+
+.. autofunction:: skmatter.metrics.componentwise_prediction_rigidity

src/skmatter/metrics/__init__.py

Lines changed: 40 additions & 17 deletions
@@ -1,26 +1,42 @@
 """
-This module contains a set of easily-interpretable error measures of the relative
-information capacity of feature space `F` with respect to feature space `F'`. The
-methods returns a value between 0 and 1, where 0 means that `F` and `F'` are completey
-distinct in terms of linearly-decodable information, and where 1 means that `F'` is
-contained in `F`. All methods are implemented as the root mean-square error for the
-regression of the feature matrix `X_F'` (or sometimes called `Y` in the doc) from `X_F`
-(or sometimes called `X` in the doc) for transformations with different constraints
-(linear, orthogonal, locally-linear). By default a custom 2-fold cross-validation
-:py:class:`skosmo.linear_model.RidgeRegression2FoldCV` is used to ensure the
-generalization of the transformation and efficiency of the computation, since we deal
-with a multi-target regression problem. Methods were applied to compare different forms
-of featurizations through different hyperparameters and induced metrics and kernels
-[Goscinski2021]_ .
+This module contains a set of metrics that can be used for an enhanced
+understanding of your machine learning model.
+
+First are the easily-interpretable error measures of the relative information
+capacity of feature space `F` with respect to feature space `F'`. The methods
+return a value between 0 and 1, where 0 means that `F` and `F'` are completely
+distinct in terms of linearly-decodable information, and 1 means that `F'`
+is contained in `F`. All methods are implemented as the root mean-square error
+for the regression of the feature matrix `X_F'` (sometimes called `Y` in the
+docs) from `X_F` (sometimes called `X` in the docs) for transformations with
+different constraints (linear, orthogonal, locally-linear). By default a custom
+2-fold cross-validation :py:class:`skmatter.linear_model.RidgeRegression2FoldCV`
+is used to ensure the generalization of the transformation and the efficiency
+of the computation, since we deal with a multi-target regression problem. These
+methods were applied to compare different forms of featurization through
+different hyperparameters and induced metrics and kernels [Goscinski2021]_.

 These reconstruction measures are available:

 * :ref:`GRE-api` (GRE) computes the amount of linearly-decodable information
   recovered through a global linear reconstruction.
-* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global linear
-  reconstruction.
-* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered through
-  a local linear reconstruction for the k-nearest neighborhood of each sample.
+* :ref:`GRD-api` (GRD) computes the amount of distortion contained in a global
+  linear reconstruction.
+* :ref:`LRE-api` (LRE) computes the amount of decodable information recovered
+  through a local linear reconstruction for the k-nearest neighborhood of each
+  sample.
+
+Next, we offer a set of prediction rigidity metrics, which can be used to
+quantify the robustness of the local or component-wise predictions that the
+machine learning model has been trained to make, based on the training dataset
+composition.
+
+These prediction rigidities are available:
+
+* :ref:`LPR-api` (LPR) computes the local prediction rigidity of a linear or
+  kernel model.
+* :ref:`CPR-api` (CPR) computes the component-wise prediction rigidity of a
+  linear or kernel model.
 """

 from ._reconstruction_measures import (
@@ -34,6 +50,11 @@
     pointwise_local_reconstruction_error,
 )

+from ._prediction_rigidities import (
+    local_prediction_rigidity,
+    componentwise_prediction_rigidity,
+)
+
 __all__ = [
     "pointwise_global_reconstruction_error",
     "global_reconstruction_error",
@@ -43,4 +64,6 @@
     "local_reconstruction_error",
     "check_global_reconstruction_measures_input",
     "check_local_reconstruction_measures_input",
+    "local_prediction_rigidity",
+    "componentwise_prediction_rigidity",
 ]
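With the exports above in place, both metrics are importable directly from `skmatter.metrics`. A minimal usage sketch of `local_prediction_rigidity`, using hypothetical random arrays in place of real atomic descriptors:

import numpy as np

from skmatter.metrics import local_prediction_rigidity

rng = np.random.default_rng(0)

# three hypothetical training structures with 4, 5, and 6 atoms, each atom
# described by a 10-dimensional feature vector, plus two test structures
X_train = [rng.normal(size=(n, 10)) for n in (4, 5, 6)]
X_test = [rng.normal(size=(n, 10)) for n in (3, 5)]

LPR, rank_diff = local_prediction_rigidity(X_train, X_test, alpha=1e-8)

# one array per test structure, one LPR value per atom
print([lpr.shape for lpr in LPR])  # [(3,), (5,)]

The list-of-arrays input format mirrors the per-structure storage that the new module expects.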
src/skmatter/metrics/_prediction_rigidities.py

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
import numpy as np


def local_prediction_rigidity(X_train, X_test, alpha):
    r"""Computes the local prediction rigidity (LPR) of a linear or kernel model
    trained on a training dataset provided as input, on the local environments
    in the test set provided as a separate input. LPR is defined as follows:

    .. math::
        LPR_{i} = \frac{1}{X_i (X^{T} X + \lambda I)^{-1} X_i^{T}}

    The function assumes that model training is undertaken in a manner where
    the global prediction targets are averaged over the number of atoms
    appearing in each training structure, and the average feature vector of
    each structure is hence used in the regression. This ensures that (1) the
    regularization strength across structures with different numbers of atoms
    is kept constant per structure during model training, and (2) the range of
    resulting LPR values is loosely kept between 0 and 1 for ease of
    interpretation. This requires the user to provide the regularizer value
    that results from such a training procedure. To guarantee a valid
    comparison of the LPR across different models, feature vectors are scaled
    by a global factor based on the standard deviation across atomic
    environments.

    If the model is a kernel model, K_train and K_test can be provided in lieu
    of X_train and X_test, along with the appropriate regularizer for the
    trained model.

    Parameters
    ----------
    X_train : list of ndarray of shape (n_atoms, n_features)
        Training dataset where each training set structure is stored as a
        separate ndarray.

    X_test : list of ndarray of shape (n_atoms, n_features)
        Test dataset where each test set structure is stored as a separate
        ndarray.

    alpha : float
        Regularizer value that the linear/kernel model has been optimized to.

    Returns
    -------
    LPR : list of array of shape (n_atoms)
        Local prediction rigidity (LPR) of the test set structures. LPR is
        separately stored for each test structure, and hence list length =
        n_test_strucs.
    rank_diff : int
        Difference between the covariance matrix dimension and its rank.
    """
    # compute a global scaling factor from the train set atomic environments,
    # based on the standard deviation across environments
    X_atom = np.vstack(X_train)
    sfactor = np.sqrt(np.mean(X_atom**2, axis=0).sum())

    # prep to build covariance matrix XX: take the average feature vector
    # of each structure
    X_struc = []
    for X_i in X_train:
        X_struc.append(np.mean(X_i / sfactor, axis=0))
    X_struc = np.vstack(X_struc)

    # build the regularized covariance matrix and obtain its pseudo-inverse
    # for the LPR calculation
    XX = X_struc.T @ X_struc
    Xprime = XX + alpha * np.eye(XX.shape[0])
    rank_diff = X_struc.shape[1] - np.linalg.matrix_rank(Xprime)
    Xinv = np.linalg.pinv(Xprime)

    # track test set atom indices for output
    lens = []
    for X in X_test:
        lens.append(len(X))
    test_idxs = np.cumsum([0] + lens)

    # prep and compute LPR
    num_test = len(X_test)
    X_test = np.vstack(X_test)
    atom_count = X_test.shape[0]
    LPR_np = np.zeros(X_test.shape[0])

    for ai in range(atom_count):
        Xi = X_test[ai].reshape(1, -1) / sfactor
        LPR_np[ai] = 1 / (Xi @ Xinv @ Xi.T)

    # separately store the LPR for each test structure
    LPR = []
    for i in range(num_test):
        LPR.append(LPR_np[test_idxs[i] : test_idxs[i + 1]])

    return LPR, rank_diff
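To make the averaging convention in the docstring above concrete, here is a short sketch (hypothetical random data, not part of the module) of the training procedure the LPR assumes, where targets and feature vectors are averaged over the atoms of each structure before the ridge solve:

import numpy as np

rng = np.random.default_rng(1)
X_train = [rng.normal(size=(n, 10)) for n in (4, 5, 6)]
y_atom = [rng.normal(size=n) for n in (4, 5, 6)]  # per-atom contributions

# average features and targets over the atoms of each structure
X_struc = np.vstack([X_i.mean(axis=0) for X_i in X_train])
y_struc = np.array([y_i.mean() for y_i in y_atom])

# ridge solution on the structure-averaged quantities; this alpha is the
# regularizer value that would then be passed to local_prediction_rigidity
alpha = 1e-8
w = np.linalg.pinv(X_struc.T @ X_struc + alpha * np.eye(10)) @ X_struc.T @ y_struc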
def componentwise_prediction_rigidity(X_train, X_test, alpha, comp_dims):
    r"""Computes the component-wise prediction rigidity (CPR) and the local CPR
    (LCPR) of a linear or kernel model trained on a training dataset provided
    as input, on the local environments in the test set provided as a separate
    input. CPR and LCPR are defined as follows:

    .. math::
        CPR_{A,c} = \frac{1}{X_{A,c} (X^{T} X + \lambda I)^{-1} X_{A,c}^{T}}

    .. math::
        LCPR_{i,c} = \frac{1}{X_{i,c} (X^{T} X + \lambda I)^{-1} X_{i,c}^{T}}

    The function assumes that the feature vectors for the local environments
    and structures are built by concatenating the descriptors of the different
    prediction components. It also assumes, as in the case of the LPR, that
    model training is undertaken in a manner where the global prediction
    targets are averaged over the number of atoms appearing in each training
    structure, and the average feature vector of each structure is hence used
    in the regression. Likewise, to guarantee a valid comparison of the (L)CPR
    across different models, feature vectors are scaled by a global factor
    based on the standard deviation across atomic environments.

    If the model is a kernel model, K_train and K_test can be provided in lieu
    of X_train and X_test, along with the appropriate regularizer for the
    trained model. However, in computing the kernels, one must strictly keep
    the different components separate, and compute separate kernel blocks for
    the different prediction components.

    Parameters
    ----------
    X_train : list of ndarray of shape (n_atoms, n_features)
        Training dataset where each training set structure is stored as a
        separate ndarray.

    X_test : list of ndarray of shape (n_atoms, n_features)
        Test dataset where each test set structure is stored as a separate
        ndarray.

    alpha : float
        Regularizer value that the linear/kernel model has been optimized to.

    comp_dims : ndarray of int values
        Dimensions of the feature vectors pertaining to each prediction
        component.

    Returns
    -------
    CPR : ndarray of shape (n_test_strucs, n_comps)
        Component-wise prediction rigidity computed for each prediction
        component, pertaining to the entire test structure.
    LCPR : list of ndarrays of shape (n_atoms, n_comps)
        Local component-wise prediction rigidity of the test set structures.
        Values are separately stored for each test structure, and hence list
        length = n_test_strucs.
    rank_diff : int
        Difference between the covariance matrix dimension and its rank.
    """
    # compute a global scaling factor from the train set atomic environments,
    # based on the standard deviation across environments
    X_atom = np.vstack(X_train)
    sfactor = np.sqrt(np.mean(X_atom**2, axis=0).sum())

    # prep to build covariance matrix XX: take the average feature vector
    # of each structure
    X_struc = []
    for X_i in X_train:
        X_struc.append(np.mean(X_i / sfactor, axis=0))
    X_struc = np.vstack(X_struc)

    # build the regularized covariance matrix and obtain its pseudo-inverse
    # for the (L)CPR calculation
    XX = X_struc.T @ X_struc
    Xprime = XX + alpha * np.eye(XX.shape[0])
    rank_diff = X_struc.shape[1] - np.linalg.matrix_rank(Xprime)
    Xinv = np.linalg.pinv(Xprime)

    # track test set atom indices for output
    lens = []
    for X in X_test:
        lens.append(len(X))
    test_idxs = np.cumsum([0] + lens)

    # get structure-averaged feature vectors for the test set
    X_struc_test = []
    for X_i in X_test:
        X_struc_test.append(np.mean(X_i / sfactor, axis=0))
    X_struc_test = np.vstack(X_struc_test)

    # prep and compute CPR and LCPR
    num_test = len(X_test)
    num_comp = len(comp_dims)
    comp_idxs = np.cumsum([0] + comp_dims.tolist())
    X_test = np.vstack(X_test)
    atom_count = X_test.shape[0]
    CPR = np.zeros((num_test, num_comp))
    LCPR_np = np.zeros((atom_count, num_comp))

    tot_comp_idx = np.arange(comp_dims.sum())
    for ci in range(num_comp):
        # mask that zeroes out every feature block except component ci
        mask = (
            (tot_comp_idx >= comp_idxs[ci]) & (tot_comp_idx < comp_idxs[ci + 1])
        ).astype(float)

        for ai in range(atom_count):
            Xic = np.multiply(X_test[ai].reshape(1, -1) / sfactor, mask)
            LCPR_np[ai, ci] = 1 / (Xic @ Xinv @ Xic.T)

        for si in range(len(X_struc_test)):
            XAc = np.multiply(X_struc_test[si].reshape(1, -1), mask)
            CPR[si, ci] = 1 / (XAc @ Xinv @ XAc.T)

    # separately store the LCPR for each test structure
    LCPR = []
    for i in range(num_test):
        LCPR.append(LCPR_np[test_idxs[i] : test_idxs[i + 1]])

    return CPR, LCPR, rank_diff
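A corresponding sketch for the component-wise variant, again with hypothetical random data: the descriptor blocks of the two prediction components are concatenated per atom, and their widths are passed via comp_dims:

import numpy as np

from skmatter.metrics import componentwise_prediction_rigidity

rng = np.random.default_rng(2)
comp_dims = np.array([8, 12])  # widths of the two concatenated descriptor blocks

X_train = [rng.normal(size=(n, comp_dims.sum())) for n in (4, 5, 6)]
X_test = [rng.normal(size=(n, comp_dims.sum())) for n in (3, 5)]

CPR, LCPR, rank_diff = componentwise_prediction_rigidity(
    X_train, X_test, alpha=1e-8, comp_dims=comp_dims
)

print(CPR.shape)                # (2, 2): n_test_strucs x n_comps
print([a.shape for a in LCPR])  # [(3, 2), (5, 2)]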

tests/test_metrics.py

Lines changed: 45 additions & 0 deletions
@@ -4,14 +4,59 @@
 from sklearn.datasets import load_iris
 from sklearn.utils import check_random_state, extmath

+from skmatter.datasets import load_degenerate_CH4_manifold
 from skmatter.metrics import (
+    componentwise_prediction_rigidity,
     global_reconstruction_distortion,
     global_reconstruction_error,
+    local_prediction_rigidity,
     local_reconstruction_error,
     pointwise_local_reconstruction_error,
 )


+class PredictionRigidityTests(unittest.TestCase):
+    @classmethod
+    def setUpClass(cls):
+        soap_features = load_degenerate_CH4_manifold().data["SOAP_power_spectrum"]
+        soap_features = soap_features[:11]
+        # each structure in CH4 has 5 environment features, because there are
+        # 5 atoms per structure and each atom is one environment
+        cls.features = [
+            soap_features[i * 5 : (i + 1) * 5] for i in range(len(soap_features) // 5)
+        ]
+        # add a single-environment structure to check its LPR value
+        cls.features = cls.features + [soap_features[-1:]]
+        cls.alpha = 1e-8
+        bi_features = load_degenerate_CH4_manifold().data["SOAP_bispectrum"]
+        bi_features = bi_features[:11]
+        comp_features = np.column_stack([soap_features, bi_features])
+        cls.comp_features = [
+            comp_features[i * 5 : (i + 1) * 5] for i in range(len(comp_features) // 5)
+        ]
+        cls.comp_dims = np.array([soap_features.shape[1], bi_features.shape[1]])
+
+    def test_local_prediction_rigidity(self):
+        LPR, rank_diff = local_prediction_rigidity(
+            self.features, self.features, self.alpha
+        )
+        self.assertTrue(
+            LPR[-1] >= 1,
+            "LPR of the single-environment structure is incorrectly lower than 1: "
+            f"LPR = {LPR[-1]}",
+        )
+        self.assertTrue(
+            rank_diff == 0,
+            "LPR covariance matrix rank is not full, with a difference of "
+            f"{rank_diff}",
+        )
+
+    def test_componentwise_prediction_rigidity(self):
+        _CPR, _LCPR, _rank_diff = componentwise_prediction_rigidity(
+            self.comp_features, self.comp_features, self.alpha, self.comp_dims
+        )
+

 class ReconstructionMeasuresTests(unittest.TestCase):
     @classmethod
     def setUpClass(cls):
