Deep Neural Network with probabilistic description of damage condition space via Gaussian Mixture Model
This code quantifies the uncertainty of mooring line fault detection predictions using a multivariate Gaussian mixture variational autoencoder (MGMVAE). The scripts included in this code cover everything from data preprocessing, i.e. cleaning, scaling, and dataset splitting, to DNN training and results postprocessing.
These instructions will get you a copy of the project's latest stable version up and running on your local machine for development and testing purposes.
The following Python libraries are needed to execute the code:
- TensorFlow 2.13.0
- Keras 2.13.0
- NumPy
- Pandas
- Matplotlib
- TensorFlow Probability
In this section, we present the files and directories of the repository.
File/directory | Description
---|---
multivariate_main.py | Main executable file, used to train only the inverse operator
forward_included_main.py | Main executable file, used to train both the forward and inverse operators
DATA | Directory where the employed datasets can be found
MODULES | Directory where the preprocessing, training, and postprocessing scripts can be found
MODULES/PREPROCESSING/preprocessing_tools.py | Script containing the main preprocessing functions
MODULES/TRAINING/multivariate_arch.py | Script containing the inverse, sampling, and forward operator architectures
MODULES/TRAINING/multivariate_models.py | Script defining the structure of the MGMVAE
OUTPUT | Directory where the outputs of the training process are stored (models, weights, loss logs, etc.)
RESULTS | Directory where graphs are stored (prediction accuracies, loss evolutions, etc.)
The main executable file is multivariate_main.py. In this file, we read and preprocess the dataset using
loc = os.path.join('Data', 'Data_nicoWT', 'joint.csv')
u_train, u_val, u_test, r_train, r_val, r_test, p_train, p_val, p_test = read_data_previous(loc, dt, debug=False)
The function read_data_previous(loc, dt, debug) comes from the script MODULES/PREPROCESSING/preprocessing_tools.py. This function reads the dataset, drops the discarded features, and shuffles and splits the data into training, validation, and testing datasets. Here we distinguish among the statistical inputs extracted from the platform's response, u, the environmental conditions, r, and the target damage coefficients, p.
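For orientation, a minimal sketch of what such a read-shuffle-split routine can look like is given below; the function name, column names, and split fractions are illustrative assumptions, not the actual implementation of read_data_previous.

import numpy as np
import pandas as pd

def read_and_split(loc, train_frac=0.8, val_frac=0.1, seed=0):
    """Illustrative stand-in for read_data_previous: read, shuffle, and split."""
    df = pd.read_csv(loc)
    df = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)  # shuffle the samples

    # Hypothetical column groups: response statistics (u), environmental
    # conditions (r), and damage coefficients (p)
    u_cols = [c for c in df.columns if c.startswith('u_')]
    r_cols = [c for c in df.columns if c.startswith('r_')]
    p_cols = [c for c in df.columns if c.startswith('p_')]

    n = len(df)
    i_train, i_val = int(train_frac * n), int((train_frac + val_frac) * n)

    def split(cols):
        arr = df[cols].to_numpy(dtype=np.float32)
        return arr[:i_train], arr[i_train:i_val], arr[i_val:]

    u_train, u_val, u_test = split(u_cols)
    r_train, r_val, r_test = split(r_cols)
    p_train, p_val, p_test = split(p_cols)
    return u_train, u_val, u_test, r_train, r_val, r_test, p_train, p_val, p_test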
Then, we define the number of mixtures, Gaussians per mixture, and samples per mixture as
num_mixtures = p_train.shape[1]
num_gaussians_ = np.array([5])
num_samples_per_mixture_ = np.array([10])
We then define the hyperparameters needed to train our inverse operator as
input_dim_decoder = p_train.shape[1] + r_train.shape[1]
output_dim = u_train.shape[1]
loss_name = ['ELBO_loss']
inverse_LR = 1e-06
n_epochs_enc = 1000
betas = [0.033]
A loop then starts to train the inverse operator. The loop is executed once for every combination of values of beta, num_mixtures, num_gaussians, and num_samples_per_mixture (as of now only one training run takes place, since a single value is specified for each of these variables). The loop first loads a previously saved forward model as
forward_path = os.path.join("Output", '23Apr_2Prop_Forward')
model_forward = tf.saved_model.load(os.path.join(forward_path,'model_forward_2mix'))
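Schematically, each pass of the loop can be organized as in the following simplified sketch (using itertools.product; the actual loop in multivariate_main.py may be structured differently).

from itertools import product

# One training run per combination of hyperparameter values
# (currently a single combination, since each array holds one value)
for beta, num_gaussians, num_samples_per_mixture in product(
        betas, num_gaussians_, num_samples_per_mixture_):
    # 1. Load the pretrained forward operator (as shown above)
    # 2. Build the My_InverseForward model for this combination
    # 3. Compile and fit the model, then store its outputs
    print(beta, num_gaussians, num_samples_per_mixture, num_mixtures)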
Then, the complete model is built using the custom class My_InverseForward.
model = My_InverseForward(input_dim_encoder, input_dim_decoder, output_dim, num_mixtures, num_gaussians, num_samples_per_mixture, model_forward, selected_features, beta)
The inputs to the model are:
- input_dim_encoder: input dimension of the encoder. Set equal to u_train[:, selected_features].shape[1] + r_train.shape[1]. It filters the original u shape so that training uses only the degrees of freedom (DOFs) of interest (see the sketch after this list).
- input_dim_decoder: input dimension of the decoder. Set equal to p_train.shape[1] + r_train.shape[1].
- output_dim: output dimension of the complete autoencoder. Set equal to u_train.shape[1]. It does not matter whether the forward operator was trained using all DOFs or only the filtered ones, as it will reconstruct the complete set of features from the estimated latent space.
- num_mixtures: number of mixtures.
- num_gaussians: number of Gaussian components in each mixture.
- num_samples_per_mixture: number of samples drawn from each mixture. These samples are all passed to the decoder.
- model_forward: a previously saved, trained forward model.
- selected_features: array of indices corresponding to the features of the DOFs of interest.
- beta: parameter used for noise control.
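For illustration, the three dimensions follow directly from the dataset shapes and the selected DOFs (the indices in selected_features below are hypothetical):

import numpy as np

# Hypothetical choice of DOFs of interest; the actual indices depend on the dataset
selected_features = np.array([0, 1, 2, 5, 6])

input_dim_encoder = u_train[:, selected_features].shape[1] + r_train.shape[1]
input_dim_decoder = p_train.shape[1] + r_train.shape[1]
output_dim = u_train.shape[1]  # the decoder always reconstructs the full response vector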
The model is built using two separate modules: MODULES/TRAINING/multivariate_arch.py, which contains the model's architecture, divided into encoder, sampling layer, and decoder; and MODULES/TRAINING/multivariate_models.py, which defines the model's structure and loss function. The structure of the model is specified in the definition of the My_InverseForward class. Its encoder model, self.Encoder_model, is defined by the custom class Inverse_GMM_Model. When called, this model concatenates its inputs and builds the architecture Fully_connected_enc_gmm, defined in MODULES/TRAINING/multivariate_arch.py as
def Fully_connected_enc_gmm(input_dim, num_mixtures, num_gaussians):
    input1 = tf.keras.Input(shape=(input_dim,), name='Innnputlayer')
    # Intermediate layers
    lay1 = layers.Dense(100, activation='relu', name='lay1')(input1)
    lay2 = layers.Dense(150, activation='relu', kernel_initializer="he_uniform", bias_initializer="zeros", name='lay2')(lay1)
    lay2 = layers.Dense(200, activation='tanh')(lay2)
    lay3 = layers.Dense(150, activation='relu', kernel_initializer="he_uniform", bias_initializer="zeros", name='lay3')(lay2)
    lay4 = layers.Dense(100, activation='tanh', name='lay4')(lay3)
    # Output heads: GMM means, standard deviations, and mixture weights
    means = layers.Dense(num_mixtures*num_gaussians, activation='sigmoid', name='means')(lay4)
    sigmas = layers.Dense(num_mixtures*num_gaussians, activation='softplus', name='sigmas')(lay4)
    weights = layers.Dense(num_gaussians, activation='softmax', name='weights')(lay4)
    outputs = tf.concat([means, sigmas, weights], axis=1)
    return tf.keras.Model(inputs=input1, outputs=outputs)
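Since the three output heads are concatenated, the encoder output has dimension 2*num_mixtures*num_gaussians + num_gaussians. A quick shape check with illustrative values:

enc = Fully_connected_enc_gmm(input_dim=12, num_mixtures=2, num_gaussians=5)
print(enc.output_shape)  # (None, 25): 10 means + 10 sigmas + 5 weights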
The weight initialization follows the strategy presented by He et al. The encoder returns an array of GMM properties (means, variances, and weights) named GMM_props, which is passed to the sampling layer. The sampling layer is defined using the custom layer class GMMSamplingLayer as
class GMMSamplingLayer(tf.keras.layers.Layer):
    def __init__(self, num_mixtures, num_gaussians, num_samples_per_mixture, **kwargs):
        super(GMMSamplingLayer, self).__init__(**kwargs)
        self.num_mixtures = num_mixtures
        self.num_gaussians = num_gaussians
        self.num_samples_per_mixture = num_samples_per_mixture

    def call(self, inputs):
        means, sigmas, weights = inputs
        # Reshaping required to accommodate the batch size (None, num_mixtures*num_gaussians) during training
        means = tf.reshape(means, (-1, self.num_mixtures, self.num_gaussians))
        sigmas = tf.reshape(sigmas, (-1, self.num_mixtures, self.num_gaussians))
        weights = tf.reshape(weights, (-1, self.num_gaussians))
        # One multivariate Gaussian component per Gaussian index
        truncated_components = []
        for i in range(self.num_gaussians):
            component = tfp.distributions.MultivariateNormalDiag(loc=means[:, :, i], scale_diag=sigmas[:, :, i])
            truncated_components.append(component)
        cat = tfd.Categorical(probs=weights)
        q = tfd.Mixture(cat=cat, components=truncated_components)
        # Draw num_samples_per_mixture samples and flatten them along the batch dimension
        samples = q.sample(self.num_samples_per_mixture)
        s1 = tf.transpose(samples, perm=[1, 0, 2])
        self.reshaped_samples = tf.reshape(s1, (-1, self.num_mixtures))
        return self.reshaped_samples
When called, GMMSamplingLayer reshapes its inputs means, sigmas, and weights to the required shapes: (None, num_mixtures, num_gaussians) for the former two and (None, num_gaussians) for the latter. It then builds a multivariate Gaussian mixture distribution by defining one component from each pair of means and sigmas, and constructing q = tfd.Mixture, which combines all components with their corresponding categorical probabilities weights. Finally, it draws num_samples_per_mixture samples from q and reshapes them to (None, num_mixtures), returning reshaped_samples, which is passed to the decoder.
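As a standalone check, the layer can be called with dummy tensors of the expected shapes (the batch size and dimensions below are arbitrary):

import tensorflow as tf

batch, num_mixtures, num_gaussians, N = 8, 2, 5, 10
sampler = GMMSamplingLayer(num_mixtures, num_gaussians, N)

means = tf.random.uniform((batch, num_mixtures * num_gaussians))
sigmas = tf.nn.softplus(tf.random.normal((batch, num_mixtures * num_gaussians)))
weights = tf.nn.softmax(tf.random.normal((batch, num_gaussians)), axis=-1)

samples = sampler([means, sigmas, weights])
print(samples.shape)  # (batch * N, num_mixtures) = (80, 2)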
The decoder is simply loaded from the previously trained model_forward, which takes the reshaped samples and reconstructs the array of response statistics u.
Back in multivariate_main.py, the model is compiled using the Adam optimizer, with the specified learning rate inverse_LR and the loss function ELBO_loss, as
model.compile(optimizer = optimizers.Adam(learning_rate = inverse_LR), loss = model.ELBO_loss, metrics = [model.Conditional_likelihood_term, model.Mixture_dens_term])
model_history = model.fit(x = [u_train, r_train, p_train],
y = u_train,
batch_size = 1024,
epochs = n_epochs_enc,
shuffle = True,
validation_data = ([u_val, r_val, p_val], u_val))
The loss function ELBO_loss is defined as the sum of two components, based on the findings of Goh et al. (2021) and Rodríguez et al. (2023). The first component, defined in My_InverseForward as the method Mixture_dens_term, estimates the log-probability of obtaining the samples reshaped_samples under the distribution defined by the fitted GMM properties. The method returns the mean of the obtained log-probabilities. In the code, this component is defined as
def Mixture_dens_term(self, y_true, y_pred):
    GMM_props = self.GMM_props
    means, sigmas, weights = tf.split(GMM_props, [self.num_mixtures * self.num_gaussians,
                                                  self.num_mixtures * self.num_gaussians,
                                                  self.num_gaussians], axis=-1)
    # Reshaping required to accommodate the batch size (None, num_mixtures*num_gaussians) during training
    means = tf.reshape(means, (-1, self.num_mixtures, self.num_gaussians))
    sigmas = tf.reshape(sigmas, (-1, self.num_mixtures, self.num_gaussians))
    weights = tf.reshape(weights, (-1, self.num_gaussians))
    weights = tf.clip_by_value(weights, 1e-6, 1.0)
    # Repeat the means, sigmas, and weights to match the number of samples per mixture, N
    m = tf.repeat(means[:, :, tf.newaxis], self.num_samples_per_mixture, axis=2)
    s = tf.repeat(sigmas[:, :, tf.newaxis], self.num_samples_per_mixture, axis=2)
    w = tf.repeat(weights[:, tf.newaxis], self.num_samples_per_mixture, axis=1)
    # Permute so that the means, sigmas, and weights have the desired shape (None*N, num_mixtures, num_gaussians)
    means = tf.reshape(tf.transpose(m, perm=[0, 2, 1, 3]), [-1, self.num_mixtures, self.num_gaussians])
    sigmas = tf.reshape(tf.transpose(s, perm=[0, 2, 1, 3]), [-1, self.num_mixtures, self.num_gaussians])
    weights = tf.reshape(w, [-1, self.num_gaussians])
    components = []
    for i in range(self.num_gaussians):
        component = tfp.distributions.MultivariateNormalDiag(loc=means[:, :, i], scale_diag=sigmas[:, :, i])
        components.append(component)
    cat = tfp.distributions.Categorical(probs=weights)
    mixture = tfp.distributions.Mixture(cat=cat, components=components)
    Mixture_density_loss = tf.math.reduce_mean(mixture.log_prob(self.reshaped_samples))
    return tf.math.square(tf.cast(self.beta, dtype=tf.float32)) * Mixture_density_loss
The second component is related to the relative misfit between the true and predicted health status, y_true and y_pred, respectively. It also incorporates beta as a noise-control parameter. In the code, it is implemented by the method Conditional_likelihood_term as
def Conditional_likelihood_term(self, y_true, y_pred):
    y_true = self.ColumnSliceLayer(y_true)
    y_pred = self.ColumnSliceLayer(y_pred)
    beta = self.beta
    # Repeat the true outputs to match the number of samples drawn per mixture
    Ytrue = tf.repeat(y_true[:, :, tf.newaxis], self.num_samples_per_mixture, axis=2)
    Ytrue = tf.reshape(tf.transpose(Ytrue, perm=[0, 2, 1]), [-1, y_true.shape[1]])
    # Relative misfit, with beta acting as a noise-control parameter
    Ydiff = tf.math.square((Ytrue - y_pred) / (beta * y_pred + 1e-10))
    LKLHD_Loss = 0.5 * tf.math.reduce_mean(Ydiff, axis=None)
    return tf.math.square(tf.cast(self.beta, dtype=tf.float32)) * LKLHD_Loss
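Putting the two methods together, the total loss described above can be sketched as a plain sum of both terms; this is a simplified sketch, and the exact ELBO_loss definition in MODULES/TRAINING/multivariate_models.py may differ in signs or scaling.

# Inside My_InverseForward (sketch only)
def ELBO_loss(self, y_true, y_pred):
    # Sum of the data-misfit term and the mixture log-density term
    return self.Conditional_likelihood_term(y_true, y_pred) + self.Mixture_dens_term(y_true, y_pred)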
Once trained, the code produces loss-evolution graphs for all components of the loss function and saves them at the specified path within the Results directory. A separate CSV file is written and stored at the same path within the Output directory.
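A minimal sketch of how the loss history returned by model.fit could be plotted and exported; the file names below are illustrative and not necessarily the ones used by the scripts.

import os
import pandas as pd
import matplotlib.pyplot as plt

hist = pd.DataFrame(model_history.history)  # total loss, metrics, and their val_ counterparts

# Loss-evolution graph for every logged component
fig, ax = plt.subplots()
hist.plot(ax=ax)
ax.set_xlabel('Epoch')
ax.set_ylabel('Loss')
fig.savefig(os.path.join('Results', 'loss_evolution.png'), dpi=300)

# Raw loss log stored as a CSV file
hist.to_csv(os.path.join('Output', 'loss_log.csv'), index=False)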