Deep Neural Network with probabilistic description of damage condition space via Gaussian Mixture Model
This code quantifies the uncertainty of mooring line fault detection predictions using a multivariate Gaussian mixture variational autoencoder (MGMVAE). The scripts included in this code cover everything from data preprocessing, i.e. cleaning, scaling, and dataset splitting, to DNN training and results postprocessing.
These instructions will get you a copy of the project's latest stable version up and running on your local machine for development and testing purposes.
The following Python libraries are needed to execute the code:
- TensorFlow 2.13.0
- Keras 2.13.0
- NumPy
- Pandas
- Matplotlib
- TensorFlow Probability
In this section, we present the files and directories of the repository.
File/directory | Description
---|---
multivariate_main.py | Main executable file, used to train only the inverse operator
forward_included_main.py | Main executable file, used to train both the forward and inverse operators
DATA | Directory where the employed datasets can be found
MODULES | Directory where the preprocessing, training, and postprocessing scripts can be found
MODULES/PREPROCESSING/preprocessing_tools.py | Script containing the main preprocessing functions
MODULES/TRAINING/multivariate_arch.py | Script containing the inverse, sampling, and forward operator architectures
MODULES/TRAINING/multivariate_models.py | Script defining the structure of the MGMVAE
OUTPUT | Directory where the outputs of the training process are stored (models, weights, loss logs, etc.)
RESULTS | Directory where graphs are stored (prediction accuracies, loss evolutions, etc.)
The main executable file is multivariate_main.py. In this file, we read and preprocess the dataset using
loc = os.path.join('Data', 'Data_nicoWT', 'joint.csv')
u_train, u_val, u_test, r_train, r_val, r_test, p_train, p_val, p_test = read_data_previous(loc, dt, debug=False)
The function read_data_previous(loc, dt, debug) comes from the script MODULES/PREPROCESSING/preprocessing_tools.py. This function reads the dataset, drops the discarded features, and shuffles and splits the data into training, validation, and testing datasets. Here we distinguish among the statistical inputs extracted from the platform's response, u, the environmental conditions, r, and the target damage coefficients, p.
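For orientation, a minimal sketch of what such a read-shuffle-split routine can look like is given below; the function name, column names, and split fractions are illustrative assumptions, not the actual implementation of read_data_previous.

import numpy as np
import pandas as pd

def read_and_split(loc, train_frac=0.8, val_frac=0.1, seed=0):
    """Illustrative stand-in for read_data_previous: read, shuffle, and split."""
    df = pd.read_csv(loc)
    df = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)  # shuffle the samples

    # Hypothetical column groups: response statistics (u), environmental
    # conditions (r), and damage coefficients (p)
    u_cols = [c for c in df.columns if c.startswith('u_')]
    r_cols = [c for c in df.columns if c.startswith('r_')]
    p_cols = [c for c in df.columns if c.startswith('p_')]

    n = len(df)
    i_train, i_val = int(train_frac * n), int((train_frac + val_frac) * n)

    def split(cols):
        arr = df[cols].to_numpy(dtype=np.float32)
        return arr[:i_train], arr[i_train:i_val], arr[i_val:]

    u_train, u_val, u_test = split(u_cols)
    r_train, r_val, r_test = split(r_cols)
    p_train, p_val, p_test = split(p_cols)
    return u_train, u_val, u_test, r_train, r_val, r_test, p_train, p_val, p_test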
Then, we define the number of mixtures, Gaussians per mixture, and samples per mixture as
num_mixtures = p_train.shape[1]
num_gaussians_ = np.array([5])
num_samples_per_mixture_ = np.array([10])
We then define the hyperparameters needed to train our inverse operator as
input_dim_decoder = p_train.shape[1] + r_train.shape[1]
output_dim = u_train.shape[1]
loss_name = ['ELBO_loss']
inverse_LR = 1e-06
n_epochs_enc = 1000
betas = [0.033]
A loop then starts to train the inverse operator. The loop is executed once for every combination of values of beta, num_mixtures, num_gaussians, and num_samples_per_mixture (as of now only one training run takes place, since a single value is specified for each of these variables). The loop first loads a previously saved forward model as
forward_path = os.path.join("Output", '23Apr_2Prop_Forward')
model_forward = tf.saved_model.load(os.path.join(forward_path,'model_forward_2mix'))
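Schematically, each pass of the loop can be organized as in the following simplified sketch (using itertools.product; the actual loop in multivariate_main.py may be structured differently).

from itertools import product

# One training run per combination of hyperparameter values
# (currently a single combination, since each array holds one value)
for beta, num_gaussians, num_samples_per_mixture in product(
        betas, num_gaussians_, num_samples_per_mixture_):
    # 1. Load the pretrained forward operator (as shown above)
    # 2. Build the My_InverseForward model for this combination
    # 3. Compile and fit the model, then store its outputs
    print(beta, num_gaussians, num_samples_per_mixture, num_mixtures)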
Then, the complete model is built using the custom class My_InverseForward.
model = My_InverseForward(input_dim_encoder, input_dim_decoder, output_dim, num_mixtures, num_gaussians, num_samples_per_mixture, model_forward, selected_features, beta)
The inputs to the model are:
- input_dim_encoder: input dimension of the encoder. Set equal to u_train[:, selected_features].shape[1] + r_train.shape[1]. It filters the original u shape so that training uses only the degrees of freedom (DOFs) of interest (see the sketch after this list).
- input_dim_decoder: input dimension of the decoder. Set equal to p_train.shape[1] + r_train.shape[1].
- output_dim: output dimension of the complete autoencoder. Set equal to u_train.shape[1]. It does not matter whether the forward operator was trained using all DOFs or only the filtered ones, as it will reconstruct the complete set of features from the estimated latent space.
- num_mixtures: number of mixtures.
- num_gaussians: number of Gaussian components in each mixture.
- num_samples_per_mixture: number of samples drawn from each mixture. These samples are all passed to the decoder.
- model_forward: a previously saved, trained forward model.
- selected_features: array of indices corresponding to the features of the DOFs of interest.
- beta: parameter used for noise control.
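For illustration, the three dimensions follow directly from the dataset shapes and the selected DOFs (the indices in selected_features below are hypothetical):

import numpy as np

# Hypothetical choice of DOFs of interest; the actual indices depend on the dataset
selected_features = np.array([0, 1, 2, 5, 6])

input_dim_encoder = u_train[:, selected_features].shape[1] + r_train.shape[1]
input_dim_decoder = p_train.shape[1] + r_train.shape[1]
output_dim = u_train.shape[1]  # the decoder always reconstructs the full response vector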
The model is built using two separate modules: MODULES/TRAINING/multivariate_arch.py, which contains the model's architecture, divided into encoder, sampling layer, and decoder; and MODULES/TRAINING/multivariate_models.py, which defines the model's structure and loss function. The structure of the model is specified in the definition of the My_InverseForward class. Its encoder model, self.Encoder_model, is defined by the custom class Inverse_GMM_Model. When called, this model concatenates its inputs and builds the architecture Fully_connected_enc_gmm, defined in MODULES/TRAINING/multivariate_arch.py as
def Fully_connected_enc_gmm(input_dim, num_mixtures, num_gaussians):
    input1 = tf.keras.Input(shape=(input_dim,), name='Innnputlayer')
    # Intermediate layers
    lay1 = layers.Dense(100, activation='relu', name='lay1')(input1)
    lay2 = layers.Dense(150, activation='relu', kernel_initializer="he_uniform", bias_initializer="zeros", name='lay2')(lay1)
    lay2 = layers.Dense(200, activation='tanh')(lay2)
    lay3 = layers.Dense(150, activation='relu', kernel_initializer="he_uniform", bias_initializer="zeros", name='lay3')(lay2)
    lay4 = layers.Dense(100, activation='tanh', name='lay4')(lay3)
    # Output heads: GMM means, standard deviations, and mixture weights
    means = layers.Dense(num_mixtures*num_gaussians, activation='sigmoid', name='means')(lay4)
    sigmas = layers.Dense(num_mixtures*num_gaussians, activation='softplus', name='sigmas')(lay4)
    weights = layers.Dense(num_gaussians, activation='softmax', name='weights')(lay4)
    outputs = tf.concat([means, sigmas, weights], axis=1)
    return tf.keras.Model(inputs=input1, outputs=outputs)
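Since the three output heads are concatenated, the encoder output has dimension 2*num_mixtures*num_gaussians + num_gaussians. A quick shape check with illustrative values:

enc = Fully_connected_enc_gmm(input_dim=12, num_mixtures=2, num_gaussians=5)
print(enc.output_shape)  # (None, 25): 10 means + 10 sigmas + 5 weights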
The weight initialization follows the strategy presented by He et al. The encoder returns an array of GMM properties (means, variances, and weights) named GMM_props, which is passed to the sampling layer. The sampling layer is defined using the custom layer class GMMSamplingLayer as
class GMMSamplingLayer(tf.keras.layers.Layer):
    def __init__(self, num_mixtures, num_gaussians, num_samples_per_mixture, **kwargs):
        super(GMMSamplingLayer, self).__init__(**kwargs)
        self.num_mixtures = num_mixtures
        self.num_gaussians = num_gaussians
        self.num_samples_per_mixture = num_samples_per_mixture

    def call(self, inputs):
        means, sigmas, weights = inputs
        # Reshaping required to accommodate the batch size (None, num_mixtures*num_gaussians) during training
        means = tf.reshape(means, (-1, self.num_mixtures, self.num_gaussians))
        sigmas = tf.reshape(sigmas, (-1, self.num_mixtures, self.num_gaussians))
        weights = tf.reshape(weights, (-1, self.num_gaussians))
        # One multivariate Gaussian component per Gaussian index
        truncated_components = []
        for i in range(self.num_gaussians):
            component = tfp.distributions.MultivariateNormalDiag(loc=means[:, :, i], scale_diag=sigmas[:, :, i])
            truncated_components.append(component)
        cat = tfd.Categorical(probs=weights)
        q = tfd.Mixture(cat=cat, components=truncated_components)
        # Draw num_samples_per_mixture samples and flatten them along the batch dimension
        samples = q.sample(self.num_samples_per_mixture)
        s1 = tf.transpose(samples, perm=[1, 0, 2])
        self.reshaped_samples = tf.reshape(s1, (-1, self.num_mixtures))
        return self.reshaped_samples
When called, GMMSamplingLayer reshapes its inputs means, sigmas, and weights to the required shapes: (None, num_mixtures, num_gaussians) for the former two and (None, num_gaussians) for the latter. It then builds a multivariate Gaussian mixture distribution by defining one component from each pair of means and sigmas, and constructing q = tfd.Mixture, which combines all components with their corresponding categorical probabilities weights. Finally, it draws num_samples_per_mixture samples from q and reshapes them to (None, num_mixtures), returning reshaped_samples, which is passed to the decoder.
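As a standalone check, the layer can be called with dummy tensors of the expected shapes (the batch size and dimensions below are arbitrary):

import tensorflow as tf

batch, num_mixtures, num_gaussians, N = 8, 2, 5, 10
sampler = GMMSamplingLayer(num_mixtures, num_gaussians, N)

means = tf.random.uniform((batch, num_mixtures * num_gaussians))
sigmas = tf.nn.softplus(tf.random.normal((batch, num_mixtures * num_gaussians)))
weights = tf.nn.softmax(tf.random.normal((batch, num_gaussians)), axis=-1)

samples = sampler([means, sigmas, weights])
print(samples.shape)  # (batch * N, num_mixtures) = (80, 2)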
The decoder is simply loaded from the previously trained model_forward, which takes the reshaped samples and reconstructs the array of response statistics u.
Back in multivariate_main.py, the model is compiled using the Adam optimizer, with the specified learning rate inverse_LR and the loss function ELBO_loss, as
model.compile(optimizer = optimizers.Adam(learning_rate = inverse_LR), loss = model.ELBO_loss, metrics = [model.Conditional_likelihood_term, model.Mixture_dens_term])
model_history = model.fit(x = [u_train, r_train, p_train],
y = u_train,
batch_size = 1024,
epochs = n_epochs_enc,
shuffle = True,
validation_data = ([u_val, r_val, p_val], u_val))
The loss function ELBO_loss is defined as the sum of two components, based on the findings of Goh et al. (2021) and Rodríguez et al. (2023). The first component, defined in My_InverseForward as the method Mixture_dens_term, estimates the log-probability of obtaining the samples reshaped_samples under the distribution defined by the fitted GMM properties. The method returns the mean of the obtained log-probabilities. In the code, this component is defined as
def Mixture_dens_term(self, y_true, y_pred):
    GMM_props = self.GMM_props
    means, sigmas, weights = tf.split(GMM_props, [self.num_mixtures * self.num_gaussians,
                                                  self.num_mixtures * self.num_gaussians,
                                                  self.num_gaussians], axis=-1)
    # Reshaping required to accommodate the batch size (None, num_mixtures*num_gaussians) during training
    means = tf.reshape(means, (-1, self.num_mixtures, self.num_gaussians))
    sigmas = tf.reshape(sigmas, (-1, self.num_mixtures, self.num_gaussians))
    weights = tf.reshape(weights, (-1, self.num_gaussians))
    weights = tf.clip_by_value(weights, 1e-6, 1.0)
    # Repeat the means, sigmas, and weights to match the number of samples per mixture, N
    m = tf.repeat(means[:, :, tf.newaxis], self.num_samples_per_mixture, axis=2)
    s = tf.repeat(sigmas[:, :, tf.newaxis], self.num_samples_per_mixture, axis=2)
    w = tf.repeat(weights[:, tf.newaxis], self.num_samples_per_mixture, axis=1)
    # Permute so that the means, sigmas, and weights have the desired shape (None*N, num_mixtures, num_gaussians)
    means = tf.reshape(tf.transpose(m, perm=[0, 2, 1, 3]), [-1, self.num_mixtures, self.num_gaussians])
    sigmas = tf.reshape(tf.transpose(s, perm=[0, 2, 1, 3]), [-1, self.num_mixtures, self.num_gaussians])
    weights = tf.reshape(w, [-1, self.num_gaussians])
    components = []
    for i in range(self.num_gaussians):
        component = tfp.distributions.MultivariateNormalDiag(loc=means[:, :, i], scale_diag=sigmas[:, :, i])
        components.append(component)
    cat = tfp.distributions.Categorical(probs=weights)
    mixture = tfp.distributions.Mixture(cat=cat, components=components)
    Mixture_density_loss = tf.math.reduce_mean(mixture.log_prob(self.reshaped_samples))
    return tf.math.square(tf.cast(self.beta, dtype=tf.float32)) * Mixture_density_loss
The second component is related to the relative misfit between the true and predicted health status, y_true and y_pred, respectively. It also incorporates beta as a noise-control parameter. In the code, it is implemented by the method Conditional_likelihood_term as
def Conditional_likelihood_term(self, y_true, y_pred):
    y_true = self.ColumnSliceLayer(y_true)
    y_pred = self.ColumnSliceLayer(y_pred)
    beta = self.beta
    # Repeat the true outputs to match the number of samples drawn per mixture
    Ytrue = tf.repeat(y_true[:, :, tf.newaxis], self.num_samples_per_mixture, axis=2)
    Ytrue = tf.reshape(tf.transpose(Ytrue, perm=[0, 2, 1]), [-1, y_true.shape[1]])
    # Relative misfit, with beta acting as a noise-control parameter
    Ydiff = tf.math.square((Ytrue - y_pred) / (beta * y_pred + 1e-10))
    LKLHD_Loss = 0.5 * tf.math.reduce_mean(Ydiff, axis=None)
    return tf.math.square(tf.cast(self.beta, dtype=tf.float32)) * LKLHD_Loss
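Putting the two methods together, the total loss described above can be sketched as a plain sum of both terms; this is a simplified sketch, and the exact ELBO_loss definition in MODULES/TRAINING/multivariate_models.py may differ in signs or scaling.

# Inside My_InverseForward (sketch only)
def ELBO_loss(self, y_true, y_pred):
    # Sum of the data-misfit term and the mixture log-density term
    return self.Conditional_likelihood_term(y_true, y_pred) + self.Mixture_dens_term(y_true, y_pred)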
Once trained, the code produces loss-evolution graphs for all components of the loss function and saves them at the specified path within the Results directory. A separate CSV file is written and stored at the same path within the Output directory.
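A minimal sketch of how the loss history returned by model.fit could be plotted and exported; the file names below are illustrative and not necessarily the ones used by the scripts.

import os
import pandas as pd
import matplotlib.pyplot as plt

hist = pd.DataFrame(model_history.history)  # total loss, metrics, and their val_ counterparts

# Loss-evolution graph for every logged component
fig, ax = plt.subplots()
hist.plot(ax=ax)
ax.set_xlabel('Epoch')
ax.set_ylabel('Loss')
fig.savefig(os.path.join('Results', 'loss_evolution.png'), dpi=300)

# Raw loss log stored as a CSV file
hist.to_csv(os.path.join('Output', 'loss_log.csv'), index=False)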