
GMM_Autoencoder

Deep Neural Network with probabilistic description of damage condition space via Gaussian Mixture Model

Deep Gaussian Mixture approach for uncertainty quantification in FOWT damage estimates

This code quantifies the uncertainty of mooring line fault detection predictions using a multivariate Gaussian mixture variational autoencoder. The scripts included in this repository cover everything from data preprocessing (cleaning, scaling, and dataset splitting) to DNN training and results postprocessing.

Getting Started

These instructions will get you a copy of the project's latest stable version up and running on your local machine for development and testing purposes.

Prerequisites

The following Python libraries are needed to execute the code:

  • TensorFlow 2.13.0
  • Keras 2.13.0
  • NumPy
  • Pandas
  • Matplotlib
  • TensorFlow Probability

Repository Structure

In this section, we present the files and directories of the repository.

  • multivariate_main.py: Main executable file, used to train only the inverse operator
  • forward_included_main.py: Main executable file, used to train both the forward and inverse operators
  • DATA: Directory where the employed datasets can be found
  • MODULES: Directory where the preprocessing, training, and postprocessing scripts can be found
  • MODULES/PREPROCESSING/preprocessing_tools.py: Script containing the main preprocessing functions
  • MODULES/TRAINING/multivariate_arch.py: Script containing the inverse, sampling, and forward operator architectures
  • MODULES/TRAINING/multivariate_models.py: Script defining the structure of the MGMVAE
  • OUTPUT: Directory where the outputs of the training process are stored (models, weights, loss logs, etc.)
  • RESULTS: Directory where graphs are stored (prediction accuracies, loss evolutions, etc.)

The main executable file is multivariate_main.py. In this file, we read and preprocess the dataset using

loc = os.path.join('Data', 'Data_nicoWT', 'joint.csv')
u_train, u_val, u_test, r_train, r_val, r_test, p_train, p_val, p_test = read_data_previous(loc, dt, debug=False)

The function read_data_previous(loc, dt, debug) comes from the script located in MODULES/PREPROCESSING/preprocessing_tools.py. This function reads the dataset, drops the features we discard, and shuffles and splits the data into training, validation, and testing datasets. Here we distinguish the statistical inputs extracted from the platform's response, u, from the environmental conditions, r, and the target damage coefficients, p.
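
The exact implementation lives in preprocessing_tools.py; purely for orientation, a minimal sketch of what such a routine does is shown below. The column selection, the min-max scaling, and the 70/15/15 split are illustrative assumptions, not the repository's actual choices.

import numpy as np
import pandas as pd

def read_data_sketch(loc, val_frac=0.15, test_frac=0.15, seed=0):
    # Illustrative stand-in for read_data_previous: read, select columns, scale, shuffle, split
    df = pd.read_csv(loc)
    # Hypothetical column groups; the real selection is defined in preprocessing_tools.py
    u = df.filter(like='u_').to_numpy()    # response statistics
    r = df.filter(like='env_').to_numpy()  # environmental conditions
    p = df.filter(like='p_').to_numpy()    # damage coefficients
    scale = lambda x: (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0) + 1e-12)
    u, r, p = scale(u), scale(r), scale(p)
    # Shuffle rows and split into training, validation, and testing sets
    idx = np.random.default_rng(seed).permutation(len(df))
    n_test, n_val = int(test_frac*len(idx)), int(val_frac*len(idx))
    te, va, tr = idx[:n_test], idx[n_test:n_test+n_val], idx[n_test+n_val:]
    return u[tr], u[va], u[te], r[tr], r[va], r[te], p[tr], p[va], p[te]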

Then, we define the number of mixtures, Gaussians per mixture, and samples per mixture as

num_mixtures = p_train.shape[1] 
num_gaussians_ = np.array([5])
num_samples_per_mixture_ = np.array([10])

We then define the hyperparameters needed to train our inverse operator as

input_dim_decoder = p_train.shape[1] + r_train.shape[1]
output_dim = u_train.shape[1]
loss_name = ['ELBO_loss']
inverse_LR = 1e-06
n_epochs_enc = 1000
betas = [0.033]

A loop then starts to train the inverse operator. The loop is executed once for every combination of values of beta, num_mixtures, num_gaussians, and num_samples_per_mixture (currently only one training run takes place, since a single value is specified for each of these variables). The loop first loads a previously saved forward model as

forward_path = os.path.join("Output", '23Apr_2Prop_Forward')
model_forward = tf.saved_model.load(os.path.join(forward_path,'model_forward_2mix'))
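
For orientation, the overall loop structure can be sketched as follows; itertools.product is an illustrative choice, and the snippet above sits at the top of the loop body.

import itertools
import os
import tensorflow as tf

for beta, num_gaussians, num_samples_per_mixture in itertools.product(
        betas, num_gaussians_, num_samples_per_mixture_):
    # Load the pretrained forward operator used as the decoder of the autoencoder
    forward_path = os.path.join("Output", '23Apr_2Prop_Forward')
    model_forward = tf.saved_model.load(os.path.join(forward_path, 'model_forward_2mix'))
    # ... build My_InverseForward, compile, fit, and save the results (see below) ...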

Then, the complete model is built using the custom class My_InverseForward.

model = My_InverseForward(input_dim_encoder, input_dim_decoder, output_dim, num_mixtures, num_gaussians, num_samples_per_mixture, model_forward, selected_features, beta)

The inputs to the model are the following; a short sketch after the list shows how they can be assembled from the dataset shapes:

  • input_dim_encoder: input dimension of the encoder. Set equal to u_train[:,selected_features].shape[1] + r_train.shape[1]. It filters the original u shape to allow for its training using only the degrees of freedom (DOFs) of interest.
  • input_dim_decoder: input dimension of the decoder. Set equal to p_train.shape[1] + r_train.shape[1].
  • output_dim: output dimension of the complete autoencoder. Set equal to u_train.shape[1]. It does not matter if the forward operator is trained using all DOFs or the filtered ones, as it will be able to reconstruct the complete set of features based on the estimated latent space.
  • num_mixtures: number of mixtures.
  • num_gaussians: number of Gaussian components in each mixture.
  • num_samples_per_mixture: number of samples extracted from each mixture. These samples are all passed to the decoder.
  • model_forward: a previously saved trained forward model.
  • selected_features: array of indices corresponding to the features of the DOFs of interest.
  • beta: parameter used for noise control
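
A minimal sketch of how these arguments can be assembled from the dataset shapes; the selected_features indices are illustrative, while beta, num_gaussians, and num_samples_per_mixture come from the training loop:

import numpy as np

selected_features = np.array([0, 1, 2])  # illustrative indices of the DOFs of interest
input_dim_encoder = u_train[:, selected_features].shape[1] + r_train.shape[1]
input_dim_decoder = p_train.shape[1] + r_train.shape[1]
output_dim = u_train.shape[1]

model = My_InverseForward(input_dim_encoder, input_dim_decoder, output_dim,
                          num_mixtures, num_gaussians, num_samples_per_mixture,
                          model_forward, selected_features, beta)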

The model is built using two separate modules, namely MODULES/TRAINING/multivariate_arch.py for the model's architecture, divided into encoder, sampling layer, and decoder; and MODULES/TRAINING/multivariate_models.py to define the model's structure and loss function. The structure of the model is specified in the definition of the My_InverseForward class as

  • self.Encoder_model: encoder model. Defined by a custom class Inverse_GMM_Model. This model, when called, concatenates its given inputs and builds the architecture Fully_connected_enc_gmm, defined in MODULES/TRAINING/multivariate_arch.py as
import tensorflow as tf
from tensorflow.keras import layers

def Fully_connected_enc_gmm(input_dim, num_mixtures, num_gaussians):
    input1 = tf.keras.Input(shape=(input_dim,), name='Innnputlayer')
    # Intermediate fully connected layers
    lay1 = layers.Dense(100, activation='relu', name='lay1')(input1)
    lay2 = layers.Dense(150, activation='relu', kernel_initializer="he_uniform", bias_initializer="zeros", name='lay2')(lay1)
    lay2 = layers.Dense(200, activation='tanh')(lay2)
    lay3 = layers.Dense(150, activation='relu', kernel_initializer="he_uniform", bias_initializer="zeros", name='lay3')(lay2)
    lay4 = layers.Dense(100, activation='tanh', name='lay4')(lay3)
    # Output heads: GMM means, standard deviations, and mixture weights
    means = layers.Dense(num_mixtures*num_gaussians, activation='sigmoid', name='means')(lay4)
    sigmas = layers.Dense(num_mixtures*num_gaussians, activation='softplus', name='sigmas')(lay4)
    weights = layers.Dense(num_gaussians, activation='softmax', name='weights')(lay4)
    outputs = tf.concat([means, sigmas, weights], axis=1)
    return tf.keras.Model(inputs=input1, outputs=outputs)
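
As a quick sanity check of the encoder output layout (means first, then standard deviations, then mixture weights), its dimension is 2*num_mixtures*num_gaussians + num_gaussians; the input dimension below is illustrative.

enc = Fully_connected_enc_gmm(input_dim=8, num_mixtures=2, num_gaussians=5)
print(enc.output_shape)  # (None, 2*2*5 + 5) = (None, 25)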

The He-uniform kernel initializers follow the weight initialization strategy presented by He et al. The encoder returns a single array of GMM properties (means, standard deviations, and mixture weights), GMM_props, which is passed to the sampling layer. The sampling layer is defined using the custom layer class GMMSamplingLayer as

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

class GMMSamplingLayer(tf.keras.layers.Layer):
    def __init__(self, num_mixtures, num_gaussians, num_samples_per_mixture, **kwargs):
        super(GMMSamplingLayer, self).__init__(**kwargs)
        self.num_mixtures = num_mixtures
        self.num_gaussians = num_gaussians
        self.num_samples_per_mixture = num_samples_per_mixture

    def call(self, inputs):
        means, sigmas, weights = inputs
        # Reshaping required to accommodate the batch size (None, num_mixtures*num_gaussians) during training
        means = tf.reshape(means, (-1, self.num_mixtures, self.num_gaussians))
        sigmas = tf.reshape(sigmas, (-1, self.num_mixtures, self.num_gaussians))
        weights = tf.reshape(weights, (-1, self.num_gaussians))

        # One multivariate Gaussian component per column of means and sigmas
        components = []
        for i in range(self.num_gaussians):
            component = tfd.MultivariateNormalDiag(loc=means[:, :, i], scale_diag=sigmas[:, :, i])
            components.append(component)

        # Combine the components into a mixture with categorical weights and sample from it
        cat = tfd.Categorical(probs=weights)
        q = tfd.Mixture(cat=cat, components=components)
        samples = q.sample(self.num_samples_per_mixture)  # (num_samples_per_mixture, batch, num_mixtures)
        s1 = tf.transpose(samples, perm=[1, 0, 2])        # (batch, num_samples_per_mixture, num_mixtures)
        self.reshaped_samples = tf.reshape(s1, (-1, self.num_mixtures))
        return self.reshaped_samples

When called, GMMSamplingLayer reshapes its inputs means, sigmas, and weights to (None, num_mixtures, num_gaussians) for the former two and (None, num_gaussians) for the latter. It then builds a Gaussian mixture distribution: one multivariate normal component is defined from each column of means and sigmas, and the components are combined into q = tfd.Mixture with the corresponding categorical probabilities weights. Finally, it draws num_samples_per_mixture samples from q and reshapes them to (None, num_mixtures), returning reshaped_samples, which is passed to the decoder.
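
A standalone shape check of the sampling layer, with illustrative sizes (2 mixtures, 5 Gaussians, 10 samples per mixture):

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

batch, M, G, N = 4, 2, 5, 10
means = tf.random.uniform((batch, M*G))
sigmas = tf.nn.softplus(tf.random.normal((batch, M*G)))
weights = tf.nn.softmax(tf.random.normal((batch, G)), axis=-1)

sampling_layer = GMMSamplingLayer(num_mixtures=M, num_gaussians=G, num_samples_per_mixture=N)
z = sampling_layer([means, sigmas, weights])
print(z.shape)  # (batch * N, M) = (40, 2)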

The decoder is simply the previously trained and saved model_forward, which takes the reshaped samples and reconstructs the array of response statistics u.

Back in multivariate_main.py, the model is compiled with the Adam optimizer, the specified learning rate inverse_LR, and the loss function ELBO_loss, and then fitted as

model.compile(optimizer = optimizers.Adam(learning_rate = inverse_LR),
              loss = model.ELBO_loss,
              metrics = [model.Conditional_likelihood_term, model.Mixture_dens_term])
model_history = model.fit(x = [u_train, r_train, p_train],
                          y = u_train,
                          batch_size = 1024,
                          epochs = n_epochs_enc,
                          shuffle = True,
                          validation_data = ([u_val, r_val, p_val], u_val))

The loss function ELBO_loss is defined as the sum of two components, based on the findings by Goh et al. (2021) and Rodríguez et al. (2023). The first component, defined in My_InverseForward as the method Mixture_dens_term, estimates the log-probability of obtaining the samples reshaped_samples under the distribution defined by the fitted GMM properties. The method returns the mean of the obtained log-probabilities, scaled by beta squared. In the code, this component is defined as

def Mixture_dens_term(self, y_true, y_pred):
    GMM_props = self.GMM_props
    means, sigmas, weights = tf.split(GMM_props, [self.num_mixtures * self.num_gaussians,
                                                    self.num_mixtures * self.num_gaussians,
                                                     self.num_gaussians], axis=-1)
    #Reshaping required to accommodate the batch size (None, num_mixtures*num_gaussians) during training
    means = tf.reshape(means, (-1, self.num_mixtures, self.num_gaussians))
    sigmas = tf.reshape(sigmas, (-1, self.num_mixtures, self.num_gaussians))
    weights = tf.reshape(weights, (-1, self.num_gaussians))
    weights = tf.clip_by_value(weights, 1e-6, 1.0)  # avoid zero mixture weights
    ## Repeat the means, sigmas, and weights to match the number of samples per mixture, N
    m = tf.repeat(means[:,:,tf.newaxis], self.num_samples_per_mixture, axis = 2)
    s = tf.repeat(sigmas[:,:,tf.newaxis], self.num_samples_per_mixture, axis = 2)
    w = tf.repeat(weights[:,tf.newaxis], self.num_samples_per_mixture, axis = 1)
    ## Permute and reshape so that means and sigmas have shape (None*N, num_mixtures, num_gaussians) and weights (None*N, num_gaussians)
    means = tf.reshape(tf.transpose(m, perm = [0,2,1,3]), [-1, self.num_mixtures, self.num_gaussians])
    sigmas = tf.reshape(tf.transpose(s, perm = [0,2,1,3]), [-1, self.num_mixtures, self.num_gaussians])
    weights = tf.reshape(w, [-1, self.num_gaussians])
    
    components = []
    for i in range(self.num_gaussians):
        component = tfp.distributions.MultivariateNormalDiag(loc=means[:, :, i], scale_diag=sigmas[:, :, i])
        components.append(component)

    cat = tfp.distributions.Categorical(probs=weights)
    mixture = tfp.distributions.Mixture(cat=cat, components=components)
   
    Mixture_density_loss = tf.math.reduce_mean(mixture.log_prob(self.reshaped_samples))
    return tf.math.square(tf.cast(self.beta, dtype=tf.float32))* Mixture_density_loss

The second component measures the relative misfit between the true and reconstructed response statistics, y_true and y_pred, respectively. It also incorporates beta as a noise-control parameter. In the code, it is defined in the method Conditional_likelihood_term as

def Conditional_likelihood_term(self, y_true, y_pred):
    # Keep only the selected degrees of freedom in both the true and reconstructed responses
    y_true = self.ColumnSliceLayer(y_true)
    y_pred = self.ColumnSliceLayer(y_pred)
    beta = self.beta
    # Repeat y_true so that each of the num_samples_per_mixture reconstructions is compared with its target
    Ytrue = tf.repeat(y_true[:,:,tf.newaxis], self.num_samples_per_mixture, axis = 2)
    Ytrue = tf.reshape(tf.transpose(Ytrue, perm = [0,2,1]), [-1, y_true.shape[1]])
    # Relative squared misfit, with beta acting as a noise-control parameter
    Ydiff = tf.math.square((Ytrue - y_pred)/(beta*y_pred + 1e-10))
    LKLHD_Loss = 0.5*tf.math.reduce_mean(Ydiff, axis = None)
    return tf.math.square(tf.cast(self.beta, dtype=tf.float32))*LKLHD_Loss
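
Reading the two methods together with the statement above that ELBO_loss is their sum, the objective minimized during training can be summarized (in pseudo-notation, directly off the code) as

ELBO_loss = Conditional_likelihood_term + Mixture_dens_term
Conditional_likelihood_term = beta^2 * 0.5 * mean( ((y_true - y_pred) / (beta*y_pred + 1e-10))^2 )
Mixture_dens_term           = beta^2 * mean( log q_GMM(reshaped_samples) )

where the means are taken over the batch and over the num_samples_per_mixture samples, and q_GMM denotes the mixture distribution built from the fitted means, sigmas, and weights.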

Once trained, the code produces loss evolution graphs for all components of the loss function and saves them at the specified path within the RESULTS directory. A CSV file with the corresponding loss logs is written and stored at the same path within the OUTPUT directory.
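
Although not covered above, the trained model can be evaluated on the held-out test split in the usual Keras way; the calls below mirror the fit signature and are only a sketch (inspecting the encoder's GMM parameters directly would require calling model.Encoder_model, whose exact call signature is defined in multivariate_models.py).

# Reconstruction error and loss metrics on the test set (sketch)
test_metrics = model.evaluate([u_test, r_test, p_test], u_test, batch_size=1024)
u_test_rec = model.predict([u_test, r_test, p_test], batch_size=1024)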
