Dropout is a technique that prevents overfitting in neural networks by randomly dropping nodes of a layer during training. As a result, the trained model behaves like an ensemble of multiple neural networks.
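The mechanism can be sketched without Keras. Below is a minimal NumPy implementation of "inverted" dropout, assuming the common convention of scaling the surviving units by `1 / (1 - rate)` so the expected activation is the same during training and inference (this is a sketch, not the Keras internals):

```python
import numpy as np

def dropout_forward(x, rate=0.5, training=True, seed=0):
    """Inverted dropout: during training, zero a fraction `rate` of the
    units and scale the survivors by 1/(1-rate); at inference, pass the
    input through unchanged."""
    if not training or rate == 0.0:
        return x
    rng = np.random.default_rng(seed)
    mask = rng.random(x.shape) >= rate  # keep a unit with prob. 1-rate
    return np.where(mask, x / (1.0 - rate), 0.0)
```

Because of the rescaling, the mean activation stays roughly constant: with `rate=0.5`, surviving units become `2.0` and the rest become `0.0`, so the average over many units remains close to the original.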
From previous experiments we got the best values: a learning rate of 0.001 and an inner layer size of 100. We'll use these values in the next experiment, along with different dropout rates:
```python
# Function to define the model, adding a new dense layer and dropout
def make_model(learning_rate=0.01, size_inner=100, droprate=0.5):
    base_model = Xception(weights='imagenet',
                          include_top=False,
                          input_shape=(150, 150, 3))
    base_model.trainable = False

    #########################################

    inputs = keras.Input(shape=(150, 150, 3))
    base = base_model(inputs, training=False)
    vectors = keras.layers.GlobalAveragePooling2D()(base)
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors)
    drop = keras.layers.Dropout(droprate)(inner)  # add dropout layer
    outputs = keras.layers.Dense(10)(drop)
    model = keras.Model(inputs, outputs)

    #########################################

    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    loss = keras.losses.CategoricalCrossentropy(from_logits=True)

    # Compile the model
    model.compile(optimizer=optimizer,
                  loss=loss,
                  metrics=['accuracy'])

    return model
```
```python
# Create a checkpoint callback to save the best model for version 3
filepath = './xception_v3_{epoch:02d}_{val_accuracy:.3f}.h5'
checkpoint = keras.callbacks.ModelCheckpoint(filepath=filepath,
                                             save_best_only=True,
                                             monitor='val_accuracy',
                                             mode='max')
```
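The `filepath` passed to `ModelCheckpoint` is a Python format template: Keras fills in the epoch number and the monitored metric each time it saves a checkpoint. The resulting file name can be previewed with plain `str.format` (the epoch and accuracy values below are just illustrative):

```python
filepath = './xception_v3_{epoch:02d}_{val_accuracy:.3f}.h5'

# Keras substitutes the epoch and the monitored metric into the
# template; the same substitution with plain str.format:
name = filepath.format(epoch=5, val_accuracy=0.8731)
print(name)  # → ./xception_v3_05_0.873.h5
```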
```python
# Set the best values of learning rate and inner layer size
# based on previous experiments
learning_rate = 0.001
size = 100

# Dict to store the training history for each dropout rate
scores = {}

# List of dropout rates to try
droprates = [0.0, 0.2, 0.5, 0.8]

for droprate in droprates:
    print(droprate)

    model = make_model(learning_rate=learning_rate,
                       size_inner=size,
                       droprate=droprate)

    # Train for longer (epochs=30) because of dropout regularization
    history = model.fit(train_ds, epochs=30, validation_data=val_ds,
                        callbacks=[checkpoint])
    scores[droprate] = history.history

    print()
    print()
```

Note: Because we introduce dropout in the neural network, we need to train the model for longer; hence the number of epochs is set to 30.
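Once the loop finishes, `scores` maps each dropout rate to its training history. One way the best rate could be picked is to average the validation accuracy over the last few epochs, which smooths out epoch-to-epoch noise. A sketch, using made-up histories rather than results from the actual run:

```python
def best_droprate(scores, last_n=5):
    """Return the dropout rate whose validation accuracy,
    averaged over the last `last_n` epochs, is highest."""
    def avg_tail(history):
        accs = history['val_accuracy'][-last_n:]
        return sum(accs) / len(accs)
    return max(scores, key=lambda dr: avg_tail(scores[dr]))

# Illustrative, made-up histories (not real experiment output):
fake_scores = {
    0.0: {'val_accuracy': [0.80, 0.82, 0.81, 0.82, 0.81]},
    0.2: {'val_accuracy': [0.81, 0.83, 0.84, 0.84, 0.85]},
    0.5: {'val_accuracy': [0.79, 0.80, 0.81, 0.80, 0.81]},
}
print(best_droprate(fake_scores))  # → 0.2
```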
Classes, functions, attributes:

- `tf.keras.layers.Dropout()`: dropout layer that randomly sets input units (i.e., nodes) to 0 with a frequency of `rate` at each step during training
- `rate`: argument that sets the fraction of the input units to drop; a float between 0 and 1
Add notes from the video (PRs are welcome)
- A neural network might learn false patterns: if it repeatedly sees a certain logo on a t-shirt, it might learn that the logo defines the t-shirt, which is wrong since the same logo might also appear on a hoodie.
- One way to avoid this is to hide parts of the image from the network during training, so it cannot rely on any single feature.
- Dropout achieves a similar effect by randomly "freezing" (dropping) part of the network's nodes at each training iteration.
- We compare model performance while changing the dropout rate, keeping the other regularization settings fixed.
> The notes are written by the community. If you see an error here, please create a PR with a fix.