A question about the architecture of DQNAgent #473

tazhu2023 · 2024-08-02T08:06:38Z

Regarding the architecture of DQNAgent: the last layer uses a "softmax" activation. Is that meant to output a probability distribution over the actions?
The layer before it uses "sigmoid". Why is the architecture built like this? Normally a "sigmoid" layer is not required before a "softmax", is it?
    
    def _build_policy_network(self):
        network = tf.keras.Sequential([
            tf.keras.layers.InputLayer(input_shape=self.observation_shape),
            tf.keras.layers.Conv1D(filters=64, kernel_size=6, padding="same", activation="tanh"),
            tf.keras.layers.MaxPooling1D(pool_size=2),
            tf.keras.layers.Conv1D(filters=32, kernel_size=3, padding="same", activation="tanh"),
            tf.keras.layers.MaxPooling1D(pool_size=2),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(self.n_actions, activation="sigmoid"),
            tf.keras.layers.Dense(self.n_actions, activation="softmax")
        ])

        return network
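
To make the question concrete, here is a minimal pure-Python sketch of what the last two Dense layers compute. It is only an illustration: the feature vector, weights, and biases are made-up values, and `dense`, `elementwise`, `sigmoid`, and `softmax` are hypothetical helpers, not part of the library. It assumes the standard Keras `Dense` behaviour of `activation(inputs @ kernel + bias)`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def elementwise(fn):
    # Turn a scalar activation into one that maps over a list.
    return lambda values: [fn(v) for v in values]

def dense(inputs, weights, biases, activation):
    # weights[j][i] is the weight from input i to output unit j.
    pre = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return activation(pre)

# All numbers below are made up for illustration.
n_actions = 3
features = [0.5, -1.2, 2.0, 0.3]          # stand-in for the Flatten() output
w1 = [[0.2, -0.1, 0.4, 0.0],
      [-0.3, 0.5, 0.1, 0.2],
      [0.1, 0.1, -0.2, 0.6]]
b1 = [0.0, 0.1, -0.1]
w2 = [[1.0, -0.5, 0.2],
      [0.3, 0.8, -0.4],
      [-0.6, 0.1, 0.9]]
b2 = [0.0, 0.0, 0.0]

# Dense(n_actions, activation="sigmoid"): a hidden layer with n_actions units.
hidden = dense(features, w1, b1, elementwise(sigmoid))
# Dense(n_actions, activation="softmax"): its own weights, then softmax.
probs = dense(hidden, w2, b2, softmax)

print(hidden)
print(probs)
```

In this sketch the "sigmoid" layer acts as a hidden layer whose width happens to equal `n_actions`, with each output squashed into (0, 1). The softmax is not applied to those values directly. The final layer first multiplies them by its own weights, and the result is a vector of `n_actions` values that sums to 1.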

Replies: 0 comments
