Hidden State mapping to two value nodes instead of 1 #20
Open
Description
Hi,
I'm confused about how the value head is defined in models.py. As written, it outputs two numbers instead of one: it maps the (2, 4096) final hidden state to a (2, 1) tensor for the final value. That looks like half of the hidden state is being ignored. I would have expected it to map a flattened version of the final hidden state to a single node.
As a sanity check, I looked for where this is used. On line 1114 of trainers.py, only the first value of this (2, 1) tensor is taken.
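To make the shapes concrete, here is a minimal sketch of the two mappings I'm comparing. It is not the code from models.py or trainers.py: `linear`, `hidden_state`, and the random weights are hypothetical stand-ins, and it uses plain Python lists rather than tensors. It only illustrates the shape difference between applying a one-output linear map to each row and applying it to the flattened state, and makes no assumption about what the leading dimension of 2 represents.

```python
import random

HIDDEN = 4096  # hidden size quoted in this issue
ROWS = 2       # leading dimension of the final hidden state quoted in this issue

random.seed(0)

# Hypothetical stand-in for the (2, 4096) final hidden state.
hidden_state = [[random.gauss(0.0, 1.0) for _ in range(HIDDEN)] for _ in range(ROWS)]


def linear(rows, weight, bias):
    """Apply y = x . w + b to each row independently (one output per row)."""
    return [[sum(x * w for x, w in zip(row, weight)) + bias] for row in rows]


# Variant A: one weight vector of length HIDDEN, applied per row -> shape (2, 1).
w_per_row = [random.gauss(0.0, 0.01) for _ in range(HIDDEN)]
per_row_values = linear(hidden_state, w_per_row, 0.0)

# Variant B: flatten to a single row of length 2 * HIDDEN first -> shape (1, 1).
flat = [[x for row in hidden_state for x in row]]
w_flat = [random.gauss(0.0, 0.01) for _ in range(ROWS * HIDDEN)]
flat_value = linear(flat, w_flat, 0.0)

shape_a = (len(per_row_values), len(per_row_values[0]))
shape_b = (len(flat_value), len(flat_value[0]))
print(shape_a, shape_b)  # (2, 1) (1, 1)

# Keeping only the first entry of Variant A, as the trainer appears to do.
first_value = per_row_values[0][0]
```

Variant A is what I believe the current value head does, and Variant B is what I expected.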
Can you tell me why you made this design choice? I suspect I'm misinterpreting something here.