Hi,
thanks for your reply.
I got prioritized experience replay (PER) working on my problem now.
In case somebody runs into the same problems: the first one was a technical problem.

  • In my code, all the Q-value tensors/TD errors were of matrix shape [batch_size, 1].
    The importance sampling weights from the library are of vector shape [batch_size].
    When I multiplied them with the loss tensor, PyTorch broadcast the result to a matrix of shape [batch_size, batch_size]. I didn't notice, because taking the mean still works and returns a single number when given a matrix.
    To fix this, I simply had to unsqueeze the weights after reading them:
    replay_weights = torch.unsqueeze(torch.from_numpy(batch1['weights']).to(device), 1)

  • The…
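
For anyone who wants to reproduce the shape problem from the first bullet, here is a minimal sketch. The tensor names and values (td_loss, weights, batch size 4) are made up for illustration and are not taken from my actual training code or from the library; only the shapes match what is described above.

```python
import torch

batch_size = 4

# Hypothetical stand-ins: a per-sample loss of shape [batch_size, 1] and
# importance sampling weights of shape [batch_size].
td_loss = torch.arange(1.0, batch_size + 1).unsqueeze(1)  # [[1.], [2.], [3.], [4.]]
weights = torch.tensor([1.0, 0.5, 0.25, 0.125])

# Bug: [4, 1] * [4] broadcasts to [4, 4], so every loss is multiplied
# by every weight instead of only its own.
wrong = td_loss * weights

# Fix: give the weights a trailing dimension so the shapes match element-wise.
right = td_loss * weights.unsqueeze(1)

# Both means are scalars, which is why the bug is easy to miss.
wrong_mean = wrong.mean()
right_mean = right.mean()

print(wrong.shape, right.shape)
print(wrong_mean.item(), right_mean.item())
```

A cheap guard against this is an explicit shape check such as `assert td_loss.shape == replay_weights.shape` right before the multiplication, so a mismatch fails loudly instead of broadcasting silently.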

Answer selected by ymd-h