Reward calculated for training Generator?

In `train_AREL.py`, when calculating the reward for training the generator:

```
rewards = Variable(gen_score.data - 0 * normed_seq_log_probs.data)
```
Why do you subtract `0 * normed_seq_log_probs.data`? In the commit history, I noticed that you used `0.0001 * normed_seq_log_probs.data`.

In the original paper, I think this corresponds to Eq. (9), and `normed_seq_log_probs` might be log π(W), so the coefficient should be 1. Could you explain your reasoning?
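
To make the question concrete, here is a minimal sketch of how the coefficient changes the reward. It assumes the reward has the form `score - coeff * log_prob`, as in the line quoted above. The numeric values and the helper `compute_rewards` are hypothetical and are not taken from the repository; plain Python lists stand in for the tensors.

```python
# Hypothetical per-sample values, chosen only for illustration.
gen_score = [0.8, 0.3, 0.5]                 # stands in for gen_score.data
normed_seq_log_probs = [-1.2, -0.4, -2.0]   # stands in for normed_seq_log_probs.data


def compute_rewards(scores, log_probs, coeff):
    """Return score - coeff * log_prob for each sample (hypothetical helper)."""
    return [s - coeff * lp for s, lp in zip(scores, log_probs)]


# Compare the three coefficients mentioned in this issue:
# 0 (current code), 0.0001 (commit history), and 1 (my reading of the paper).
rewards_by_coeff = {
    coeff: compute_rewards(gen_score, normed_seq_log_probs, coeff)
    for coeff in (0, 0.0001, 1)
}

for coeff, rewards in rewards_by_coeff.items():
    print(f"coeff={coeff}: {rewards}")
```

With a coefficient of 0 the log-probability term disappears and the reward is just `gen_score`, which is why I am asking whether dropping it was intentional.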


Reward calculated for training Generator? #26
