Information regarding Reward value

Based on the paper, the reward is `D(y') - MSE`. This is confusing: the reward should be based on how well the Generator fools the Discriminator, i.e. how close `D(y)` and `D(y')` are, rather than on the absolute value of `D(y')`. Since the discriminator outputs are not scaled, the value of `D(y')` can keep increasing. Shouldn't the reward be something like `1 / ||D(y') - D(y)||`?

Can you elaborate on this point or provide some reference for this? 
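
To make the question concrete, here is a small sketch comparing the two reward definitions. It assumes scalar, unscaled discriminator outputs; the function names, the numeric values, and the `eps` term (added only to avoid division by zero) are my own illustration and are not taken from the paper or the repository.

```python
def paper_reward(d_fake, mse):
    """Reward as I read it from the paper: D(y') - MSE."""
    return d_fake - mse


def proposed_reward(d_real, d_fake, eps=1e-8):
    """Suggested alternative: 1 / ||D(y') - D(y)||.

    eps is a hypothetical stabiliser so the reward stays finite
    when D(y') equals D(y).
    """
    return 1.0 / (abs(d_fake - d_real) + eps)


if __name__ == "__main__":
    d_real = 2.0  # made-up unscaled D(y)
    mse = 0.1     # made-up reconstruction error
    for d_fake in (1.0, 2.0, 10.0, 100.0):
        print(
            f"D(y')={d_fake:6.1f}  "
            f"paper={paper_reward(d_fake, mse):8.2f}  "
            f"proposed={proposed_reward(d_real, d_fake):.4g}"
        )
```

With these numbers the `D(y') - MSE` reward keeps growing as `D(y')` grows, even after it has moved far past `D(y)`, whereas the distance-based reward peaks when `D(y')` matches `D(y)` and falls off on either side.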



