Information regarding Reward value #6

@trideeprath

Based on the paper, the reward is D(y') - MSE. This is confusing: the reward should reflect how well the Generator fools the Discriminator, i.e. how close D(y) and D(y') are, rather than the absolute value of D(y'). Because the discriminator outputs are not scaled, D(y') can keep on increasing. Shouldn't the reward be something like 1 / ||D(y') - D(y)||?

Can you elaborate on this point or provide some reference for this?
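
To make the question concrete, here is a minimal sketch comparing the two reward shapes. It assumes the discriminator returns a single scalar score, so the norm reduces to an absolute value. The function names, the `eps` term, and the sample scores are hypothetical and are not taken from the paper or this repository.

```python
def paper_reward(d_fake, mse):
    """Reward as I read it from the paper: D(y') - MSE."""
    return d_fake - mse


def proposed_reward(d_fake, d_real, eps=1e-8):
    """Alternative raised above: 1 / |D(y') - D(y)|.

    eps is an added assumption to avoid division by zero
    when the two scores coincide.
    """
    return 1.0 / (abs(d_fake - d_real) + eps)


# Hypothetical unscaled discriminator scores, for illustration only.
d_real = 2.0
mse = 0.5
for d_fake in (1.0, 2.0, 10.0):
    print(d_fake, paper_reward(d_fake, mse), proposed_reward(d_fake, d_real))
```

With these made-up numbers, the first form keeps growing as D(y') grows, even when D(y') is far above D(y), while the second form peaks when the two scores match.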
