
recalculation of reward after action update #1

@soldierofhell

Hi @homangab,
Thanks for the effort to boost the performance of CEM optimization in the right (gradient) direction.
However, reading the code, I don't see the rewards being updated after the optimizer updates the actions (it should probably happen somewhere around here).

returns = self.env.rollout(actions)
Shouldn't this be computed once again after the update, so that the returns correspond to the new actions? Am I wrong?
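
To illustrate the concern, here is a minimal sketch with a toy reward. It is not the repository's code: `rollout`, `gradient_step`, the quadratic reward, and the learning rate are all made up for illustration. The point is only that returns computed before the update no longer describe the updated actions.

```python
# Hypothetical stand-in for env.rollout: one return per action sequence.
# The toy reward is higher the closer every action is to 1.0.
def rollout(actions):
    return [-sum((a - 1.0) ** 2 for a in seq) for seq in actions]


# Hypothetical stand-in for the optimizer step: gradient ascent on the toy
# reward, using the analytic gradient d/da [-(a - 1)^2] = -2 * (a - 1).
def gradient_step(actions, lr=0.1):
    return [[a + lr * (-2.0 * (a - 1.0)) for a in seq] for seq in actions]


actions = [[0.0, 0.0], [2.0, 3.0], [0.5, 1.5]]

stale_returns = rollout(actions)   # evaluated before the update
actions = gradient_step(actions)   # the optimizer changes the actions
fresh_returns = rollout(actions)   # re-evaluated on the updated actions

for stale, fresh in zip(stale_returns, fresh_returns):
    print(f"stale={stale:.3f}  fresh={fresh:.3f}")
```

If the CEM step ranks candidates by the stale values, it is scoring actions that no longer exist. In this toy case the ordering happens to be preserved, but in general there is no such guarantee.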

Regards,
