Status: Open
Labels: question (further information is requested)
Description
Question
As far as I can tell, rewards are managed in the following manner:
1. `_cumulative_rewards[agent]` is returned by `env.last()` (along with `observation`, `termination`, `truncation`, and `info`).
2. The policy chooses an `action`, which is then executed by `env.step(action)`.
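Concretely, the two steps above form the standard AEC driver loop. Here is a runnable sketch of that cycle; `ToyEnv` and `policy` are hypothetical stand-ins I wrote just to make the `env.last()`/`env.step()` shape executable, not PettingZoo's actual classes:

```python
class ToyEnv:
    """Hypothetical two-agent stand-in exposing a PettingZoo-style AEC API."""

    def __init__(self, max_steps=4):
        self.agents = ["agent_0", "agent_1"]
        self._cumulative_rewards = {a: 0.0 for a in self.agents}
        self._steps, self._max_steps = 0, max_steps

    def agent_iter(self):
        # Yield the acting agent until the episode's step budget runs out.
        while self._steps < self._max_steps:
            yield self.agents[self._steps % len(self.agents)]

    def last(self):
        # Returns (observation, cumulative reward, termination, truncation, info)
        agent = self.agents[self._steps % len(self.agents)]
        terminated = self._steps >= self._max_steps - len(self.agents)
        return None, self._cumulative_rewards[agent], terminated, False, {}

    def step(self, action):
        agent = self.agents[self._steps % len(self.agents)]
        self._cumulative_rewards[agent] = 0.0  # reward was consumed via last()
        self._steps += 1


def policy(observation):
    return 0  # hypothetical placeholder policy


env = ToyEnv()
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    action = None if termination or truncation else policy(observation)
    env.step(action)
```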
I know that rewards are more complicated than in a typical RL environment, because the reward for `agent_0` should in some circumstances be adjusted during the turn of `agent_1`, for example. IIUC, `_cumulative_rewards` is used to account for this.
Within `step` (in 2. above), the following occurs:
1. We set `self._cumulative_rewards[agent] = 0`, because the policy for `agent` has already received this reward from `env.last()` and processed it while choosing its next `action`.
2. The dictionary `self.rewards` is updated according to the consequences of `action`.
3. We then call `self._accumulate_rewards()` to update the `self._cumulative_rewards` dictionary. From the code, this straightforwardly increments `self._cumulative_rewards` with the values in `self.rewards`.
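The accumulation step can be sketched in a few lines. This is my own minimal version of what `_accumulate_rewards()` appears to do, with the dictionary names taken from the issue:

```python
def accumulate_rewards(cumulative_rewards, rewards):
    """Minimal sketch of _accumulate_rewards(): fold the per-step
    rewards dict into the running cumulative totals."""
    for agent, reward in rewards.items():
        cumulative_rewards[agent] += reward


cumulative = {"agent_0": 0.0, "agent_1": 0.0}
accumulate_rewards(cumulative, {"agent_0": 1.0, "agent_1": -1.0})
# cumulative is now {"agent_0": 1.0, "agent_1": -1.0}
```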
I have two questions:
1. Why is `self.rewards` even needed? Why not just directly adjust `self._cumulative_rewards` to incorporate the consequences of `action`?
2. Am I correct in thinking that `self.rewards` should be set to 0 for all agents after every call to `self._accumulate_rewards()`? My reasoning is that otherwise, the values in `rewards` will be added to `_cumulative_rewards` multiple times, which is undesirable. If this is the case, why isn't this functionality built into `self._accumulate_rewards()`?
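To make the double-counting concern in question 2 concrete, here is a toy run (my own sketch, reusing the dict names from the issue): if the per-step rewards are not zeroed between accumulation calls, the same values are added again.

```python
def accumulate(cumulative, rewards):
    # Same folding step as described for _accumulate_rewards().
    for agent, reward in rewards.items():
        cumulative[agent] += reward


def clear(rewards):
    # Zero the per-step rewards, analogous to clearing self.rewards.
    for agent in rewards:
        rewards[agent] = 0.0


# Without clearing: the same per-step reward is counted twice.
cumulative = {"agent_0": 0.0}
rewards = {"agent_0": 1.0}
accumulate(cumulative, rewards)  # cumulative["agent_0"] == 1.0
accumulate(cumulative, rewards)  # double-counted: now 2.0

# With clearing: the reward is counted exactly once.
cumulative = {"agent_0": 0.0}
rewards = {"agent_0": 1.0}
accumulate(cumulative, rewards)
clear(rewards)                   # prevents re-adding the same reward
accumulate(cumulative, rewards)  # cumulative["agent_0"] stays 1.0
```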