PEARLWorker

Hi, there are two things in the code that I do not quite understand:

1. The code overwrites the action `a` in the deterministic case:
https://github.com/rlworkgroup/garage/blob/b4abe07f0fa9bac2cb70e4a3e315c2e7e5b08507/src/garage/torch/algos/pearl.py#L743-L746
There is an open pull request about this here: https://github.com/rlworkgroup/garage/pull/2275.

2. Is the context ever used in `self.agent`? As far as I can tell, `pearl.py` never uses the context of `self._policy`, and it is not used within the class `ContextConditionedPolicy` either.
https://github.com/rlworkgroup/garage/blob/b4abe07f0fa9bac2cb70e4a3e315c2e7e5b08507/src/garage/torch/algos/pearl.py#L754-L759

Any hints would be appreciated.
Thanks

	a, agent_info = self.agent.get_action(self._prev_obs)
	if self._deterministic:
	    a = agent_info['mean']
	a, agent_info = self.agent.get_action(self._prev_obs)
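
To make point 1 concrete, here is a minimal sketch of why the second `get_action` call matters. `ToyAgent`, `step_as_quoted`, and `step_without_second_call` are hypothetical stand-ins written for illustration, not garage's API; the only assumption is that the policy is stochastic and reports its mean in `agent_info['mean']`, as the quoted lines suggest.

```python
import random


class ToyAgent:
    """Hypothetical stochastic policy (illustration only, not garage's API)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def get_action(self, obs):
        # The mean is a deterministic function of the observation;
        # the returned action adds Gaussian noise on top of it.
        mean = 2.0 * obs
        action = mean + self._rng.gauss(0.0, 1.0)
        return action, {'mean': mean}


def step_as_quoted(agent, obs, deterministic):
    """Mirrors the control flow of the quoted lines."""
    a, agent_info = agent.get_action(obs)
    if deterministic:
        a = agent_info['mean']
    # The second call reassigns `a`, discarding the mean chosen above.
    a, agent_info = agent.get_action(obs)
    return a


def step_without_second_call(agent, obs, deterministic):
    """Same flow with the second call removed (assumed intent)."""
    a, agent_info = agent.get_action(obs)
    if deterministic:
        a = agent_info['mean']
    return a


quoted = step_as_quoted(ToyAgent(seed=0), 1.5, deterministic=True)
fixed = step_without_second_call(ToyAgent(seed=0), 1.5, deterministic=True)
print('as quoted:', quoted)
print('without second call:', fixed)
```

In this toy setup, the version without the second call returns the mean exactly, while the flow as quoted returns a freshly sampled noisy action even when `deterministic` is true.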

	if self._accum_context:
	    s = TimeStep.from_env_step(env_step=es,
	                               last_observation=self._prev_obs,
	                               agent_info=agent_info,
	                               episode_info=self._episode_info)
	    self.agent.update_context(s)

PEARLWorker #2310

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

PEARLWorker #2310

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions