Open
Description
Hi, I have two questions about parts of the code whose purpose I do not quite understand:
- The code overwrites the action `a` in the deterministic case:
garage/src/garage/torch/algos/pearl.py
Lines 743 to 746 in b4abe07
    a, agent_info = self.agent.get_action(self._prev_obs)
    if self._deterministic:
        a = agent_info['mean']
    a, agent_info = self.agent.get_action(self._prev_obs)
There is an open pull request about this: Fix double action sampling in PEARLWorker #2275.
- I was wondering whether the context is ever used in `self.agent`. As far as I understand, in the `pearl.py` file we never use the context of `self._policy`, and it is also not used within the class `ContextConditionedPolicy`.
garage/src/garage/torch/algos/pearl.py
Lines 754 to 759 in b4abe07
    if self._accum_context:
        s = TimeStep.from_env_step(env_step=es,
                                   last_observation=self._prev_obs,
                                   agent_info=agent_info,
                                   episode_info=self._episode_info)
        self.agent.update_context(s)
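To make the first point concrete, here is a minimal sketch of the statement order in the first quoted snippet. `StubPolicy`, `step_as_quoted`, and `step_single_sample` are hypothetical names made up for illustration; they are not garage classes or functions, and the stub only mimics the `(action, agent_info)` return shape with a `'mean'` key. The second function is just one possible alternative and may or may not match what the pull request proposes.

```python
import random


class StubPolicy:
    """Hypothetical stand-in for the agent (not garage's policy class)."""

    def __init__(self, mean, seed=0):
        self._mean = mean
        self._rng = random.Random(seed)

    def get_action(self, obs):
        # Sample around a fixed mean and report that mean in agent_info,
        # mirroring the (action, agent_info) shape used in the quoted lines.
        action = self._mean + self._rng.gauss(0.0, 1.0)
        return action, {'mean': self._mean}


def step_as_quoted(agent, obs, deterministic):
    """Same statement order as the quoted lines: the last call wins."""
    a, agent_info = agent.get_action(obs)
    if deterministic:
        a = agent_info['mean']
    # This second call replaces `a`, so the mean chosen above is discarded.
    a, agent_info = agent.get_action(obs)
    return a


def step_single_sample(agent, obs, deterministic):
    """One possible alternative: sample once, then optionally use the mean."""
    a, agent_info = agent.get_action(obs)
    if deterministic:
        a = agent_info['mean']
    return a


quoted = step_as_quoted(StubPolicy(mean=0.5), obs=None, deterministic=True)
single = step_single_sample(StubPolicy(mean=0.5), obs=None, deterministic=True)

print('quoted order returns the mean:', quoted == 0.5)
print('single sample returns the mean:', single == 0.5)
```

With the quoted order, the returned action is a fresh sample even when `deterministic` is true, and the policy is queried twice per step.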
Any hints would be appreciated.
Thanks