Description
I've been training manipulation policies on FetchPush-v4 with SAC, and one thing that keeps bugging me: when I start a new training run, all the episode-level experience from previous runs is gone. The replay buffer only lives within a single model.learn() call.
What I'd like is a way to persist high-level episode summaries (not raw transitions — just things like "this grasp angle worked", "this approach distance was too far") across training runs, so the agent can recall what worked before.
I hacked together a BaseCallback that does this:
```python
from stable_baselines3.common.callbacks import BaseCallback

class ExperienceMemoryCallback(BaseCallback):
    def _on_step(self) -> bool:
        # Detect episode ends via the Monitor wrapper's "episode" info
        for info in self.locals.get("infos", []):
            if "episode" in info:
                self._record_episode(info)  # writes a summary row (not shown)
        return True
```
It stores episode summaries in a local SQLite db, and at the start of each new run it retrieves relevant past experiences. In my tests on FetchPush-v4 with 10 seeds, success rate went from ~42% to ~67% compared to training from scratch.
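For concreteness, here is a minimal sketch of what the SQLite-backed summary store could look like. The schema, class name, and field names are illustrative assumptions on my part, not the actual robotmem internals:

```python
import sqlite3

class EpisodeMemory:
    """Persists episode-level summaries across training runs.

    Schema and field names are illustrative, not robotmem's actual schema.
    """

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS episodes ("
            "  run_id TEXT, reward REAL, length INTEGER, success INTEGER)"
        )

    def record(self, run_id, reward, length, success):
        # Called from the callback's _record_episode on each episode end
        self.conn.execute(
            "INSERT INTO episodes VALUES (?, ?, ?, ?)",
            (run_id, reward, length, int(success)),
        )
        self.conn.commit()

    def best_episodes(self, n=5):
        # Retrieved at the start of a new run to seed the agent
        # with summaries of what worked before
        cur = self.conn.execute(
            "SELECT run_id, reward FROM episodes "
            "ORDER BY reward DESC LIMIT ?",
            (n,),
        )
        return cur.fetchall()
```

The point is that only compact summaries are persisted, so the store stays tiny compared to a raw transition replay buffer.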
I've packaged this as robotmem (pip install robotmem) if anyone wants to try it. The SB3 callback is at robotmem.sb3.RobotMemSB3Callback.
Would there be interest in mentioning this in the docs as a community callback, or is this too niche? Happy to adapt the API if there's a preferred pattern for third-party callbacks.