Why does playing back a trained policy inside the same RL train script not make the agent learn? #2205
Replies: 2 comments
-
Thanks for posting this. Which scripts are you using for training and then for play? Playback may just be doing inference. Alternatively, the weights may be left unaltered after training (frozen) unless you pick up training where you left off (from a checkpoint). It would help if you could share some code, or show how you are calling your training and inference rounds.
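One quick way to tell "inference only" apart from "actually training" is to snapshot the policy parameters before and after a run and compare them. The sketch below is only an illustration: `max_abs_change` and the parameter names are hypothetical, and the snapshots are plain dicts of flat float lists. With a PyTorch model, such lists could be built from `model.named_parameters()` using `p.detach().flatten().tolist()`; that the policy is a PyTorch module is an assumption here.

```python
def max_abs_change(before, after):
    """Largest absolute difference over all parameters with matching names."""
    largest = 0.0
    for name, old_values in before.items():
        new_values = after[name]
        for old, new in zip(old_values, new_values):
            largest = max(largest, abs(new - old))
    return largest


# Hypothetical snapshots: parameter name -> flat list of floats.
before = {"policy.weight": [0.10, -0.20, 0.30], "policy.bias": [0.0]}
after_inference = {"policy.weight": [0.10, -0.20, 0.30], "policy.bias": [0.0]}
after_training = {"policy.weight": [0.12, -0.25, 0.30], "policy.bias": [0.01]}

# A change of exactly zero suggests the run never updated the weights.
print("inference run:", max_abs_change(before, after_inference))
print("training run: ", max_abs_change(before, after_training))
```

If the change stays at zero after a run that was supposed to train, the optimizer step is probably never reached or the parameters are frozen.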
-
Thank you for your answer. I use the functions from the play.py script to load the trained policy, and I train the agent with the train.py script. Afterwards I use play.py to observe my training result.
train.py
-
I’m experimenting with loading a trained policy (successfully trained) in my environment and playing back its actions during training. However, I discovered that simply using the pre-trained policy to generate actions based on the current training environment’s observations does not lead to any further learning by the agent.
The pseudocode looks something like this:
Except for the actions, everything else is the same.
I am using SKRL with PPO. My task is Franka Lift Cube.
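One thing that may be worth checking in a setup like this: PPO is on-policy, and its clipped surrogate uses the ratio between the current policy's probability of an action and the probability recorded when that action was collected. If the environment executes the played-back action while the rollout still stores the log-probability of the action the learner sampled (an assumption about the setup, not something confirmed in this thread), the ratio no longer starts at 1. The sketch below shows that bookkeeping effect with a 1-D Gaussian policy. All numbers and function names are hypothetical, and it does not use the skrl API.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a 1-D Gaussian policy evaluated at `action`."""
    var = std * std
    return -0.5 * ((action - mean) ** 2 / var + math.log(2.0 * math.pi * var))


def clipped_surrogate(new_log_prob, old_log_prob, advantage, clip=0.2):
    """Return (ratio, PPO clipped objective) for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - clip, min(1.0 + clip, ratio))
    return ratio, min(ratio * advantage, clipped * advantage)


# Hypothetical learner policy for one observation: N(mean=0.0, std=0.5).
mean, std = 0.0, 0.5
sampled_action = 0.1  # action the learner itself sampled
played_action = 1.5   # action injected from the pre-trained policy

lp_sampled = gaussian_log_prob(sampled_action, mean, std)
lp_played = gaussian_log_prob(played_action, mean, std)

# Consistent bookkeeping: the stored log-prob belongs to the executed action.
ratio_ok, _ = clipped_surrogate(lp_played, lp_played, advantage=1.0)

# Inconsistent bookkeeping: the executed action was swapped, but the stored
# log-prob still belongs to the action the learner sampled.
ratio_bad, _ = clipped_surrogate(lp_played, lp_sampled, advantage=1.0)

print("ratio with matching log-prob:  ", ratio_ok)
print("ratio with mismatched log-prob:", ratio_bad)
```

With a mismatched log-probability the ratio starts far outside the clip range before any update has been made, so the update is not the one PPO assumes. Whether this is what happens in the script above depends on how the actions are swapped in.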