Open
Description
Hi, and thanks for the great work on this project!
I noticed that the current PPO implementation in MuJoCo Playground appears to use a standard MLP policy. I would like to use PPO with an LSTM (recurrent) policy to handle tasks that depend on temporal information.
- Does MuJoCo Playground currently support LSTM (or any recurrent) policies with PPO?
- Since it’s based on Brax, does Brax itself support recurrent policies (e.g., PPO + LSTM), or would this need to be implemented externally? If it has to be external, could you point me to any guidelines?
Thanks in advance for clarifying!
FranzKnut