Closed
Labels
question (Further information is requested)
Description
Hello. When an LSTM network is used as the basic unit, samples are drawn from the environment's state-transition model, and the sequence of states from t consecutive time steps (s1, ..., st) is used as the network input.
How does Recurrent PPO set this value, i.e. the number of steps t in the input sequence? Could you point me to the relevant code?
Thanks