Problem Description
Would it be useful to add a variant of the PPO algo that supports complex (nested/dictionary) action and obs spaces? I did this for MineRL and wondered whether it would be worth contributing to the main library. I'd happily make a PR.
Checklist
- I have checked that there is no similar issue in the repo.
- I have checked the documentation site and found no relevant information in GitHub issues.
Current Behavior
Currently, PPO supports either continuous or discrete actions (not a mix of both) and only a single-array observation.
Expected Behavior
PPO can support arbitrarily complex (nested/dictionary) action and observation spaces.
Possible Solution
- Use `tree` to map over actions and observation.
- Store arrays in the same struct shape as the obs space or flatten them for storage and unflatten when passing to the network.
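
To make the flatten/unflatten idea concrete, here is a minimal sketch in plain Python. It assumes `tree` refers to the `dm-tree` package; the helpers `flatten`, `unflatten_as`, and `map_structure` below are hypothetical stand-ins written for illustration (they mimic the behaviour of the same-named `dm-tree` functions for dicts, lists, and tuples), not code from this repo. The example observation and its keys are made up too.

```python
def flatten(structure):
    """Return the leaves of a nested dict/list/tuple in a deterministic order.

    Dict keys are visited in sorted order so that the flat layout does not
    depend on insertion order.
    """
    if isinstance(structure, dict):
        return [leaf for key in sorted(structure) for leaf in flatten(structure[key])]
    if isinstance(structure, (list, tuple)):
        return [leaf for item in structure for leaf in flatten(item)]
    return [structure]


def unflatten_as(template, leaves):
    """Rebuild a structure shaped like `template` from a flat list of leaves."""
    leaf_iter = iter(leaves)

    def build(node):
        if isinstance(node, dict):
            return {key: build(node[key]) for key in sorted(node)}
        if isinstance(node, (list, tuple)):
            return type(node)(build(item) for item in node)
        return next(leaf_iter)

    return build(template)


def map_structure(fn, structure):
    """Apply `fn` to every leaf while keeping the nesting intact."""
    return unflatten_as(structure, [fn(leaf) for leaf in flatten(structure)])


# Hypothetical nested observation (keys are illustrative only).
obs = {"compass": 0.5, "inventory": {"dirt": 3, "wood": 1}, "pov": (0.1, 0.2)}

# Rollout storage keeps one flat list of leaves per step ...
stored = flatten(obs)

# ... and the nested structure is restored before the network sees it.
restored = unflatten_as(obs, stored)

# Per-leaf preprocessing (here a toy scaling) preserves the structure.
scaled = map_structure(lambda x: x * 2, obs)
```

In a real implementation the leaves would be arrays rather than scalars, and the storage buffer would hold one array per leaf with a leading time dimension. The same pattern would apply to nested action spaces, with one log-probability per action leaf summed into the joint log-probability that PPO uses in its ratio.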