Decision Transformer Training Option #1307

Closed

Labels

ml, needs discussion

I would like to compare PPO against the Decision Transformer to see which method is better at learning to maximize reward.
