Recurrent DQN

One central element of the Atari DQN is its use of 4 consecutive frames as input, which makes the state closer to Markov, i.e. it carries the motion information that a single frame lacks. The DRQN paper (http://arxiv.org/abs/1507.06527v3) shows that the multi-frame input can be replaced with an LSTM to the same effect, with no systematic advantage for one or the other. The DeepMind async paper also mentions using an LSTM instead of multi-frame inputs for more challenging visual domains (TORCS and Labyrinth).
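
To make the point concrete, here is a toy sketch in plain Python (not taken from either paper or from this codebase). It assumes a 1-D "ball position" as the observation. `FrameStack` mimics the 4-frame input, while `ToyRecurrentState` is a hand-written recurrent update standing in for what an LSTM would have to learn; both class names and the trajectories are made up for illustration.

```python
from collections import deque


class FrameStack:
    """Keeps the last k observations, like the multi-frame DQN input (k=4)."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_obs):
        # Fill the buffer with copies of the first observation.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_obs)
        return tuple(self.frames)

    def step(self, obs):
        # Appending to a full deque drops the oldest frame.
        self.frames.append(obs)
        return tuple(self.frames)


class ToyRecurrentState:
    """Hand-written recurrent summary h_t = f(h_{t-1}, o_t). NOT an LSTM."""

    def __init__(self):
        self.h = None  # (last position, last displacement)

    def reset(self, first_obs):
        self.h = (first_obs, 0)
        return self.h

    def step(self, obs):
        last_pos, _ = self.h
        self.h = (obs, obs - last_pos)
        return self.h


def run(tracker, positions):
    """Feed a whole trajectory through a tracker and return its final state."""
    state = tracker.reset(positions[0])
    for p in positions[1:]:
        state = tracker.step(p)
    return state


# Two hypothetical trajectories that end on the same single frame (position 5).
moving_right = [2, 3, 4, 5]
moving_left = [8, 7, 6, 5]

stack_right = run(FrameStack(4), moving_right)
stack_left = run(FrameStack(4), moving_left)
rec_right = run(ToyRecurrentState(), moving_right)
rec_left = run(ToyRecurrentState(), moving_left)

print(moving_right[-1] == moving_left[-1])  # last frames alone are identical
print(stack_right != stack_left)            # stacked frames tell them apart
print(rec_right != rec_left)                # so does the recurrent summary
```

The last single frame is the same in both cases, so a network that sees only that frame cannot recover the direction of motion. Either the frame stack or a state carried across steps resolves the ambiguity, which is the substitution DRQN makes with an LSTM.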

I think this would fit well in this codebase. I'll try to contribute it at some point.

