
fix: reworked frame skipping and max-pooling for Atari #20

Open
panahiparham wants to merge 1 commit into andnp:main from panahiparham:atari-preprocessing


@panahiparham
Contributor

Pulled frame skipping out of the gymnasium environment so that consecutive frames can be max-pooled, as done in the DQN Zoo codebase. The agent's stream of experience should now follow the pipeline described below:

On every environment step, the Atari simulator takes 4 steps, repeating the selected action. This simplifies the RL problem and speeds up execution. If the agent loses a life or the episode terminates, the frame-skipping loop ends early and the environment's discount factor is set to 0. The agent receives the total reward accumulated during the frame-skipping loop. Consecutive observations (the last two frames of each loop) are max-pooled to handle screen flickering caused by the Atari 2600's hardware limitations. After max-pooling, the frames are resized to (84, 84) and converted to grayscale. At each step, the agent receives a stack of the last 4 observed (not skipped) processed frames, giving an observation of shape (84, 84, 4). In the diagram below, ~ denotes skipped frames, lowercase letters denote max-pooled frames (e.g. b = max pool(3, 4)), and capital letters denote max-pooled frames after resizing and conversion to grayscale (e.g. C is the processed form of c = max pool(7, 8)).
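The frame-skipping loop described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `skip_and_pool` and the `step_fn` callable (assumed to return `(frame, reward, terminated, life_lost)` for one simulator step) are hypothetical names, and pooling a single frame when the loop ends after one step is an assumption.

```python
import numpy as np


def skip_and_pool(step_fn, action, num_skip=4):
    """Repeat `action` for up to `num_skip` simulator steps.

    `step_fn` is a hypothetical callable that advances the simulator by one
    step and returns (frame, reward, terminated, life_lost).
    Returns (pooled_frame, total_reward, discount).
    """
    total_reward = 0.0
    discount = 1.0
    last_two = []  # raw frames from the two most recent simulator steps

    for _ in range(num_skip):
        frame, reward, terminated, life_lost = step_fn(action)
        total_reward += reward
        last_two = (last_two + [frame])[-2:]
        if terminated or life_lost:
            # End the loop early and signal "no bootstrapping" to the agent.
            discount = 0.0
            break

    # Element-wise max over the retained frames (one or two of them).
    pooled = np.maximum.reduce(last_two)
    return pooled, total_reward, discount
```

With a full loop of 4 steps, `pooled` is the element-wise maximum of frames 3 and 4, matching `b = max pool(3, 4)` in the description.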

0    | 1  2  3  4    | 5  6  7  8    | 9  10 11 12   | 13 14 15 16   | (frames)
0    | ~  ~  3  4    | ~  ~  7  8    | ~  ~  11 12   | ~  ~  15 16   | (skipping)
a    | ~  ~  ~  b    | ~  ~  ~  c    | ~  ~  ~  d    | ~  ~  ~  e    | (max-pooling)
A    | ~  ~  ~  B    | ~  ~  ~  C    | ~  ~  ~  D    | ~  ~  ~  E    | (resize and grayscale)
A000 | ~  ~  ~  AB00 | ~  ~  ~  ABC0 | ~  ~  ~  ABCD | ~  ~  ~  BCDE | (stacking)
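The last two rows of the diagram (resize and grayscale, then stacking) could look something like the sketch below. It is not the PR's implementation: `to_gray_84` and `FrameStack` are hypothetical names, the luminance weights and the nearest-neighbour resize are stand-ins chosen to keep the example NumPy-only (the PR does not say which resize method is used), and the left-to-right fill with zero padding simply mirrors the `A000`, `AB00`, ... pattern shown above.

```python
import numpy as np


def to_gray_84(frame, size=84):
    """Convert an (H, W, 3) uint8 frame to a (size, size) uint8 grayscale image.

    Uses standard luminance weights and nearest-neighbour sampling; both are
    assumptions made for this sketch.
    """
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame.astype(np.float32) @ weights
    rows = np.arange(size) * frame.shape[0] // size
    cols = np.arange(size) * frame.shape[1] // size
    resized = gray[rows][:, cols]
    return np.clip(resized, 0, 255).astype(np.uint8)


class FrameStack:
    """Keeps the most recent `num_stack` processed frames (hypothetical helper)."""

    def __init__(self, num_stack=4, shape=(84, 84)):
        self.num_stack = num_stack
        self.shape = shape
        self.frames = []

    def reset(self, first_frame):
        # Start of an episode: only frame A is available.
        self.frames = [first_frame]
        return self.observation()

    def push(self, frame):
        # Append the newest frame and drop the oldest once the stack is full.
        self.frames = (self.frames + [frame])[-self.num_stack:]
        return self.observation()

    def observation(self):
        # Pad unfilled slots with zero frames on the right, as in the diagram.
        missing = self.num_stack - len(self.frames)
        pad = [np.zeros(self.shape, dtype=np.uint8)] * missing
        return np.stack(self.frames + pad, axis=-1)
```

Calling `reset` with frame A and then `push` with B, C, D, and E yields observations of shape (84, 84, 4) whose channels follow the stacking row of the diagram.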
