Can anyone share model details? #61

class M1(DQNConfig):
    backend = 'tf'
    env_type = 'detail'
    action_repeat = 1

class M2(DQNConfig):
    backend = 'tf'
    env_type = 'detail'
    action_repeat = 4

I train with the following commands:
python main.py --env_name=Breakout-v0 --is_train=True --display=False --use_gpu=True --model=m2
and
python main.py --env_name=Breakout-v0 --is_train=True --display=False --use_gpu=True --model=m1

The "avg_ep_r" of both models reaches 2.1 to 2.3 at around 5 million iterations. But even at 15 million iterations, it still fluctuates between 2.1 and 2.3.

This looks just like one of the results the authors show (I guess it is the one for action-repeat (frame-skip) of 1 without learning rate decay). I didn't change any parameters.

![image](https://user-images.githubusercontent.com/12911394/46345648-eb8b8f80-c677-11e8-9f35-9d8b0c51cbc0.png)
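To be clear about what I mean by learning rate decay, here is a minimal sketch of an exponential schedule with a lower bound. The function name and all default values are my own placeholders for illustration, not values taken from this repository's config.

```python
def decayed_learning_rate(step, initial_lr=0.00025, min_lr=0.00005,
                          decay_rate=0.96, decay_steps=50_000):
    """Exponentially decayed learning rate, clipped from below at min_lr.

    All default values are hypothetical placeholders, not the repo's settings.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    # Multiply the rate by decay_rate once per decay_steps training steps.
    lr = initial_lr * decay_rate ** (step / decay_steps)
    # Never let the rate fall below the floor.
    return max(min_lr, lr)


if __name__ == "__main__":
    for step in (0, 50_000, 5_000_000, 15_000_000):
        print(step, decayed_learning_rate(step))
```

Without such a schedule the learning rate stays at its initial value for the whole run, which is the setting I assume the flat curve corresponds to.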



The strange thing is that even with model m2 (action-repeat (frame-skip) of 4), my result is similar to model m1.
The "avg_ep_r" fluctuates between 2.1 and 2.3 from around 5 million to 15 million iterations.
The "max_ep_r" fluctuates between 10 and 18 over the same range.
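For reference, this is how I understand action-repeat (frame-skip): the agent's chosen action is applied for several consecutive environment steps and the rewards are summed. The sketch below is only my assumption of the idea, with a toy environment and hypothetical class names; it is not the repository's actual implementation.

```python
class CountingEnv:
    """Toy stand-in environment (hypothetical): reward 1.0 per step,
    episode ends after `horizon` steps."""

    def __init__(self, horizon=12):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class ActionRepeat:
    """Apply each action `repeat` times and sum the rewards."""

    def __init__(self, env, repeat):
        if repeat < 1:
            raise ValueError("repeat must be >= 1")
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so we never step a finished episode.
                break
        return obs, total_reward, done


def count_agent_steps(repeat, horizon=12):
    """Number of agent decisions needed to finish one episode."""
    env = ActionRepeat(CountingEnv(horizon), repeat)
    env.reset()
    steps, done = 0, False
    while not done:
        _, _, done = env.step(0)
        steps += 1
    return steps


if __name__ == "__main__":
    print(count_agent_steps(repeat=1))
    print(count_agent_steps(repeat=4))
```

If that understanding is right, m2 should see four times as many frames per agent step as m1, so I expected the two reward curves to differ.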

class M2(DQNConfig):
    backend = 'tf'
    env_type = 'detail'
    action_repeat = 4


Do I need to change any parameters to reproduce the best result they have shown?

Thank you very much.
