Can anyone share model details? #61

@Richardxxxxxxx

class M1(DQNConfig):
    backend = 'tf'
    env_type = 'detail'
    action_repeat = 1

class M2(DQNConfig):
    backend = 'tf'
    env_type = 'detail'
    action_repeat = 4

I use
python main.py --env_name=Breakout-v0 --is_train=True --display=False --use_gpu=True --model=m2
and
python main.py --env_name=Breakout-v0 --is_train=True --display=False --use_gpu=True --model=m1

With both models, "avg_ep_r" reaches 2.1 to 2.3 at around 5 million iterations. Even at 15 million iterations, "avg_ep_r" still fluctuates between 2.1 and 2.3.

This looks just like the result they have shown (I guess that is the result for an action repeat (frame skip) of 1, without learning rate decay). I didn't change any parameters.

image

The strange thing is that even when I use model m2 (action repeat (frame skip) of 4), my result is similar to model m1.
"avg_ep_r" fluctuates between 2.1 and 2.3 from around 5 million to 15 million iterations.
"max_ep_r" fluctuates between 10 and 18 from around 5 million to 15 million iterations.

class M2(DQNConfig):
    backend = 'tf'
    env_type = 'detail'
    action_repeat = 4
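
To make clear what I mean by action repeat (frame skip), here is a minimal sketch of my understanding. It is not the repository's code: `ToyEnv`, `ActionRepeat`, and `count_agent_steps` are hypothetical names, and the toy environment only stands in for Breakout. The point is that with a repeat of 4 the agent makes one decision per 4 raw frames and receives the summed reward, so I would expect m1 and m2 to behave differently.

```python
class ToyEnv:
    """Hypothetical stand-in for an Atari env: reward 1 per raw frame,
    episode ends after `length` raw frames."""

    def __init__(self, length=10):
        self.length = length
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        self.frame += 1
        done = self.frame >= self.length
        return self.frame, 1.0, done


class ActionRepeat:
    """Apply each chosen action for `repeat` raw frames and sum the rewards."""

    def __init__(self, env, repeat):
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:  # stop early if the episode ends mid-repeat
                break
        return obs, total_reward, done


def count_agent_steps(repeat, length=12):
    """Number of agent decisions needed to finish one episode."""
    env = ActionRepeat(ToyEnv(length), repeat)
    env.reset()
    steps, done = 0, False
    while not done:
        _, _, done = env.step(0)
        steps += 1
    return steps
```

In this toy setup, an episode of 12 raw frames takes 12 agent decisions with a repeat of 1 and 3 decisions with a repeat of 4.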

Do I need to change any parameters to reach the best result they have shown?

Thank you very much.
