The program always executes behavior cloning when running CQLLearner. #305

@jiangjiadi

Description

When I run the CQL algorithm, it only ever executes behavior cloning. I checked the config in use: the number of training steps is 100 and 'num_bc_iters' is set to 50, so behavior cloning should only cover the first 50 steps.
Digging into the source code of CQLLearner, I found that the 'counts' in the function 'step' has two keys, "steps" and "walltime".
However, the implementation of 'step' reads the key "learner_steps".
Because "learner_steps" is not a valid key, "cur_step" is always 0, so the algorithm only executes behavior cloning.
After changing the key from "learner_steps" to "steps", the problem is solved.
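The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual CQLLearner code: `choose_phase` and `run` are hypothetical names, and it assumes the step count is read with a dictionary lookup that silently falls back to 0 when the key is missing, which would match the observed "cur_step is always 0" behavior.

```python
NUM_BC_ITERS = 50  # value of 'num_bc_iters' from the report


def choose_phase(counts, key, num_bc_iters=NUM_BC_ITERS):
    """Pick the training phase from the counter dict (hypothetical helper)."""
    # Assumption: a missing key falls back to 0 instead of raising KeyError.
    cur_step = counts.get(key, 0)
    return "behavior_cloning" if cur_step < num_bc_iters else "cql"


def run(key, total_steps=100):
    """Simulate 100 training steps and record which phase each one uses."""
    phases = []
    for step in range(total_steps):
        # The counter only exposes "steps" and "walltime", as in the report.
        counts = {"steps": step, "walltime": 0.0}
        phases.append(choose_phase(counts, key))
    return phases


buggy = run("learner_steps")  # key that does not exist in counts
fixed = run("steps")          # key that does exist

print(buggy.count("cql"), fixed.count("cql"))  # 0 50
```

With the wrong key, none of the 100 steps leaves behavior cloning. With "steps", the last 50 steps use the CQL update, which is what the config asks for.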
