Questions regarding SB3 implementation #4

@2BH

Hi Onno,

I'm very interested in the idea you proposed and have recently started working with the code.

I have a question about your SB3 implementation of SAC with colored noise.

In SB3, the policy is updated every N steps, so it will most likely be updated several times within one episode. Each update calls the sampling function of your colored Gaussian distribution and therefore draws samples from the buffer of your colored-noise process generator. Because some samples in between are consumed by the policy updates, the noise applied during interaction with the environment is no longer a contiguous stretch of the generated sequence, which would change its temporal correlation.

Would this be a problem? Or do you think it would not have much impact anyway?
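
To make the concern concrete, here is a minimal sketch of the effect I have in mind. It does not use your generator or any SB3 code: the AR(1) process is only a stand-in for a colored-noise buffer, and `ar1_noise`, `env_noise`, `train_freq`, and `draws_per_update` are hypothetical names chosen for illustration. It compares the lag-1 autocorrelation of the noise that reaches the environment when the buffer is used only for rollouts with the case where policy updates also consume samples from the same buffer.

```python
import numpy as np


def ar1_noise(n, rho=0.95, seed=0):
    """Stand-in for a colored-noise buffer: a unit-variance AR(1) sequence."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = rho * x[t - 1] + np.sqrt(1.0 - rho**2) * eps[t]
    return x


def lag1_autocorr(x):
    """Sample autocorrelation between consecutive values."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


def env_noise(buffer, n_env_steps, train_freq, draws_per_update):
    """Collect the noise values that reach the environment.

    Every environment step takes the next buffer entry. After every
    `train_freq` steps, an (assumed) policy update skips ahead by
    `draws_per_update` entries, modelling samples it consumed.
    """
    used, idx = [], 0
    for step in range(1, n_env_steps + 1):
        used.append(buffer[idx])
        idx += 1
        if draws_per_update and step % train_freq == 0:
            idx += draws_per_update
    return np.array(used)


buffer = ar1_noise(200_000)
clean = env_noise(buffer, 10_000, train_freq=1, draws_per_update=0)
shared = env_noise(buffer, 10_000, train_freq=1, draws_per_update=10)

print("rollout-only buffer:", lag1_autocorr(clean))
print("buffer shared with updates:", lag1_autocorr(shared))
```

In this toy setting the environment sees only every 11th entry of the buffer in the shared case, so for an AR(1) process the lag-1 correlation drops from about `rho` to about `rho**11`. Whether the real implementation behaves like this depends on whether the update step actually draws from the same buffer, which is exactly what I am unsure about.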

Best,

Baohe
