Question about multiple FlattenExtractor components in ActorCriticPolicy #2165

@yeknafar

Hi SB3 team,

Thank you for your great work on this library!

I have a question regarding the ActorCriticPolicy architecture. When I print the policy, it lists three separate FlattenExtractor entries: features_extractor, pi_features_extractor, and vf_features_extractor.

ActorCriticPolicy(
  (features_extractor): FlattenExtractor(
    (flatten): Flatten(start_dim=1, end_dim=-1)
  )
  (pi_features_extractor): FlattenExtractor(
    (flatten): Flatten(start_dim=1, end_dim=-1)
  )
  (vf_features_extractor): FlattenExtractor(
    (flatten): Flatten(start_dim=1, end_dim=-1)
  )
  (mlp_extractor): MlpExtractor(
    (policy_net): Sequential(
      (0): Linear(in_features=4, out_features=64, bias=True)
      (1): Tanh()
      (2): Linear(in_features=64, out_features=64, bias=True)
      (3): Tanh()
    )
    (value_net): Sequential(
      (0): Linear(in_features=4, out_features=64, bias=True)
      (1): Tanh()
      (2): Linear(in_features=64, out_features=64, bias=True)
      (3): Tanh()
    )
  )
  (action_net): Linear(in_features=64, out_features=2, bias=True)
  (value_net): Linear(in_features=64, out_features=1, bias=True)
)

Could you please clarify what the purpose of each of these is? Specifically:

Why are there three flatten extractors instead of just one?

What is the difference between features_extractor, pi_features_extractor, and vf_features_extractor in this context?

My guess is that features_extractor is used for shared feature extraction, while pi_features_extractor and vf_features_extractor provide separate, dedicated extraction paths for the policy and value networks, respectively. I also assume that these cannot be active at the same time: when the shared extractor is used, the other two are inactive, and vice versa.

Is this understanding correct?
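
To make the question concrete, here is a minimal sketch of one alternative explanation I could not rule out: that the three names might refer to one and the same module. It uses plain PyTorch only. `TinyPolicy`, its `share` flag, and `count_unique_extractors` are hypothetical names I made up for illustration, not SB3 code, and whether SB3 wires its extractors this way is exactly what I am unsure about. The sketch only relies on the fact that an `nn.Module` printout lists every registered attribute name, even when several names point to the same object.

```python
from torch import nn


class TinyPolicy(nn.Module):
    """Hypothetical stand-in for a policy; NOT the SB3 implementation."""

    def __init__(self, share: bool = True) -> None:
        super().__init__()
        self.features_extractor = nn.Flatten()
        # Alias: a second attribute name for the same module object.
        self.pi_features_extractor = self.features_extractor
        if share:
            # Alias again: still only one Flatten instance in total.
            self.vf_features_extractor = self.features_extractor
        else:
            # Independent module for the value path (assumed for illustration).
            self.vf_features_extractor = nn.Flatten()


def count_unique_extractors(policy: nn.Module) -> int:
    """Count distinct module objects behind the three extractor names."""
    names = ("features_extractor", "pi_features_extractor", "vf_features_extractor")
    return len({id(getattr(policy, name)) for name in names})


shared = TinyPolicy(share=True)
separate = TinyPolicy(share=False)

# The printout lists three Flatten entries in both cases.
print(shared)
print(separate)

# Object identity tells the two cases apart.
print(count_unique_extractors(shared))    # 1
print(count_unique_extractors(separate))  # 2
```

If the same identity check (`policy.pi_features_extractor is policy.features_extractor`) returns True on a real ActorCriticPolicy, then the printout alone would not show whether the extractors are shared or separate, which is part of why I am asking.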

As a suggestion: it would be helpful if the documentation included a few concrete examples showing the actual output structure of the networks. That would make it easier to understand how the components interact.

Thanks again!

Labels: documentation, question
