Question about `start_use_mix_training_steps` and `mixed_value_threshold`

First of all, thanks for sharing such a wonderful project. I am confused by one part of the code.

In the `make_batch` function of `batch_worker.py` (around line 173), the list `top_new_masks` is filled with `top_new_masks.append(int(sample_idx > collected_transitions - self.mixed_value_threshold)),`. The mask is then used in `agents/base.py`, where it apparently appears as `top_value_masks`: `this_target_values = target_values * top_value_masks.unsqueeze(1).repeat(1, unroll_steps + 1) + search_values * (1 - top_value_masks).unsqueeze(1).repeat(1, unroll_steps + 1)`

Why is this mask helpful? And what do `start_use_mix_training_steps` and `mixed_value_threshold` do?

Thanks again, and I look forward to your response.
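To make the question concrete, here is a minimal sketch of what the two quoted expressions appear to compute. It uses plain Python lists rather than PyTorch tensors, and the helper names `make_top_new_mask` and `mix_values`, as well as all numbers, are hypothetical and chosen only for illustration; this is not the project's actual implementation.

```python
def make_top_new_mask(sample_indices, collected_transitions, mixed_value_threshold):
    # Mirrors the quoted append: 1 if the sampled index falls within the newest
    # `mixed_value_threshold` transitions, otherwise 0.
    return [
        int(idx > collected_transitions - mixed_value_threshold)
        for idx in sample_indices
    ]


def mix_values(target_values, search_values, masks):
    # Mirrors the quoted blend: each sample's mask is applied to every unroll
    # step, so mask == 1 keeps the target_values row and mask == 0 keeps the
    # search_values row.
    mixed = []
    for target_row, search_row, mask in zip(target_values, search_values, masks):
        mixed.append(
            [t * mask + s * (1 - mask) for t, s in zip(target_row, search_row)]
        )
    return mixed


# Hypothetical numbers, for illustration only.
collected_transitions = 1000
mixed_value_threshold = 100
sample_indices = [50, 900, 901, 999]

masks = make_top_new_mask(sample_indices, collected_transitions, mixed_value_threshold)

# Four samples, each with unroll_steps + 1 = 3 values.
target_values = [[1.0, 1.0, 1.0] for _ in sample_indices]
search_values = [[2.0, 2.0, 2.0] for _ in sample_indices]

mixed = mix_values(target_values, search_values, masks)
print(masks)
print(mixed)
```

If this reading is right, samples older than the threshold take their values from `search_values` and the newest samples take them from `target_values`. What I do not understand is why that split helps training.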

question about " start_use_mix_training_steps " and " mixed_value_threshold " #9

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

question about " start_use_mix_training_steps " and " mixed_value_threshold " #9

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions