Open
Description
The value network update should be:
alpha_w = 1e-3  # initialization
optimizer_w = optim.Adam(s_value_func.parameters(), lr=alpha_w)
optimizer_w.zero_grad()
policy_loss_w = -delta
policy_loss_w.backward(retain_graph=True)
# clip the gradients of the value network's parameters, not the loss tensor
clip_grad_norm_(s_value_func.parameters(), 0.1)
optimizer_w.step()
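
For reference, here is a minimal self-contained sketch of the same wiring (an optimizer built over the value network's own parameters, then backward, gradient clipping, and step). The network architecture, the 4-dimensional state, the `update_value_network` helper, and the definition of `delta` as a one-step TD error are all assumptions for illustration and are not taken from the repository. It also swaps in a squared-TD-error loss instead of `-delta`, so treat it as one possible variant rather than the intended fix.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.nn.utils import clip_grad_norm_

torch.manual_seed(0)

# Hypothetical stand-in for the value network (architecture is an assumption).
s_value_func = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))

alpha_w = 1e-3
# Built once over the value network's parameters; constructing it a single
# time keeps Adam's running moment estimates across updates.
optimizer_w = optim.Adam(s_value_func.parameters(), lr=alpha_w)


def update_value_network(state, reward, next_state, gamma=0.99, max_norm=0.1):
    """Run one value-network update; return (delta, pre-clip gradient norm)."""
    with torch.no_grad():
        # Assumption: the bootstrapped target is treated as a constant.
        target = reward + gamma * s_value_func(next_state)
    delta = target - s_value_func(state)  # assumed one-step TD error
    loss_w = 0.5 * delta.pow(2).mean()    # assumed loss, not the issue's `-delta`

    optimizer_w.zero_grad()
    loss_w.backward()
    # clip_grad_norm_ expects parameters and returns the norm before clipping.
    total_norm = clip_grad_norm_(s_value_func.parameters(), max_norm)
    optimizer_w.step()
    return delta.detach(), total_norm


# Usage with random tensors standing in for environment states.
state, next_state = torch.randn(1, 4), torch.randn(1, 4)
before = [p.detach().clone() for p in s_value_func.parameters()]
delta, total_norm = update_value_network(state, 1.0, next_state)
changed = any(
    not torch.equal(b, p) for b, p in zip(before, s_value_func.parameters())
)
```

Because the optimizer only holds `s_value_func.parameters()`, the step cannot touch any other network, which is the point of the change proposed above.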