2 files changed +5 -4 lines changed
@@ -28,8 +28,9 @@ Others:
Documentation:
^^^^^^^^^^^^^^
- - Fix typo in docstring "nature" -> "Nature" (@Melanol)
- - Add info on split tensorboard logs into (@Melanol)
+ - Fixed typo in docstring "nature" -> "Nature" (@Melanol)
+ - Added info on split tensorboard logs into (@Melanol)
+ - Fixed typo in ppo doc (@francescoluciano)


Release 1.6.0 (2022-07-11)
@@ -1014,4 +1015,4 @@ And all the contributors:
@eleurent @ac-93 @cove9988 @theDebugger811 @hsuehch @Demetrio92 @thomasgubler @IperGiove @ScheiklP
@simoninithomas @armandpl @manuel-delverme @Gautam-J @gianlucadecola @buoyancy99 @caburu @xy9485
@Gregwar @ycheng517 @quantitative-technologies @bcollazo @git-thor @TibiGG @cool-RR @MWeltevrede
- @Melanol @qgallouedec
+ @Melanol @qgallouedec @francescoluciano
The `Proximal Policy Optimization <https://arxiv.org/abs/1707.06347 >`_ algorithm combines ideas from A2C (having multiple workers)
and TRPO (it uses a trust region to improve the actor).

- The main idea is that after an update, the new policy should be not too far form the old policy.
+ The main idea is that after an update, the new policy should be not too far from the old policy.
For that, ppo uses clipping to avoid too large update.
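
Reviewer note: the clipping mentioned in the paragraph above can be illustrated with a minimal sketch of the clipped surrogate objective from the PPO paper, evaluated for a single sample. This is not the library's implementation; the function name `clipped_surrogate` and its arguments are hypothetical and chosen only for illustration, and the default of 0.2 is just a commonly used value for the clipping parameter.

```python
def clipped_surrogate(ratio, advantage, clip_range=0.2):
    """Clipped surrogate objective for one sample (illustrative sketch only).

    ratio      -- probability ratio pi_new(a|s) / pi_old(a|s)
    advantage  -- advantage estimate for the sampled action
    clip_range -- clipping parameter (epsilon in the PPO paper)
    """
    # Restrict the ratio to the interval [1 - clip_range, 1 + clip_range].
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Take the pessimistic (smaller) of the unclipped and clipped terms,
    # so moving the ratio far outside the interval earns no extra objective.
    return min(ratio * advantage, clipped_ratio * advantage)


# With a positive advantage, the gain from raising the ratio is capped.
capped = clipped_surrogate(ratio=1.5, advantage=2.0)
# Inside the interval, the objective is the plain ratio-weighted advantage.
unclipped = clipped_surrogate(ratio=1.1, advantage=2.0)
```

Because the objective stops improving once the ratio leaves the interval, the update has little incentive to push the new policy far from the old one.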