Commit 792771d

Add docs for algos/CEM (#2141)
* Add cem doc fix ppo doc title
* Chmod numpy.png
1 parent d843e5b commit 792771d

4 files changed, 51 additions and 2 deletions

docs/index.md

Lines changed: 1 addition & 0 deletions
@@ -62,6 +62,7 @@ and how to implement new MDPs and new algorithms.
    user/algo_vpg
    user/algo_td3
    user/algo_ddpg
+   user/algo_cem
 
 .. toctree::
    :maxdepth: 2

docs/user/algo_cem.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+# Cross Entropy Method
+
+```eval_rst
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Paper** | The cross-entropy method: A unified approach to Monte Carlo simulation, randomized optimization and machine learning :cite:`rubinstein2004cross` |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Framework(s)** | .. figure:: ./images/numpy.png |
+| | :scale: 40% |
+| | :class: no-scaled-link |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **API Reference** | `garage.np.algos.CEM <../_autoapi/garage/np/algos/index.html#garage.np.algos.CEM>`_ |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Code** | `garage/np/algos/cem.py <https://github.com/rlworkgroup/garage/blob/master/src/garage/np/algos/cem.py>`_ |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+```
+
+Cross Entropy Method (CEM) works by iteratively optimizing a Gaussian
+distribution over policy parameters.
+
+In each epoch, CEM does the following:
+
+1. Sample `n_samples` policies from a Gaussian distribution with mean
+   `cur_mean` and standard deviation `cur_std`.
+
+2. Collect episodes for each policy.
+
+3. Update `cur_mean` and `cur_std` by maximum likelihood estimation over the
+   `n_best` policies with the highest returns.
+
+## Examples
+
+### NumPy
+
+```eval_rst
+.. literalinclude:: ../../examples/np/cem_cartpole.py
+```
+
+## References
+
+```eval_rst
+.. bibliography:: references.bib
+   :style: unsrt
+   :filter: docname in docnames
+```
+
+----
+
+*This page was authored by Ruofu Wang ([@yeukfu](https://github.com/yeukfu)).*
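
The three per-epoch steps described in `docs/user/algo_cem.md` can be sketched in plain NumPy. This is only an illustration under stated assumptions, not garage's `CEM` implementation: `cem_epoch` and `toy_return` are hypothetical names, each "policy" is reduced to a flat parameter vector, and a closed-form toy objective stands in for collecting episodes.

```python
import numpy as np


def cem_epoch(cur_mean, cur_std, evaluate_return, n_samples, n_best, rng):
    """Run one CEM epoch and return the updated (mean, std)."""
    # 1. Sample n_samples parameter vectors from N(cur_mean, cur_std**2).
    params = rng.normal(cur_mean, cur_std, size=(n_samples, cur_mean.size))
    # 2. Stand-in for "collect episodes": score each parameter vector.
    returns = np.array([evaluate_return(p) for p in params])
    # 3. Keep the n_best highest-return samples and refit the Gaussian.
    #    The sample mean and the ddof=0 standard deviation are the
    #    maximum likelihood estimates for a diagonal Gaussian.
    best = params[np.argsort(returns)[-n_best:]]
    return best.mean(axis=0), best.std(axis=0)


def toy_return(p):
    # Hypothetical objective whose maximum is at p = [1, -2].
    return -np.sum((p - np.array([1.0, -2.0])) ** 2)


rng = np.random.default_rng(0)
mean, std = np.zeros(2), 2.0 * np.ones(2)
for _ in range(20):
    mean, std = cem_epoch(mean, std, toy_return,
                          n_samples=50, n_best=10, rng=rng)
```

After the loop, `mean` should sit close to the toy optimum `[1, -2]` and `std` should have shrunk toward zero. In a real setting, step 2 would run each sampled policy in the environment and use the resulting episode returns instead of `toy_return`.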

docs/user/algo_ppo.md

Lines changed: 2 additions & 2 deletions
@@ -35,13 +35,13 @@ regularization adds the mean entropy to the surrogate objective. See
 
 Garage has implementations of PPO with PyTorch and TensorFlow.
 
-## PyTorch
+### PyTorch
 
 ```eval_rst
 .. literalinclude:: ../../examples/torch/ppo_pendulum.py
 ```
 
-## TensorFlow
+### TensorFlow
 
 ```eval_rst
 .. literalinclude:: ../../examples/tf/ppo_pendulum.py

docs/user/images/numpy.png

4.54 KB