4 files changed: +51 -2 lines changed

@@ -62,6 +62,7 @@ and how to implement new MDPs and new algorithms.
    user/algo_vpg
    user/algo_td3
    user/algo_ddpg
+   user/algo_cem
 
 .. toctree::
    :maxdepth: 2
+# Cross Entropy Method
+
+``` eval_rst
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Paper**         | The cross-entropy method: A unified approach to Monte Carlo simulation, randomized optimization and machine learning :cite:`rubinstein2004cross` |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Framework(s)**  | .. figure:: ./images/numpy.png                                                                                                                   |
+|                   |    :scale: 40%                                                                                                                                   |
+|                   |    :class: no-scaled-link                                                                                                                        |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **API Reference** | `garage.np.algos.CEM <../_autoapi/garage/np/algos/index.html#garage.np.algos.CEM>`_                                                              |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+| **Code**          | `garage/np/algos/cem.py <https://github.com/rlworkgroup/garage/blob/master/src/garage/np/algos/cem.py>`_                                         |
++-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
+```
+
+Cross Entropy Method (CEM) works by iteratively optimizing a Gaussian
+distribution over policy parameters.
+
+In each epoch, CEM does the following:
+
+1. Sample n_samples policies from a Gaussian distribution with mean cur_mean
+   and standard deviation cur_std.
+
+2. Collect episodes for each sampled policy.
+
+3. Update cur_mean and cur_std by maximum likelihood estimation over the
+   n_best policies with the highest returns.
+
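The loop described in the steps above can be sketched in plain NumPy on a toy objective that stands in for average episode return. This is an illustrative sketch, not the garage API: the `cem` function, its parameter defaults, and the quadratic objective are all made up for the example.

```python
import numpy as np

def cem(f, dim, n_samples=50, n_best=10, n_epochs=100, init_std=2.0, seed=0):
    """Maximize f over R^dim with the Cross Entropy Method (toy sketch)."""
    rng = np.random.default_rng(seed)
    cur_mean = np.zeros(dim)
    cur_std = np.full(dim, init_std)
    for _ in range(n_epochs):
        # 1. Sample candidate parameter vectors from N(cur_mean, cur_std**2).
        samples = rng.normal(cur_mean, cur_std, size=(n_samples, dim))
        # 2. Evaluate each candidate; f plays the role of episode return.
        returns = np.array([f(x) for x in samples])
        # 3. MLE fit of the Gaussian to the n_best highest-return samples.
        elite = samples[np.argsort(returns)[-n_best:]]
        cur_mean = elite.mean(axis=0)
        cur_std = elite.std(axis=0)
    return cur_mean

# Toy objective maximized at (1, -2); CEM should recover it.
best = cem(lambda x: -np.sum((x - np.array([1.0, -2.0])) ** 2), dim=2)
```

Note that plain CEM like this can collapse cur_std prematurely; practical implementations often mix in extra, slowly decaying exploration noise when refitting the distribution.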
+## Examples
+
+### NumPy
+
+``` eval_rst
+.. literalinclude:: ../../examples/np/cem_cartpole.py
+```
+
+## References
+
+``` eval_rst
+.. bibliography:: references.bib
+   :style: unsrt
+   :filter: docname in docnames
+```
+
+----
+
+*This page was authored by Ruofu Wang ([@yeukfu](https://github.com/yeukfu)).*
@@ -35,13 +35,13 @@ regularization adds the mean entropy to the surrogate objective. See
 
 Garage has implementations of PPO with PyTorch and TensorFlow.
 
-## PyTorch
+### PyTorch
 
 ``` eval_rst
 .. literalinclude:: ../../examples/torch/ppo_pendulum.py
 ```
 
-## TensorFlow
+### TensorFlow
 
 ``` eval_rst
 .. literalinclude:: ../../examples/tf/ppo_pendulum.py