diff --git a/_posts/2019-03-05-dp-vs-rl.md b/_posts/2019-03-05-dp-vs-rl.md
index 1821fbc0..058ca83c 100644
--- a/_posts/2019-03-05-dp-vs-rl.md
+++ b/_posts/2019-03-05-dp-vs-rl.md
@@ -4,7 +4,7 @@ author: Mike Innes, Neethu Maria Joy, Tejan Karmali
 layout: blog
 ---
 
-We've discussed the idea of [differentiable programming](https://fluxml.ai/2019/02/07/what-is-differentiable-programming.html), where we incorporate existing programs into deep learning models. This article shows what ∂P can bring to some simple but classic control problems, where we would normally use black-box Reinforcement Learning (RL). ∂P-based models not only learn far more effective control strategies, but also train orders of magnitude faster. The [code](https://github.com/FluxML/model-zoo/tree/cdda5cad3e87b216fa67069a5ca84a3016f2a604/games/differentiable-programming) is all available to run for yourself – they will mostly train in a few seconds on any laptop.
+We've discussed the idea of [differentiable programming](https://fluxml.ai/2019/02/07/what-is-differentiable-programming.html), where we incorporate existing programs into deep learning models. This article shows what ∂P can bring to some simple but classic control problems, where we would normally use black-box Reinforcement Learning (RL). ∂P-based models not only learn far more effective control strategies, but also train orders of magnitude faster. The [code](https://github.com/FluxML/model-zoo/tree/master/contrib/games/differentiable-programming/trebuchet) is all available to run for yourself – the models will mostly train in a few seconds on any laptop.
 
 ## Follow the Gradient