Commit 6a2b88f

Nit: fix logarithm operation
1 parent: 234a683

File tree: 1 file changed (+1, -1 lines changed)


site/en/tutorials/reinforcement_learning/actor_critic.ipynb

Lines changed: 1 addition & 1 deletion
@@ -444,7 +444,7 @@
     "\n",
     "The actor loss is based on [policy gradients with the critic as a state dependent baseline](https://www.youtube.com/watch?v=EKqxumCuAAY&t=62m23s) and computed with single-sample (per-episode) estimates.\n",
     "\n",
-    "$$L_{actor} = -\\sum^{T}_{t=1} log\\pi_{\\theta}(a_{t} | s_{t})[G(s_{t}, a_{t}) - V^{\\pi}_{\\theta}(s_{t})]$$\n",
+    "$$L_{actor} = -\\sum^{T}_{t=1} \\log\\pi_{\\theta}(a_{t} | s_{t})[G(s_{t}, a_{t}) - V^{\\pi}_{\\theta}(s_{t})]$$\n",
     "\n",
     "where:\n",
     "- $T$: the number of timesteps per episode, which can vary per episode\n",
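The corrected equation above can be sketched numerically. This is a minimal NumPy illustration of the actor loss, not code from the tutorial itself; the function name `actor_loss` and the toy values are hypothetical, and in the tutorial the log-probabilities, returns $G$, and values $V$ come from the policy network, discounted episode rewards, and the critic respectively.

```python
import numpy as np

def actor_loss(action_log_probs, returns, values):
    """L_actor = -sum_t log pi(a_t|s_t) * [G(s_t, a_t) - V(s_t)].

    Hypothetical helper: all three arguments are per-timestep arrays
    for a single episode (single-sample estimate).
    """
    advantage = returns - values          # critic as a state-dependent baseline
    return -np.sum(action_log_probs * advantage)

# Toy single-episode example (made-up numbers):
log_probs = np.array([-0.5, -0.7, -0.2])  # log pi(a_t | s_t)
returns = np.array([1.0, 0.8, 0.5])       # G(s_t, a_t)
values = np.array([0.9, 0.7, 0.6])        # V(s_t)
loss = actor_loss(log_probs, returns, values)
```

Because the advantage $G - V$ is treated as a constant during backpropagation, minimizing this loss increases the log-probability of actions whose return exceeded the critic's estimate.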
