@@ -78,7 +78,7 @@ Comparing the objective function ($\mathbb{E}_{\tau}\sum\ r$) in reinforcement l
 so that maximize $\mathbb{E}_{\tau} \sum r$
 is equivalent to maximize $\mathbb{E}_{\tau}[\prod_{buying\ long}(r_{curr}/r_{pre}\ *\ cost) + \prod_{short\ selling}((2-r_{curr}/r_{pre})\ *\ cost)]$
 
-The experimental results show that such a defination is better than the original gym-anytrading accumulated reward function :$\sum(r_{curr} - r_{pre})$.
+The experimental results show that such a definition is better than the original gym-anytrading accumulated reward function :$\sum(r_{curr} - r_{pre})$.
 ### Render Function
 
 As you see, you can use `render` method to plot the position and profit at one episode.