Commit df1623a

Update README.md
1 parent 032242b commit df1623a

1 file changed: +1 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ The environment:
 
 Goal is to learn how to take actions in order to maximize the reward. The objective function is as following:
 
-<b>Q[s, a] = Q[s, a] + λ * (r + γ * max (Q[s_, a_]) – Q[s, a]),</b>
+<b>Q_[s_, a_] = Q[s, a] + λ * (r + γ * max (Q[s_, a_]) – Q[s, a]),</b>
 
 where,
 <br/><b>s</b> – current position of the agent,
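
Note on the changed line: the right-hand side is the usual tabular Q-learning update, and in its standard form the result is written back to `Q[s, a]`, as on the removed line. A minimal Python sketch of one such update follows, for illustration only. The function name `q_update`, the parameter names `lam` (λ) and `gamma` (γ), and the small table are hypothetical and are not taken from the repository.

```python
def q_update(Q, s, a, r, s_next, lam=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q[s][a].

    Q is a list of per-state lists of action values (an assumed layout).
    lam is the learning rate (λ) and gamma is the discount factor (γ).
    """
    best_next = max(Q[s_next])                  # max over a_ of Q[s_, a_]
    td_error = r + gamma * best_next - Q[s][a]  # r + γ * max(Q[s_, a_]) - Q[s, a]
    Q[s][a] = Q[s][a] + lam * td_error          # write back to the current state-action pair
    return Q[s][a]


# Hypothetical table with 3 states and 2 actions.
Q = [[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
new_value = q_update(Q, s=0, a=1, r=1.0, s_next=1, lam=0.5, gamma=0.9)
print(new_value)  # roughly 0.95, i.e. 0 + 0.5 * (1 + 0.9 * 1 - 0)
```

Only the entry for the current pair `(s, a)` changes; the next state `s_next` is read but not modified.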
