Commit 6f9de03

Update readme
1 parent 5054d82 commit 6f9de03

1 file changed: +4 -2 lines changed

README.md

Lines changed: 4 additions & 2 deletions
@@ -120,9 +120,11 @@ Below is the pairwise algorithm comparison; the value for some row R and column
 | **random** | 0 | 0 | 0 | 0 | 0 | 0.03 | 0.05 | |


-The MCTS rollout algorithm outperforms all other players, including the human (myself, an average player). The MCTS deep Q-learning algorithm is second, although it beats MCTS rollout when allowed less than .2 second per move.
+The MCTS rollout algorithm outperforms all other players, including the human (myself, an average player). The MCTS deep Q-learning (DQL) algorithm is second, although it beats MCTS rollout when allowed less than .2 second per move.

-There are two components determining the quality of a tree-search algorithm: number of searches and the quality of evaluation at the end of each search. MCTS rollout has poorer evaluation quality but it makes vastly more (10x) searches than MCTS deep Q-learning. That's why MCTS rollout is more performant than MCTS deep Q-learning (especially so when the neural-network inference runs slowly on a CPU—instead of a GPU).
+There are two components determining the quality of a tree-search algorithm: the number of searches and the quality of evaluation at the end of each search. MCTS rollout has poorer evaluation quality but it makes vastly more (10x) searches than MCTS DQL. That's why MCTS rollout is more performant than MCTS DQL (at least when the neural-network inference runs slowly on a CPU—instead of a GPU).
+
+The question whether a specific type of MCTS DQL may ever beat MCTS rollout at Squadro is still left hanging. It is well-known that RL algorithms outperform more basic tree-search algorithms (like rollout) for games like Chess and Go. But those games have a much larger state space than Squadro 5x5. For small state spaces, heavy state evaluation through neural networks tends to have less value since rolling out from one state to the end is very quick.

 ## Usage