You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Below is the pairwise algorithm comparison; the value for some row R and column C corresponds to the win rate of R against C. For example, I (human) beat MCTS deep Q-learning 20% of the time.
108
+
109
+
|| human | mcts deep q learning | mcts advancement | mcts rollout | ab relative advancement | relative advancement | advancement | random |
The MCTS rollout algorithm outperforms all other players, including the human (myself, an average player). The MCTS deep Q-learning algorithm is second, although it beats MCTS rollout when allowed less than .2 second per move.
0 commit comments