
Commit 7144e86: Update readme
1 parent 463c3f2

File tree: 1 file changed, +12 −12 lines

README.md

Lines changed: 12 additions & 12 deletions
@@ -104,18 +104,18 @@ All the agents have been evaluated against each other under controlled condition

See [comparison.ipynb](notebooks/comparison.ipynb) for reproducible code.

Below is the pairwise algorithm comparison; the value in row R, column C is the win rate of R against C. For example, I (human) beat MCTS deep Q-learning 20% of the time.

|                             | human | mcts deep q learning | mcts advancement | mcts rollout | ab relative advancement | relative advancement | advancement | random |
| :-------------------------- | ----: | -------------------: | ---------------: | -----------: | ----------------------: | -------------------: | ----------: | -----: |
| **human**                   |       |                  0.2 |              0.4 |            0 |                     0.8 |                    1 |           1 |      1 |
| **mcts deep q learning**    |   0.8 |                      |             0.75 |         0.24 |                    0.54 |                    1 |           1 |      1 |
| **mcts advancement**        |   0.6 |                 0.25 |                  |         0.06 |                    0.32 |                    1 |           1 |      1 |
| **mcts rollout**            |     1 |                 0.76 |             0.94 |              |                    0.77 |                 0.98 |        0.99 |      1 |
| **ab relative advancement** |   0.2 |                 0.46 |             0.68 |         0.23 |                         |                    1 |           1 |      1 |
| **relative advancement**    |     0 |                    0 |                0 |         0.02 |                       0 |                      |         0.5 |   0.97 |
| **advancement**             |     0 |                    0 |                0 |         0.01 |                       0 |                  0.5 |             |   0.95 |
| **random**                  |     0 |                    0 |                0 |            0 |                       0 |                 0.03 |        0.05 |        |
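A table like this can be built from raw match records. The sketch below uses a hypothetical `(row_player, col_player, winner)` result format, not the notebook's actual data; the real results live in [comparison.ipynb](notebooks/comparison.ipynb).

```python
from collections import defaultdict

# Hypothetical match records: (row_player, col_player, winner).
matches = [
    ("human", "mcts deep q learning", "mcts deep q learning"),
    ("human", "mcts deep q learning", "mcts deep q learning"),
    ("human", "mcts deep q learning", "mcts deep q learning"),
    ("human", "mcts deep q learning", "mcts deep q learning"),
    ("human", "mcts deep q learning", "human"),
]

def win_rates(matches):
    """Win rate of each (row, col) pairing, as read in the table above."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for row, col, winner in matches:
        games[(row, col)] += 1
        if winner == row:
            wins[(row, col)] += 1
    return {pair: wins[pair] / n for pair, n in games.items()}

print(win_rates(matches))  # {('human', 'mcts deep q learning'): 0.2}
```

Note that the table is consistent with this reading: each off-diagonal pair of cells (R, C) and (C, R) sums to 1.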

The MCTS rollout algorithm outperforms all other players, including the human (myself, an average player). The MCTS deep Q-learning algorithm comes second, although it beats MCTS rollout when allowed less than 0.2 seconds per move.
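The per-move time budget matters because MCTS is an anytime algorithm: it can be stopped after any number of simulations, and more simulations generally mean stronger play. A minimal sketch of a time-budgeted search loop, assuming hypothetical `simulate_once` and `best_child` helpers (not this repo's actual API):

```python
import time

def timed_search(root, simulate_once, best_child, budget_s=0.2):
    """Run MCTS simulations until the per-move time budget expires."""
    deadline = time.monotonic() + budget_s
    simulations = 0
    while time.monotonic() < deadline:
        simulate_once(root)  # one selection/expansion/rollout/backup pass
        simulations += 1
    return best_child(root), simulations
```

Under a budget this tight, cheap random rollouts can complete far fewer simulations than usual, which is one plausible reason the deep Q-learning variant pulls ahead below 0.2 seconds per move.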
