|
3 | 3 | This repository contains reinforcement learning training code for the following
|
4 | 4 | strategy types:
|
5 | 5 | * Lookup tables (LookerUp)
|
6 |
| -* Particle Swarm algorithms (PSOGambler) |
| 6 | +* Particle Swarm algorithms (PSOGambler), a stochastic version of LookerUp |
7 | 7 | * Feed Forward Neural Network (EvolvedANN)
|
8 | 8 | * Finite State Machine (FSMPlayer)
|
| 9 | +* Hidden Markov Models (HMMPlayer), essentially a stochastic version of a finite state machine |
9 | 10 |
|
10 |
| -The training is done by evolutionary algorithms or particle swarm algorithms. There |
11 |
| -is another repository that trains Neural Networks with gradient descent. In this |
12 |
| -repository there are scripts for each strategy type: |
| 11 | +Models are trained with [evolutionary algorithms](https://en.wikipedia.org/wiki/Evolutionary_algorithm) |
| 12 | + or [particle swarm algorithms](https://en.wikipedia.org/wiki/Particle_swarm_optimization). |
| 13 | +There is another repository in the [Axelrod project](https://github.com/Axelrod-Python/Axelrod) |
| 14 | +that trains neural networks with gradient descent (using TensorFlow) |
| 15 | +and will likely be incorporated here. In this repository there are scripts |
| 16 | +for each strategy type with a similar interface: |
13 | 17 |
|
14 | 18 | * [looker_evolve.py](looker_evolve.py)
|
15 | 19 | * [pso_evolve.py](pso_evolve.py)
|
16 | 20 | * [ann_evolve.py](ann_evolve.py)
|
17 | 21 | * [fsm_evolve.py](fsm_evolve.py)
|
| 22 | +* [hmm_evolve.py](hmm_evolve.py) |
| 23 | + |
| 24 | +See below for usage instructions. |
18 | 25 |
|
19 | 26 | In the original iteration the strategies were run against all the default
|
20 |
| -strategies in the Axelrod library. This is slow and probably not necessary. For |
21 |
| -example the Meta players are just combinations of the other players, and very |
22 |
| -computationally intensive; it's probably ok to remove those. So by default the |
23 |
| -training strategies are the `short_run_time_strategies` from the Axelrod library. |
| 27 | +strategies in the Axelrod library. This is slow and probably not necessary. |
| 28 | +By default the training strategies are the `short_run_time_strategies` from the |
| 29 | +Axelrod library. You may specify any other set of strategies for training. |
| 30 | + |
| 31 | +Basic testing is done by running the trained model against the full set of |
| 32 | +strategies in various tournaments. Testing methods vary with the |
| 33 | +optimization function. |
24 | 34 |
|
25 | 35 | ## The Strategies
|
26 | 36 |
|
27 |
| -The LookerUp strategies are based on lookup tables with three parameters: |
28 |
| -* n1, the number of rounds of trailing history to use and |
| 37 | +**LookerUp** is based on lookup tables with three parameters: |
| 38 | +* n1, the number of rounds of trailing history to use |
29 | 39 | * n2, the number of rounds of trailing opponent history to use
|
30 | 40 | * m, the number of rounds of initial opponent play to use
|
31 | 41 |
|
32 |
| -PSOGambler is a stochastic version of LookerUp, trained with a particle swarm |
33 |
| -algorithm. The resulting strategies are generalizations of memory-N strategies. |
| 42 | +These are generalizations of deterministic memory-N strategies. |
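To make the three parameters concrete, here is a minimal sketch of how a lookup table keyed on `n1`, `n2` and `m` might pick a move. The helper names (`make_key`, `lookup_move`), the key layout and the default opening move are assumptions for illustration and are not taken from the repository or the Axelrod library.

```python
from itertools import product

C, D = "C", "D"

def make_key(my_history, opp_history, n1, n2, m):
    """Build a key from the last n1 own plays, the last n2 opponent plays,
    and the opponent's first m plays (hypothetical key layout)."""
    own_tail = tuple(my_history[-n1:]) if n1 else ()
    opp_tail = tuple(opp_history[-n2:]) if n2 else ()
    opp_start = tuple(opp_history[:m])
    return (own_tail, opp_tail, opp_start)

def lookup_move(table, my_history, opp_history, n1, n2, m, default=C):
    """Return the table's action, or `default` while the history is too short."""
    if len(my_history) < max(n1, n2, m):
        return default
    return table[make_key(my_history, opp_history, n1, n2, m)]

# With n1=1, n2=1, m=0, this table copies the opponent's last move (Tit For Tat).
n1, n2, m = 1, 1, 0
table = {
    ((own,), (opp,), ()): opp
    for own, opp in product([C, D], repeat=2)
}
```

Training then amounts to searching over the values stored in `table`; the number of entries grows as `2 ** (n1 + n2 + m)`, which is one reason large parameter values invite overfitting.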
| 43 | + |
| 44 | +**PSOGambler** is a stochastic version of LookerUp, trained with a particle |
| 45 | +swarm algorithm. The resulting strategies are generalizations of memory-N |
| 46 | +strategies. |
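The difference from a deterministic table can be sketched as follows: each entry holds a probability of cooperating rather than a fixed action. The function name and the table keyed on the opponent's last move are hypothetical and only illustrate the idea.

```python
import random

def gambler_move(cooperate_probs, key, rng=random):
    """Cooperate with the probability stored for `key`; defect otherwise."""
    return "C" if rng.random() < cooperate_probs[key] else "D"

# Hypothetical table keyed by the opponent's last move.
probs = {"C": 1.0, "D": 0.25}
```

When every probability is 0 or 1 the table behaves like a deterministic lookup table, so a particle swarm searching over these probabilities covers the deterministic case as well.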
| 47 | + |
| 48 | +**EvolvedANN** is based on a [feed forward neural network](https://en.wikipedia.org/wiki/Feedforward_neural_network) |
| 49 | +with a single hidden layer. Various features are derived from the history of play. |
| 50 | +The number of nodes in the hidden layer can also be changed. |
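As an illustration only, a single-hidden-layer network deciding a move from history features could look like the sketch below. The feature set, the ReLU activation and the decision threshold are assumptions; the repository's actual features and network details may differ.

```python
def features(my_history, opp_history):
    """A few illustrative features derived from the history of play."""
    n = len(opp_history)
    return [
        1.0 if n == 0 else 0.0,                        # first-round flag
        opp_history.count("D") / n if n else 0.0,      # opponent defection rate
        1.0 if n and opp_history[-1] == "D" else 0.0,  # opponent just defected
        1.0 if n and my_history[-1] == "D" else 0.0,   # we just defected
    ]

def ann_move(x, hidden_weights, hidden_biases, output_weights, output_bias=0.0):
    """Feed x through one hidden layer (ReLU) and cooperate if the output is positive."""
    hidden = [
        max(0.0, bias + sum(w * xi for w, xi in zip(weights, x)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    output = output_bias + sum(w * h for w, h in zip(output_weights, hidden))
    return "C" if output > 0 else "D"

# One hidden node that switches off when the opponent has just defected.
hidden_weights = [[0.0, 0.0, -1.0, 0.0]]
hidden_biases = [0.5]
output_weights = [1.0]
```

The length of `hidden_weights` is the number of hidden nodes, which is the size parameter the paragraph above refers to.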
34 | 51 |
|
35 |
| -EvolvedANN is one hidden layer feed forward neural network based algorithm. |
36 |
| -Various features are derived from the history of play. The number of nodes in |
37 |
| -the hidden layer can be changed. |
| 52 | +**EvolvedFSM** searches over [finite state machines](https://en.wikipedia.org/wiki/Finite-state_machine) |
| 53 | +with a given number of states. |
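A finite state machine strategy can be pictured as a transition table from (state, opponent's last action) to (next state, own action). The class below is a hypothetical sketch of that idea, not the repository's or the Axelrod library's implementation.

```python
class SimpleFSM:
    """Hypothetical deterministic finite state machine player."""

    def __init__(self, transitions, initial_state=0):
        # transitions maps (state, opponent_action) -> (next_state, own_action)
        self.transitions = transitions
        self.state = initial_state

    def move(self, opponent_last_action):
        self.state, action = self.transitions[(self.state, opponent_last_action)]
        return action

# Two states: cooperate until the opponent defects once, then defect forever.
grim_transitions = {
    (0, "C"): (0, "C"),
    (0, "D"): (1, "D"),
    (1, "C"): (1, "D"),
    (1, "D"): (1, "D"),
}
grim_fsm = SimpleFSM(grim_transitions)
```

A search over machines with a given number of states is then a search over the entries of a table like `grim_transitions`.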
38 | 54 |
|
39 |
| -EvolvedFSM searches over finite state machines with a given number of states. |
| 55 | +**EvolvedHMM** implements a simple [hidden Markov model](https://en.wikipedia.org/wiki/Hidden_Markov_model) |
| 56 | +based strategy, a stochastic finite state machine. |
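The phrase "stochastic finite state machine" can be sketched as follows: the next state is sampled from a row of a transition matrix chosen by the opponent's last action, and the move is sampled from a per-state cooperation probability. The class and parameter names are assumptions for illustration and do not reflect the repository's code.

```python
import random

class SimpleHMM:
    """Hypothetical stochastic state machine player."""

    def __init__(self, transitions_c, transitions_d, cooperate_probs,
                 initial_state=0, seed=None):
        # One row-stochastic matrix per opponent action.
        self.transitions = {"C": transitions_c, "D": transitions_d}
        self.cooperate_probs = cooperate_probs
        self.state = initial_state
        self.rng = random.Random(seed)

    def move(self, opponent_last_action):
        row = self.transitions[opponent_last_action][self.state]
        # Sample the next state, then sample an action from that state.
        self.state = self.rng.choices(range(len(row)), weights=row)[0]
        return "C" if self.rng.random() < self.cooperate_probs[self.state] else "D"

# With only 0/1 probabilities the model reduces to a deterministic machine
# that cooperates until the opponent defects once.
grim = SimpleHMM(
    transitions_c=[[1.0, 0.0], [0.0, 1.0]],
    transitions_d=[[0.0, 1.0], [0.0, 1.0]],
    cooperate_probs=[1.0, 0.0],
)
```

Intermediate probabilities give behaviour that no deterministic machine with the same number of states can express, at the cost of more parameters to fit.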
40 | 57 |
|
41 |
| -Note that large values of the parameters will make the strategies prone to |
| 58 | +Note that large values of some parameters will make the strategies prone to |
42 | 59 | overfitting.
|
43 | 60 |
|
44 | 61 | ## Optimization Functions
|
@@ -183,6 +200,36 @@ Options:
|
183 | 200 | --states NUM_STATES Number of FSM states [default: 8]
|
184 | 201 | ```
|
185 | 202 |
|
| 203 | +
|
| 204 | +### Hidden Markov Model |
| 205 | +
|
| 206 | +```bash |
| 207 | +$ python hmm_evolve.py -h |
| 208 | +Hidden Markov Model Evolver |
| 209 | +
|
| 210 | +Usage: |
| 211 | + hmm_evolve.py [-h] [--generations GENERATIONS] [--population POPULATION] |
| 212 | + [--mu MUTATION_RATE] [--bottleneck BOTTLENECK] [--processes PROCESSORS] |
| 213 | + [--output OUTPUT_FILE] [--objective OBJECTIVE] [--repetitions REPETITIONS] |
| 214 | + [--turns TURNS] [--noise NOISE] [--nmoran NMORAN] |
| 215 | + [--states NUM_STATES] |
| 216 | +
|
| 217 | +Options: |
| 218 | + -h --help Show help |
| 219 | + --generations GENERATIONS Generations to run the EA [default: 500] |
| 220 | + --population POPULATION Population size [default: 40] |
| 221 | + --mu MUTATION_RATE Mutation rate [default: 0.1] |
| 222 | + --bottleneck BOTTLENECK Number of individuals to keep from each generation [default: 10] |
| 223 | + --processes PROCESSES Number of processes to use [default: 1] |
| 224 | + --output OUTPUT_FILE File to write data to [default: fsm_tables.csv] |
| 225 | + --objective OBJECTIVE Objective function [default: score] |
| 226 | + --repetitions REPETITIONS Repetitions in objective [default: 100] |
| 227 | + --turns TURNS Turns in each match [default: 200] |
| 228 | + --noise NOISE Match noise [default: 0.00] |
| 229 | + --nmoran NMORAN Moran Population Size, if Moran objective [default: 4] |
| 230 | + --states NUM_STATES Number of FSM states [default: 5] |
| 231 | +``` |
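The `--generations`, `--population`, `--mu` and `--bottleneck` options suggest an evolutionary loop of roughly the following shape. This is a simplified sketch on a toy bit-string problem, assuming survivors are kept unchanged and the population is refilled with mutated copies; it omits crossover and is not the repository's implementation. All function names here are hypothetical.

```python
import random

def evolve(score, random_individual, mutate,
           generations=50, population=40, bottleneck=10, mu=0.1, seed=0):
    """Keep the best `bottleneck` individuals each generation and refill
    the population with mutated copies of the survivors."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(population)]
    for _ in range(generations):
        survivors = sorted(pop, key=score, reverse=True)[:bottleneck]
        pop = list(survivors)
        while len(pop) < population:
            pop.append(mutate(rng.choice(survivors), mu, rng))
    return max(pop, key=score)

def random_bits(rng, length=10):
    """Toy individual: a list of random bits."""
    return [rng.randint(0, 1) for _ in range(length)]

def flip_bits(bits, mu, rng):
    """Flip each bit independently with probability mu."""
    return [1 - b if rng.random() < mu else b for b in bits]

# Toy objective: maximise the number of ones.
best = evolve(score=sum, random_individual=random_bits, mutate=flip_bits)
```

In the training scripts the individual would be a strategy's parameters and `score` would come from the chosen objective over matches with the given `--turns`, `--repetitions` and `--noise`.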
| 232 | +
|
186 | 233 | ## Open questions
|
187 | 234 |
|
188 | 235 | * What's the best table for n1, n2, m for LookerUp and PSOGambler? What's the
|