|
13 | 13 |  |
14 | 14 |  |
15 | 15 |
|
16 | | -The official ACM AI **Intro to AI: Reinforcement Learning Workshop** repository. We demonstrate how to run reinforcement learning algorithms in a custom Pacman environment from Berkeley's [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj3/). |
| 16 | +The official ACM AI **Intro to AI: Reinforcement Learning Workshop** repository. In this workshop we will demonstrate how to run basic reinforcement learning algorithms in custom Gridworld, Pacman, and Crawler environments from Berkeley's [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj3/). |
17 | 17 |
|
18 | 18 | <!-- |
19 | 19 | SECTION: Table of Contents |
@@ -57,28 +57,38 @@ conda env create -f environment.yaml |
57 | 57 | conda activate ai |
58 | 58 | ``` |
59 | 59 |
|
60 | | -Workshop "Intro to AI: Reinforcement Learning" consists of 2 components: |
| 60 | +Workshop "Intro to AI: Multi Agent Search Algorithms" consists of 2 components: |
61 | 61 | - [Notebook](<!-- Local Path to Notebook -->) with completed code and explanations. |
62 | 62 | - [Summary Graphic](<!-- Local Path to Summary Graphic -->) to summarize key points of the workshop. (To be added after workshop) |
63 | 63 |
|
64 | | -Please refer to [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj2/#welcome-to-multi-agent-pacman) for exact details on the code. |
| 64 | +Please refer to [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj3/) for exact details on the code. |
65 | 65 |
|
66 | 66 | ## 1.2 Testing the Code |
67 | 67 |
|
68 | 68 | Try running |
69 | 69 | ``` |
70 | | -python gridworld.py -m |
| 70 | +python gridworld.py |
71 | 71 | ``` |
72 | | -to play in the gridworld environment manually and get a grasp on the environment. |
| 72 | +to play a game of Gridworld and get a grasp on the environment. Note that moves in the environment are non-deterministic: if you try to move up, there is only an 80% chance that you actually move up. We will primarily develop the following agents in the Gridworld environment, but if they are implemented correctly, they should also work in the Pacman and Crawler environments. |
73 | 73 |
|
74 | | -Algorithms: |
| 74 | +You can try running the Pacman and Crawler environments through: |
| 75 | +``` |
| 76 | +python pacman.py |
| 77 | +``` |
| 78 | +and |
| 79 | +``` |
| 80 | +python crawler.py |
| 81 | +``` |
| 82 | +respectively. |
75 | 83 |
|
76 | | -Q2 - Value Iteration |
| 84 | +Algorithms: |
| 85 | +Q1 - Value Iteration |
77 | 86 |
|
78 | | -Q3 - Q-Learning and Epsilon-Greedy |
| 87 | +Q3 - Q-Learning |
79 | 88 |
|
80 | | -Q4 - Deep Q-Learning |
| 89 | +Q4 - Epsilon Greedy |
81 | 90 |
|
| 91 | +Q6 - Approximate Q-Learning |
82 | 92 | <!-- |
83 | 93 | Note: The above list will depend on your specific workshop. |
84 | 94 | --> |
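As a rough illustration of the Q-Learning and Epsilon Greedy items listed above, the following is a minimal tabular sketch. It is not the workshop's implementation: the function names `epsilon_greedy` and `q_update`, the string states and actions, and the default `alpha` and `gamma` values are all assumptions made for this example, and `collections.defaultdict` stands in for whatever table the real agent uses.

```python
import random
from collections import defaultdict


def epsilon_greedy(q_values, state, actions, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_values[(state, a)])


def q_update(q_values, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- (1 - alpha) * Q(s,a) + alpha * (r + gamma * max_a' Q(s',a'))."""
    # A terminal next state has no legal actions, so its value is taken as 0.
    next_value = max((q_values[(next_state, a)] for a in next_actions),
                     default=0.0)
    sample = reward + gamma * next_value
    q_values[(state, action)] = ((1 - alpha) * q_values[(state, action)]
                                 + alpha * sample)


# Hypothetical states and actions, used only to exercise the functions.
q = defaultdict(float)
q_update(q, "s0", "north", reward=1.0,
         next_state="s1", next_actions=["north", "south"])
```

Setting `epsilon` to 0 makes the agent purely greedy, while a value of 1 makes it act uniformly at random; values in between trade off exploration against exploitation.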
@@ -111,14 +121,16 @@ Q4 - Deep Q-Learning |
111 | 121 | ```bash |
112 | 122 | intro-ai-series |
113 | 123 | | -- figures |
114 | | - | -- W3_Header_Dark.png |
115 | 124 | | -- W3_Header_Light.png |
| 125 | + | -- W3_Header_Dark.png |
116 | 126 | | -- src |
117 | | - | -- gridworld.py # gridworld env |
118 | | - | -- crawler.py # crawler env |
119 | | - | -- pacman.py # pacman env |
120 | | - | -- valueIterationAgents.py # implementation of value iteration algorithm |
121 | | - | -- qlearningAgents.py # implementation of q-learning algorithm |
| 127 | + | -- valueIterationAgents.py # implement value iteration |
| 128 | + | -- qlearningAgents.py # implement q-learning |
| 129 | + | -- mdp.py # defines methods on general MDPs |
| 130 | + | -- learningAgents.py # defines base classes for value iteration and q-learning which will be extended in implementation |
| 131 | + | -- gridworld.py # implements gridworld |
| 132 | + | -- featureExtractors.py # extracts features from (state, action) pairs for approximate Q-learning |
| 133 | + | -- util.py # utility functions and data structures for implementing algorithms (optional to use), such as util.Counter (useful for Q-learning) |
122 | 134 | | -- autograder.py # run this for determining the correctness of code |
123 | 135 | | -- README.md |
124 | 136 | ``` |
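The file listing above mentions feature extraction over (state, action) pairs for approximate Q-learning. A minimal sketch of the usual linear form of that idea follows, assuming features arrive as a plain dictionary of name to value. The helper names `q_value` and `approx_q_update`, the feature names, and the `alpha` and `gamma` defaults are hypothetical and are not taken from `featureExtractors.py` or `qlearningAgents.py`.

```python
from collections import defaultdict


def q_value(weights, features):
    """Linear Q-value: the dot product of weights and feature values."""
    return sum(weights[name] * value for name, value in features.items())


def approx_q_update(weights, features, reward, next_q_max,
                    alpha=0.5, gamma=0.9):
    """Shift each weight in proportion to its feature value and the TD error."""
    # The TD error is computed once, before any weight changes.
    difference = (reward + gamma * next_q_max) - q_value(weights, features)
    for name, value in features.items():
        weights[name] += alpha * difference * value


weights = defaultdict(float)
features = {"bias": 1.0, "dist-to-food": 0.5}  # hypothetical feature values
approx_q_update(weights, features, reward=2.0, next_q_max=0.0)
```

Because the weights are shared across all states, one update changes the estimated value of every (state, action) pair that has the same features active, which is what lets this approach scale beyond a small grid.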
|