Commit e51df0c

Update README.md
1 parent 42b6cdb commit e51df0c

1 file changed: +27 -15 lines changed in 2022/FA22/intro-ai-series/workshop-3-reinforcement-learning

2022/FA22/intro-ai-series/workshop-3-reinforcement-learning/README.md

Lines changed: 27 additions & 15 deletions
@@ -13,7 +13,7 @@
1313
![Intro to AI: Reinforcement Learning](./figures/W3_Header_Light.png#gh-light-mode-only)
1414
![Intro to AI: Reinforcement Learning](./figures/W3_Header_Dark.png#gh-dark-mode-only)
1515

16-
The official ACM AI **Intro to AI: Reinforcement Learning Workshop** repository. We demonstrate how to run reinforcement learning algorithms in a custom Pacman environment from Berkeley's [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj3/).
16+
The official ACM AI **Intro to AI: Reinforcement Learning Workshop** repository. In this workshop we will demonstrate how to run basic reinforcement learning algorithms in custom Gridworld, Pacman, and Crawler environments from Berkeley's [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj3/).
1717

1818
<!--
1919
SECTION: Table of Contents
@@ -57,28 +57,38 @@ conda env create -f environment.yaml
5757
conda activate ai
5858
```
5959

60-
Workshop "Intro to AI: Reinforcement Learning" consists of 2 components:
60+
Workshop "Intro to AI: Multi Agent Search Algorithms" consists of 2 components:
6161
- [Notebook](<!-- Local Path to Notebook -->) with completed code and explanations.
6262
- [Summary Graphic](<!-- Local Path to Summary Graphic -->) to summarize key points of the workshop. (To be added after workshop)
6363

64-
Please refer to [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj2/#welcome-to-multi-agent-pacman) for exact details on the code.
64+
Please refer to [CS 188](https://inst.eecs.berkeley.edu/~cs188/fa22/projects/proj3/) for exact details on the code.
6565

6666
## 1.2 Testing the Code
6767

6868
Try running
6969
```
70-
python gridworld.py -m
70+
python gridworld.py
7171
```
72-
to play in the gridworld environment manually and get a grasp on the environment.
72+
to play a game of Gridworld and get a grasp on the environment. Note that moves in the environment are non-deterministic: if you try to move up, there is only an 80% chance that you actually move up. We will primarily develop the following agents in the Gridworld environment, but if they are implemented correctly, they should also work in the Pacman and Crawler environments.
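The non-deterministic moves described above can be pictured with a small sampling sketch. This is an illustration only: `sample_move` and `PERPENDICULAR` are hypothetical names that do not appear in `gridworld.py`, and the choice to send the remaining 20% to the two perpendicular directions is an assumption rather than something this README states.

```python
import random

# Hypothetical lookup table (not from gridworld.py): the directions an agent
# may slip into when its intended move fails. Slipping sideways is an assumption.
PERPENDICULAR = {
    "up": ("left", "right"),
    "down": ("left", "right"),
    "left": ("up", "down"),
    "right": ("up", "down"),
}


def sample_move(intended, noise=0.2, rng=random):
    """Return the direction actually taken for an intended move.

    With probability 1 - noise the intended direction is used; otherwise
    one of the two perpendicular directions is picked uniformly at random.
    """
    if rng.random() < noise:
        return rng.choice(PERPENDICULAR[intended])
    return intended
```

Calling `sample_move("up")` many times should return `"up"` in roughly 80% of the calls under these assumptions.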
7373

74-
Algorithms:
74+
You can try running the Pacman and Crawler environments through:
75+
```
76+
python pacman.py
77+
```
78+
and
79+
```
80+
python crawler.py
81+
```
82+
respectively.
7583

76-
Q2 - Value Iteration
84+
Algorithms:
85+
Q1 - Value Iteration
7786

78-
Q3 - Q-Learning and Epsilon-Greedy
87+
Q3 - Q-Learning
7988

80-
Q4 - Deep Q-Learning
89+
Q4 - Epsilon Greedy
8190

91+
Q6 - Approximate Q-Learning
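As a rough picture of the first item in the list above, here is a minimal sketch of batch value iteration on a generic MDP. The function name and the three callables (`actions`, `transitions`, `reward`) are hypothetical and chosen for illustration; they do not mirror the interfaces in `mdp.py` or `valueIterationAgents.py`.

```python
def value_iteration(states, actions, transitions, reward, gamma=0.9, iterations=100):
    """Run a fixed number of batch value-iteration sweeps.

    Assumed (hypothetical) interfaces:
      actions(s)          -> list of actions available in state s (empty if terminal)
      transitions(s, a)   -> list of (next_state, probability) pairs
      reward(s, a, s2)    -> float reward for the transition s --a--> s2
    """
    values = {s: 0.0 for s in states}
    for _ in range(iterations):
        new_values = {}
        for s in states:
            # Expected return of each action under the previous sweep's values.
            q_values = [
                sum(p * (reward(s, a, s2) + gamma * values[s2])
                    for s2, p in transitions(s, a))
                for a in actions(s)
            ]
            # States with no available actions keep a value of 0.
            new_values[s] = max(q_values) if q_values else 0.0
        values = new_values
    return values
```

Each sweep computes the new values from the previous sweep's values only, which is why a fresh `new_values` dictionary is built before `values` is overwritten.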
8292
<!--
8393
Note: The above list will depend on your specific workshop.
8494
-->
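The Q-Learning and Epsilon Greedy items listed above fit together roughly as in the sketch below. It is a minimal tabular agent under stated assumptions: the class `TabularQAgent` and its method names are hypothetical and are not the ones used in `qlearningAgents.py`, and a plain `defaultdict` stands in for `util.Counter`.

```python
import random
from collections import defaultdict


class TabularQAgent:
    """Hypothetical tabular Q-learning agent with epsilon-greedy exploration."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, rng=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # probability of taking a random action
        self.rng = rng or random.Random()
        self.q = defaultdict(float)  # (state, action) -> Q-value, 0.0 if unseen

    def value(self, state):
        """Best Q-value available from the given state."""
        return max(self.q[(state, a)] for a in self.actions)

    def greedy_action(self, state):
        """Action with the highest Q-value (first one wins on ties)."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose_action(self, state):
        """Epsilon-greedy: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.greedy_action(state)

    def update(self, state, action, next_state, reward, terminal=False):
        """Move Q(state, action) toward the one-step sampled target."""
        target = reward if terminal else reward + self.gamma * self.value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

A training loop would call `choose_action` to pick a move, observe the reward and next state from the environment, and then call `update` with that transition.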
@@ -111,14 +121,16 @@ Q4 - Deep Q-Learning
111121
```bash
112122
intro-ai-series
113123
| -- figures
114-
| -- W3_Header_Dark.png
115124
| -- W3_Header_Light.png
125+
| -- W3_Header_Dark.png
116126
| -- src
117-
| -- gridworld.py # gridworld env
118-
| -- crawler.py # crawler env
119-
| -- pacman.py # pacman env
120-
| -- valueIterationAgents.py # implementation of value iteration algorithm
121-
| -- qlearningAgents.py # implementation of q-learning algorithm
127+
| -- valueIterationAgents.py # implement value iteration
128+
| -- qlearningAgents.py # implement q-learning
129+
| -- mdp.py # defines methods on general MDPs
130+
| -- learningAgents.py # defines the base classes for value iteration and q-learning agents, which your implementations extend
131+
| -- gridworld.py # implements gridworld
132+
| -- featureExtractors.py # extracts features from (state, action) pairs for approximate Q-learning
133+
| -- util.py # utility functions and data structures for implementing algorithms (optional to use), such as util.Counter (useful for Q-learning)
122134
| -- autograder.py # run this for determining the correctness of code
123135
| -- README.md
124136
```
