Commit dde4c59
Unsupervised (#25)
* Version update initial implementation unsupervised
* Mask out rewards
* Reload new agent when transitioning from unsupervised
* Take the mean
* Default flags
* Unsupervised in agent
* Change only replay buffer
* Learn reward and policy in unsupervised
* Update configs
* Scale up rewards
* Update exploration steps
* Initialize model weights better

1 parent 2a176dd commit dde4c59
File tree
10 files changed
+90 -95 lines changed
- safe_opax
- configs
- agent
- experiment
- la_mbda
- rl
This file was deleted.
This file was deleted.
Lines changed: 9 additions & 8 deletions
This file was deleted.