Commit 93bdd8c
authored
Sneaky noop: action=-1 (#14)
It is useful to have a NOOP action in the environment, for "thinking
time" or other ablations. But we don't want to make the NNs that learn
from it use NOOPs.
Here, we intentionally send -1 (or any other invalid action <0). We
catch that case and make the environment not reset.File tree
2 files changed
+48
-2
lines changed- envpool/sokoban
2 files changed
+48
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
76 | 77 | | |
77 | 78 | | |
78 | 79 | | |
79 | | - | |
80 | | - | |
81 | 80 | | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
82 | 89 | | |
| 90 | + | |
83 | 91 | | |
84 | 92 | | |
85 | 93 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
342 | 342 | | |
343 | 343 | | |
344 | 344 | | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | + | |
| 352 | + | |
| 353 | + | |
| 354 | + | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
| 358 | + | |
| 359 | + | |
| 360 | + | |
| 361 | + | |
| 362 | + | |
| 363 | + | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
| 367 | + | |
| 368 | + | |
| 369 | + | |
| 370 | + | |
| 371 | + | |
| 372 | + | |
| 373 | + | |
| 374 | + | |
| 375 | + | |
| 376 | + | |
| 377 | + | |
| 378 | + | |
| 379 | + | |
| 380 | + | |
| 381 | + | |
| 382 | + | |
345 | 383 | | |
346 | 384 | | |
347 | 385 | | |
0 commit comments