Pull request overview
This PR implements Atari environment support using the envpool library for high-performance, vectorized Atari game environments. The implementation leverages envpool's built-in frame stacking rather than the existing MultiModalWrapper approach used by other environments.
Key Changes:
- Adds new AtariEnvironment class that wraps envpool for Atari games with configurable frame stacking
- Integrates seed parameters into the environment factory to enable proper seeding of training and evaluation environments
- Fixes a potential NameError in training_runner.py by properly initializing train_info
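The `train_info` fix can be sketched as follows; the loop shape and names (`run_training`, `update_policy`) are illustrative assumptions, not the repository's actual code:

```python
def run_training(total_steps, warmup_steps, update_policy):
    """Sketch of the training_runner.py fix: train_info is initialized
    before the loop, so referencing it is safe even when an episode
    ends before the first policy update has happened."""
    train_info = {}  # initialized up front instead of only inside the update branch
    logged = []
    for step in range(total_steps):
        if step >= warmup_steps:
            train_info = update_policy(step)
        logged.append((step, train_info))  # no NameError before the first update
    return logged
```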
Reviewed changes
Copilot reviewed 5 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| scripts/util/configurations.py | Adds AtariConfig class with default 4-frame stacking configuration |
| scripts/environments/atari/atari_environment.py | New AtariEnvironment implementation using envpool with support for frame stacking, rendering, and seeding |
| scripts/environments/environment_factory.py | Extends factory to accept seed parameters, adds AtariEnvironment case, fixes "Unknown" typo |
| scripts/training_runner.py | Initializes train_info at loop start to prevent NameError when episode ends before policy update |
| scripts/base_runner.py | Passes train_seed and eval_seed to environment factory |
| requirements.txt | Adds envpool==0.8.4 dependency |
```python
        frame = np.stack([frame] * 3, axis=-1)
        return cv2.resize(frame, (width, height), interpolation=cv2.INTER_CUBIC)

    def render(self):
        frame = self.grab_frame()
        cv2.imshow(self.name, frame)
        cv2.waitKey(1)

    def get_overlay_info(self) -> dict:
```
The grab_frame method assumes that a 4-dimensional state.shape means RGB and takes the last 3 channels with self.state[-3:]. However, with envpool's frame stacking (stack_num=4), the observation shape for grayscale Atari games is (4, 84, 84), so len(self.state.shape) == 3 and execution falls through to the else branch. The logic should distinguish RGB from grayscale stacked frames by checking the channel dimension, not the number of dimensions.
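A channel-dimension check along the lines this comment suggests could look like the sketch below; the helper name and the assumed envpool observation layouts are illustrative, not the PR's actual code:

```python
import numpy as np

def latest_rgb_frame(state: np.ndarray) -> np.ndarray:
    """Return the newest frame as (H, W, 3), deciding grayscale vs RGB
    by the channel dimension rather than the number of dimensions."""
    state = np.asarray(state)
    if state.ndim == 3 and state.shape[-1] == 3:
        return state                       # single RGB frame (H, W, 3)
    if state.ndim == 3:                    # grayscale stack (stack, H, W)
        frame = state[-1]                  # newest frame in the stack
        return np.stack([frame] * 3, axis=-1)
    if state.ndim == 4:                    # RGB stack (stack, H, W, 3)
        return state[-1]
    raise ValueError(f"Unexpected observation shape: {state.shape}")
```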
```python
            episodic_life=evaluation,
            reward_clip=evaluation,
```
The evaluation parameter is inverted for episodic_life and reward_clip. In Atari, both should be enabled during training (True) and disabled during evaluation (False) for standard evaluation, but here evaluation=True sets them to True, the opposite of the intended behavior.
Suggested change:

```diff
-            episodic_life=evaluation,
-            reward_clip=evaluation,
+            episodic_life=not evaluation,
+            reward_clip=not evaluation,
```
```python
        return self.env.action_space.high[0]

    @cached_property
    def min_action_value(self) -> float:
        return self.env.action_space.low[0]
```
Atari games use Discrete action spaces, so accessing .high[0] and .low[0] will raise an AttributeError: Discrete spaces don't have these attributes, which exist only on Box action spaces.
Suggested change:

```diff
-        return self.env.action_space.high[0]
+        if isinstance(self.env.action_space, spaces.Box):
+            return self.env.action_space.high[0]
+        elif isinstance(self.env.action_space, spaces.Discrete):
+            return self.env.action_space.n - 1
+        else:
+            raise ValueError(f"Unhandled action space type: {type(self.env.action_space)}")

     @cached_property
     def min_action_value(self) -> float:
-        return self.env.action_space.low[0]
+        if isinstance(self.env.action_space, spaces.Box):
+            return self.env.action_space.low[0]
+        elif isinstance(self.env.action_space, spaces.Discrete):
+            return 0
+        else:
+            raise ValueError(f"Unhandled action space type: {type(self.env.action_space)}")
```
```python
        return np.array([self.env.action_space.sample()], dtype=int)

    def set_seed(self, seed: int) -> None:
        self.env.reset()
```
Calling self.env.reset() without the seed parameter doesn't actually seed the environment. This method should accept and pass through the seed, similar to how OpenAIEnvironment handles it with self.env.reset(seed=seed).
Suggested change:

```diff
-        self.env.reset()
+        self.env.reset(seed=seed)
```
```python
    def render(self):
        frame = self.grab_frame()
        cv2.imshow(self.name, frame)
        cv2.waitKey(1)
```
The window name is built from the task name and seed, but self.name is only set when config.display is 1. If render() is called with display disabled, the reference to self.name raises an AttributeError. Consider initializing self.name regardless of the display setting, or adding a conditional check in render().
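One way to address this is sketched below: set self.name unconditionally and guard render() on the display flag. The constructor arguments are assumptions, not the PR's actual signature:

```python
class AtariEnvironmentSketch:
    """Sketch: the window name is always initialized, so render-time
    lookups of self.name cannot raise AttributeError even when the
    display is disabled."""

    def __init__(self, task: str, seed: int, display: int = 0):
        self.name = f"{task}-{seed}"  # set regardless of the display setting
        self.display = display

    def render(self):
        if not self.display:
            return  # guard: nothing to show when display is disabled
        # cv2.imshow(self.name, self.grab_frame()); cv2.waitKey(1)
```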