Implement Atari environment using envpool#136

Open

SamBoasman wants to merge 2 commits intomainfrom

Contributor

SamBoasman commented Dec 15, 2025

No description provided.


          Implement Atari environment using envpool

c1876be

SamBoasman requested a review from Copilot

December 15, 2025 04:14


          Auto-format code 🧹🌟🤖

c149add

Copilot started reviewing on behalf of SamBoasman

December 15, 2025 04:15

Copilot AI reviewed

View reviewed changes

Copilot AI left a comment

Pull request overview

This PR implements Atari environment support using the envpool library for high-performance, vectorized Atari game environments. The implementation leverages envpool's built-in frame stacking rather than the existing MultiModalWrapper approach used by other environments.

Key Changes:

Adds new AtariEnvironment class that wraps envpool for Atari games with configurable frame stacking
Integrates seed parameters into the environment factory to enable proper seeding of training and evaluation environments
Fixes a potential NameError in training_runner.py by properly initializing train_info

Reviewed changes

Copilot reviewed 5 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
scripts/util/configurations.py	Adds AtariConfig class with default 4-frame stacking configuration
scripts/environments/atari/atari_environment.py	New AtariEnvironment implementation using envpool with support for frame stacking, rendering, and seeding
scripts/environments/environment_factory.py	Extends factory to accept seed parameters, adds AtariEnvironment case, fixes "Unknown" typo
scripts/training_runner.py	Initializes train_info at loop start to prevent NameError when episode ends before policy update
scripts/base_runner.py	Passes train_seed and eval_seed to environment factory
requirements.txt	Adds envpool==0.8.4 dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

scripts/environments/atari/atari_environment.py

Comment on lines +80 to +88

+                          frame = np.stack([frame] * 3, axis=-1)
+                      return cv2.resize(frame, (width, height), interpolation=cv2.INTER_CUBIC)
+                  def render(self):
+                      frame = self.grab_frame()
+                      cv2.imshow(self.name, frame)
+                      cv2.waitKey(1)
+                  def get_overlay_info(self) -> dict:

Copilot AI Dec 15, 2025

The grab_frame method assumes that if state.shape has 4 dimensions, it's RGB and takes the last 3 channels with self.state[-3:]. However, with envpool's frame stacking (stack_num=4), the observation shape will be (4, 84, 84) for grayscale Atari games. This means len(self.state.shape) == 3, not 4, so it would go to the else branch. The logic should check the actual channel dimension rather than the number of dimensions to distinguish between RGB and grayscale stacked frames.

Copilot uses AI. Check for mistakes.

scripts/environments/atari/atari_environment.py

Comment on lines +20 to +21

		episodic_life=evaluation,
		reward_clip=evaluation,

Copilot AI Dec 15, 2025

The evaluation parameter is inverted for episodic_life and reward_clip. In Atari, these settings should be disabled during evaluation (set to False) for standard evaluation, and enabled during training (set to True). Currently, when evaluation=True, these are set to True, which is the opposite of the intended behavior.

Suggested change

      
                        episodic_life=evaluation,
          
                        reward_clip=evaluation,
          
                        episodic_life=not evaluation,
          
                        reward_clip=not evaluation,

Copilot uses AI. Check for mistakes.

scripts/environments/atari/atari_environment.py

Comment on lines +32 to +36

+                      return self.env.action_space.high[0]
+                  @cached_property
+                  def min_action_value(self) -> float:
+                      return self.env.action_space.low[0]

Copilot AI Dec 15, 2025

For Atari environments with Discrete action spaces, attempting to access .high[0] and .low[0] will raise an AttributeError because Discrete spaces don't have these attributes. These properties are only applicable to Box action spaces, but Atari games use Discrete action spaces.

Suggested change

      
                    return self.env.action_space.high[0]
          
                @cached_property
          
                def min_action_value(self) -> float:
          
                    return self.env.action_space.low[0]
          
                    if isinstance(self.env.action_space, spaces.Box):
          
                        return self.env.action_space.high[0]
          
                    elif isinstance(self.env.action_space, spaces.Discrete):
          
                        return self.env.action_space.n - 1
          
                    else:
          
                        raise ValueError(f"Unhandled action space type: {type(self.env.action_space)}")
          
                @cached_property
          
                def min_action_value(self) -> float:
          
                    if isinstance(self.env.action_space, spaces.Box):
          
                        return self.env.action_space.low[0]
          
                    elif isinstance(self.env.action_space, spaces.Discrete):
          
                        return 0
          
                    else:
          
                        raise ValueError(f"Unhandled action space type: {type(self.env.action_space)}")

Copilot uses AI. Check for mistakes.

scripts/environments/atari/atari_environment.py

+                      return np.array([self.env.action_space.sample()], dtype=int)
+                  def set_seed(self, seed: int) -> None:
+                      self.env.reset()

Copilot AI Dec 15, 2025

Calling self.env.reset() without using seed parameter doesn't actually seed the environment. This method should accept and use the seed parameter to properly seed the environment, similar to how OpenAIEnvironment handles it with self.env.reset(seed=seed).

Suggested change

      
                    self.env.reset()
          
                    self.env.reset(seed=seed)

Copilot uses AI. Check for mistakes.

scripts/environments/atari/atari_environment.py

Comment on lines +83 to +86

+                  def render(self):
+                      frame = self.grab_frame()
+                      cv2.imshow(self.name, frame)
+                      cv2.waitKey(1)

Copilot AI Dec 15, 2025

The window name is created using the task name and seed, but if the config.display is not 1, the self.name attribute won't be set. This will cause an AttributeError when render() is called and tries to use self.name. Consider initializing self.name regardless of the display setting or adding a conditional check in the render method.

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet