
sjsu-interconnect/ourhexgame


Tournament [update on Dec 1, 2024]

Instructions

  • Upload your agent (and related files) to this GitHub repo. (You can find your group's directory, agent_groupX.)
  • There is no need to create a separate branch for the agent; just push everything to the main branch.
  • Right before every game, pull the repo so you have the latest opponent agents.
    • If a group fails to upload its updated agent before the cutoff time, we will use the default one pre-uploaded to the repo.
  • Please designate one person as the executor for all games, for smooth transitions.
    • Please make sure this person's laptop has the required libraries (PyTorch, TensorFlow, scikit-learn, ...).
  • For the group stage, every group competes with the other groups in the same group. Only one group proceeds to the knockout stage.
  • If two groups end with the same results, we will break the tie with another game in a different reward environment.

Our Hex Game

We are going to create a common Hex game environment, OurHexGame, for our PA5 and the final project.

Assumption

  • The board size is 11x11.

Agents

  • possible_agents: ["player_1", "player_2"]
  • "player_1": red, connects vertically (top edge to bottom edge)
  • "player_2": blue, connects horizontally (left edge to right edge)

Observation

import numpy as np
from gymnasium.spaces import Dict, Box, Discrete

Dict({
    # 0 = empty, 1 = player_1 (red), 2 = player_2 (blue)
    "observation": Box(low=0, high=2, shape=(board_size, board_size), dtype=np.int8),
    "pie_rule_used": Discrete(2),  # 1 if used, 0 otherwise
})

Info

  • info includes the current player's direction: horizontal (0) or vertical (1).
  • The environment should provide an action mask to indicate invalid actions. (We can repurpose the observation; a sketch follows this list.)
    • In the obs, if a hex is marked with 1 or 2, the mask should be 0 (invalid).
    • Otherwise, the mask is 1 (valid).
    • The pie-rule action should be masked as well (valid only while the pie rule is still available).
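
A minimal sketch of building such a mask from the observation, assuming the flat index layout described under Action below; build_action_mask is an illustrative helper, not part of the repo:

import numpy as np

def build_action_mask(obs: dict, board_size: int = 11) -> np.ndarray:
    """Return a 0/1 mask over Discrete(board_size * board_size + 1)."""
    mask = np.zeros(board_size * board_size + 1, dtype=np.int8)
    board = np.asarray(obs["observation"]).reshape(board_size, board_size)
    mask[:-1] = (board.flatten() == 0)  # empty hexes (0) are valid moves
    # The last action is the pie rule: valid only if not yet used (the env
    # may further restrict it, e.g., to player_2's first turn).
    mask[-1] = 1 - obs["pie_rule_used"]
    return mask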

Action

  • Discrete(board_size x board_size + 1)
    • Line 1: 1A (0), 1B, ..., 1K (10)
    • Line 2: 2A (11), 2B, ..., 2K (21)
    • ...
    • The last action (index 121 for the 11x11 board) is the pie rule (see the index helpers below).
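
A small sketch of this row-major layout; the helper names are illustrative, not from the repo:

def action_to_cell(action: int, board_size: int = 11):
    """Return (row, col) for a board action, or None for the pie rule."""
    if action == board_size * board_size:
        return None  # the extra final action is the pie rule
    return divmod(action, board_size)

def cell_to_action(row: int, col: int, board_size: int = 11) -> int:
    return row * board_size + col

assert action_to_cell(11) == (1, 0)  # "2A"
assert cell_to_action(0, 10) == 10   # "1K"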

Reward Sparse

  • Define a sparse_flag to turn the sparse-reward environment on/off. If False, the environment should use the dense-reward scheme below.
  • Win +1
  • Lose -1
  • Otherwise, 0

Reward Dense

  • Each step, -1
  • Win +floor((board_size * board_size)/2)
  • Lose -ceil((board_size * board_size)/2)
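
For the default 11x11 board these formulas work out to +60 for a win and -61 for a loss (on top of the per-step penalty); a quick check, with an illustrative helper name:

import math

def dense_terminal_reward(won: bool, board_size: int = 11) -> int:
    cells = board_size * board_size  # 121 for the default board
    return math.floor(cells / 2) if won else -math.ceil(cells / 2)

assert dense_terminal_reward(True) == 60    # win: +floor(121/2)
assert dense_terminal_reward(False) == -61  # lose: -ceil(121/2)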

Termination

  • After each cycle, run a DFS to check for a winner; if there is one, terminate (a sketch follows this list).
  • If there is an illegal move, terminate (and reset).
    • Agents should not attempt illegal actions!
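
A minimal sketch of the winner check, assuming the board array described under Observation; the neighbor offsets and helper name are illustrative, not pulled from the repo:

def check_winner(board, player: int, board_size: int = 11) -> bool:
    """True if `player` (1 = red/vertical, 2 = blue/horizontal) connects its edges."""
    # The six neighbors of a hex cell in a row/column (parallelogram) layout.
    neighbors = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0)]
    if player == 1:  # red: connect the top edge (row 0) to the bottom edge
        stack = [(0, c) for c in range(board_size) if board[0][c] == player]
        done = lambda r, c: r == board_size - 1
    else:            # blue: connect the left edge (col 0) to the right edge
        stack = [(r, 0) for r in range(board_size) if board[r][0] == player]
        done = lambda r, c: c == board_size - 1
    seen = set(stack)
    while stack:
        r, c = stack.pop()
        if done(r, c):
            return True
        for dr, dc in neighbors:
            nr, nc = r + dr, c + dc
            if (0 <= nr < board_size and 0 <= nc < board_size
                    and (nr, nc) not in seen and board[nr][nc] == player):
                seen.add((nr, nc))
                stack.append((nr, nc))
    return False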

Rendering

Please, someone, donate your rendering code! Thank you!

Runner

  • See myrunner-eg.py:

env = OurHexGame(board_size=11, sparse_flag=True)  # or False
agent = GXXAgent(env)
...
action = agent.select_action(observation, reward, termination, truncation, info)
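
A minimal sketch of a complete game loop, assuming OurHexGame follows the PettingZoo AEC API (agent_iter / last / step); everything beyond the names shown above is an assumption:

env = OurHexGame(board_size=11, sparse_flag=True)
env.reset()

# One agent object per player; GXXAgent stands in for each group's class.
agents = {name: GXXAgent(env) for name in env.possible_agents}

for agent_name in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None  # finished agents must step with None in the AEC API
    else:
        action = agents[agent_name].select_action(
            observation, reward, termination, truncation, info)
    env.step(action)
env.close()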

About

cs272 hex game
