Chess_AI/.cursorrule at main · star14ms/Chess_AI · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
# Rule: Chess Action Space Representation

When discussing or implementing the chess action space for this project, adhere to the following:

**1. Project's Chosen Action Space ("Pawn-Forever", ~1700 Actions):**
    *   **Goal:** Minimize total action space size while handling promotions.
    *   **Concept:** Pawns conceptually remain pawns but gain conditional access to promoted piece moves upon reaching the final rank. Action IDs are assigned based on the *initial* piece instance.
    *   **Structure:** Uses Absolute Action IDs (`Base ID + Relative ID - 1`).
    *   **Total Size:** ~1700 (850 per side).
    *   **Base Pieces (per side, 194 actions):**
        *   1 King (10 relative actions: 8 standard + 2 castling)
        *   1 Queen (56 relative actions)
        *   2 Rooks (28 relative actions each)
        *   2 Knights (8 relative actions each)
        *   2 Bishops (28 relative actions each)
    *   **Pawns (per side, 8 instances x 82 relative actions = 656 actions):**
        *   **18 Pawn-Specific Relative Actions:**
            *   6 basic moves (Fwd1, Fwd2, CapL, CapR, EPL, EPR).
            *   12 promotion moves (Fwd/CapL/CapR for Q, N, B, R).
        *   **56 embedded Queen moves** (conditional on pawn state).
        *   **8 embedded Knight moves** (conditional on pawn state).
    *   **Implementation:** Requires significant custom logic for state representation and conditional action masking based on whether a piece associated with a pawn slot has been promoted and to what.
    *   **Status:** This is the **preferred approach** for the project.

**2. Alternative Representations (Context):**
    *   **AlphaZero Standard (4672 Actions):** 8x8 source squares × 73 move type planes. Useful context but not the project's implementation.
    *   **Standard Max Instance (~2596 Actions):** Explicitly reserves action space for max theoretical pieces (9Q, 10R, etc.). Uses standard piece transformation. Considered but not chosen due to larger action space size.

**Guidance:**

*   **Default to discussing and implementing based on the "Pawn-Forever" scheme (1).**
*   Acknowledge the AlphaZero standard (2a) for context if needed.
*   Clearly state the requirement for custom conditional logic when implementing the chosen scheme (1).
*   Ensure `uci_to_*_action_id` functions and base ID calculations align with the 82 relative pawn actions and the 1700 total space.
*   **Note on Editing .ipynb:** When proposing code changes for `.ipynb` files, present the code directly in the chat instead of using file editing tools, as these can cause issues with notebook structure.
*   **Note on Editing Snippets:** If editing reference code provided by the user (e.g., via selection) that lacks necessary import statements, provide only the modified function or code block itself; do not add or guess imports unless specifically requested.

## Observation Vector Format (`get_board_vector` method in `chess_gym/chess_custom.py`)

The observation vector represents the state of the chessboard as an 8x8x13 NumPy array (`dtype=np.int8`).
Each 13-element feature vector corresponds to a square on the board.

**Shape:** `(8, 8, 13)` (Rank, File, Features)

**Features per Square (13 dimensions):**

*   **Dimension 0:** `Color`
    *   `-1`: Black piece
    *   `0`: Empty square
    *   `1`: White piece
*   **Dimensions 1-6:** `Piece Type` (One-hot encoded)
    *   Index 1: Pawn (`chess.PAWN`)
    *   Index 2: Knight (`chess.KNIGHT`)
    *   Index 3: Bishop (`chess.BISHOP`)
    *   Index 4: Rook (`chess.ROOK`)
    *   Index 5: Queen (`chess.QUEEN`)
    *   Index 6: King (`chess.KING`)
*   **Dimension 7:** `Moved Flag`
    *   `1`: The piece (King, Rook, or Pawn) has moved from its starting position or lost relevant castling rights.
    *   `0`: The piece has not moved, or is not K/R/P.
*   **Dimensions 8-12:** `Behavior Type` (One-hot encoded, relevant only if the piece *was originally* a Pawn)
    *   Index 8: Is currently a Pawn (original Pawn, not yet promoted).
    *   Index 9: Promoted to Knight.
    *   Index 10: Promoted to Bishop.
    *   Index 11: Promoted to Rook.
    *   Index 12: Promoted to Queen.
    *   *(All zero if the piece was not originally a Pawn)*

**Example Usage:** `observation = board.get_board_vector()`