Commit cc22d3d: Merge pull request #62 from lufftw/feat/pattern-game-theory-dp

feat(game_theory_dp): Add complete GameTheoryDP pattern

2 parents dd3aa32 + 23b892e

35 files changed: +3,049 -0 lines
# Game Theory DP - Intuition Guide

## The Mental Model: Think Like Your Opponent

Imagine playing chess:
- You don't just think about your next move
- You think: "If I do this, what will my opponent do? And then what's my best response?"

**Game Theory DP formalizes this recursive reasoning.**

## The Core Insight: Minimax

At any game state, the current player faces a choice:

```
I want to MAXIMIZE my outcome
But I know my opponent will MINIMIZE my outcome on their turn
So I must choose the move that gives me the best result
ASSUMING my opponent plays perfectly
```

This is the **minimax principle**: maximize your minimum guaranteed outcome.

## Why "Score Difference" Works

Consider a game where we track the score difference from the current player's perspective:

```
If I'm ahead by 5 points, that's +5 for me
When my opponent plays, THEIR +3 becomes MY -3
So the scores naturally flip with each turn

dp(state) = my_gain - dp(next_state)

dp(next_state) is the opponent's score difference
(positive for them = negative for me)
```

The subtraction handles the perspective switch automatically!

## Pattern 1: Taking from Ends (Stone Game)

**Scenario**: Array of values, take from either end each turn.

```
piles = [3, 9, 1, 2]

My turn: I can take 3 (left) or 2 (right)
If I take 3:
  - I gain 3
  - Opponent faces [9, 1, 2]
  - My total outcome = 3 - (opponent's best from [9, 1, 2])

Key insight: dp[i][j] = max score DIFFERENCE for current player
             when piles[i..j] remain
```

**Why this works**:
- State is interval [i, j] of remaining piles
- Each move shrinks interval by 1
- Eventually interval is empty (base case; see the sketch below)

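A minimal memoized sketch of this interval recurrence (the name `end_take_diff` and its shape are illustrative, not the repo's pattern code):

```python
from functools import lru_cache

def end_take_diff(piles):
    """Best score difference for the current player on piles[i..j]."""
    @lru_cache(maxsize=None)
    def dp(i, j):
        if i > j:                       # interval is empty: no points left
            return 0
        take_left = piles[i] - dp(i + 1, j)
        take_right = piles[j] - dp(i, j - 1)
        return max(take_left, take_right)

    return dp(0, len(piles) - 1)

# end_take_diff([3, 9, 1, 2]) → 7: the first player can finish 7 points ahead
```
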
## Pattern 2: Taking from Front (Stone Game III)

**Scenario**: Array of values, take 1-3 from front each turn.

```
values = [1, 2, 3, -9]

My turn: I can take [1], [1, 2], or [1, 2, 3]
If I take the first two (gaining 1 + 2 = 3):
  - Opponent faces [3, -9]
  - My outcome = 3 - dp([3, -9])

Key insight: dp[start] = max score diff starting from index start
```

**Why the -9 matters**:
- Naive: "Take as much as possible!"
- Smart: Taking -9 might help if it forces opponent into worse position
- Negative values make strategy non-trivial (see the sketch below)

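A minimal sketch of the front-taking recurrence (again with illustrative names, not the repo's code):

```python
from functools import lru_cache

def front_take_diff(values):
    """Best score difference for the current player, taking 1-3 values from the front."""
    n = len(values)

    @lru_cache(maxsize=None)
    def dp(start):
        if start >= n:                  # nothing left to take
            return 0
        best = float('-inf')
        gain = 0
        for take in range(1, 4):        # take 1, 2, or 3 values
            if start + take > n:
                break
            gain += values[start + take - 1]
            best = max(best, gain - dp(start + take))
        return best

    return dp(0)

# front_take_diff([1, 2, 3, -9]) → 15: take 1, 2, 3 (gain 6) and force the opponent onto the -9
```
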
## Pattern 3: Bitmask Games (Can I Win)

**Scenario**: Pool of numbers 1 to n, each used once, first to reach target wins.

```
Numbers: {1, 2, 3, 4}, Target: 6

If I pick 4:
  - Remaining: {1, 2, 3}, Need: 2
  - Opponent can pick 2 or 3 to win!

If I pick 3:
  - Remaining: {1, 2, 4}, Need: 3
  - Opponent picks any, I might recover...
```

**Why bitmask**:
- State = which numbers are still available
- With n numbers, 2^n possible states
- Memoize on the bitmask integer (see the sketch below)

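A minimal memoized sketch for this kind of game, assuming the usual "reach or exceed the target" win rule (names are illustrative):

```python
from functools import lru_cache

def first_player_wins(n, target):
    """Can the first player force a win, picking from 1..n, each number once?"""
    if n * (n + 1) // 2 < target:       # not enough total to ever reach the target
        return False

    @lru_cache(maxsize=None)
    def can_win(used_mask, remaining_need):
        for number in range(1, n + 1):
            bit = 1 << (number - 1)
            if used_mask & bit:
                continue                # already taken
            if number >= remaining_need:
                return True             # this pick reaches the target right now
            if not can_win(used_mask | bit, remaining_need - number):
                return True             # this pick leaves the opponent losing
        return False                    # every pick lets the opponent win

    return can_win(0, target)

# first_player_wins(4, 6) → True (picking 1 first forces a win)
```
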
## The "I Win If Opponent Loses" Pattern

For win/lose games (not score games):

```python
def can_win(state):
    # Memoize on `state` in real code; shown bare here to keep the pattern visible.
    for move in all_moves(state):
        if not can_win(apply_move(state, move)):
            return True   # This move makes opponent lose!
    return False          # All moves let opponent win :(
```

**Insight**: I need just ONE move that leads to the opponent losing.
The opponent needs ALL of my moves to lead to them winning.

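A tiny standalone instance of this pattern (a made-up illustration, not part of the commit): a take-away game where each turn removes 1, 2, or 3 stones and whoever takes the last stone wins.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def can_win_stones(remaining):
    """True if the player to move can force a win with `remaining` stones left."""
    for take in (1, 2, 3):
        if take > remaining:
            break
        if take == remaining:           # taking the last stone wins immediately
            return True
        if not can_win_stones(remaining - take):
            return True                 # one move that leaves the opponent losing is enough
    return False                        # every move hands the opponent a winning position

# Losing positions are exactly the multiples of 4: can_win_stones(4) → False, can_win_stones(7) → True
```
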
## Trace Through: Predict the Winner

```
nums = [1, 5, 2]

Build up from base cases:
dp[0][0] = 1 (only pile 0, I take it)
dp[1][1] = 5 (only pile 1, I take it)
dp[2][2] = 2 (only pile 2, I take it)

dp[0][1]: piles 0 and 1 remain
  Take left (1):  1 - dp[1][1] = 1 - 5 = -4
  Take right (5): 5 - dp[0][0] = 5 - 1 = 4
  → dp[0][1] = max(-4, 4) = 4 (take the 5!)

dp[1][2]: piles 1 and 2 remain
  Take left (5):  5 - dp[2][2] = 5 - 2 = 3
  Take right (2): 2 - dp[1][1] = 2 - 5 = -3
  → dp[1][2] = max(3, -3) = 3 (take the 5!)

dp[0][2]: all piles remain (THIS IS THE ANSWER)
  Take left (1):  1 - dp[1][2] = 1 - 3 = -2
  Take right (2): 2 - dp[0][1] = 2 - 4 = -2
  → dp[0][2] = max(-2, -2) = -2

Player 1's advantage = -2 < 0 → Player 2 wins!
```

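As a cross-check, running the illustrative `end_take_diff` sketch from Pattern 1 on these piles reproduces the hand trace:

```python
end_take_diff([1, 5, 2])   # → -2, so the second player wins
```
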
## Common Pitfalls

1. **Forgetting perspective flip**: The subtraction `- dp(next)` is crucial!

2. **Wrong win condition**:
   - `> 0` for strict win
   - `>= 0` when tie counts as win

3. **Not handling edge cases**:
   - Empty array
   - Single element
   - All elements same

4. **Inefficient state**:
   - Using (left, right, whose_turn) when (left, right) suffices
   - The turn is implicit in the recursion

## Complexity Guide

| Game Type | States | Per-State Work | Total |
|-----------|--------|----------------|-------|
| Interval [i, j] | O(n²) | O(1) | O(n²) |
| Linear (start) | O(n) | O(k) | O(nk) |
| Bitmask | O(2^n) | O(n) | O(n·2^n) |
| (start, M) | O(n²) | O(n) | O(n³) |

## The Universal Game Theory DP Template

```python
from functools import lru_cache

def game_theory_dp(initial_state):
    """
    Template for two-player optimal games.

    Returns: outcome for first player
             (positive = win, negative = loss, 0 = tie)
    """
    @lru_cache(maxsize=None)
    def dp(state):
        if is_terminal(state):
            return terminal_value(state)

        best = float('-inf')
        for action in possible_actions(state):
            gain = immediate_gain(state, action)
            next_state = apply_action(state, action)

            # Key: subtract opponent's best outcome
            value = gain - dp(next_state)
            best = max(best, value)

        return best

    return dp(initial_state)
```

Fill in:
- `is_terminal`: game over?
- `terminal_value`: score when game ends
- `possible_actions`: what moves can I make?
- `immediate_gain`: points I get from this move
- `apply_action`: new state after move

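To make the slots concrete, here is one illustrative way to fill them for the take-from-ends game from Pattern 1 (assuming the `game_theory_dp` template above is in scope; this is not code from the commit):

```python
# State: the (i, j) interval of piles still on the table.
piles = [1, 5, 2]                       # the example from the trace above

def is_terminal(state):
    i, j = state
    return i > j                        # no piles left

def terminal_value(state):
    return 0                            # nothing left to score

def possible_actions(state):
    return ['left', 'right']

def immediate_gain(state, action):
    i, j = state
    return piles[i] if action == 'left' else piles[j]

def apply_action(state, action):
    i, j = state
    return (i + 1, j) if action == 'left' else (i, j - 1)

print(game_theory_dp((0, len(piles) - 1)))   # → -2, matching the trace above
```
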
