
Commit 4e3f12d

Performance: Puzzle statistics are pre-calculated (#75)
* Performance: Puzzle statistics are pre-calculated
* more optimizes
1 parent 20b8f21 commit 4e3f12d

16 files changed, 391 additions and 102 deletions
Lines changed: 318 additions & 0 deletions
@@ -0,0 +1,318 @@
# Puzzle Statistics Query Optimization - Step 2

This document outlines the step-by-step plan to update all read queries to use the new `puzzle_statistics` table and `puzzling_type`/`puzzlers_count` columns on `puzzle_solving_time`.

## Prerequisites

Before starting, ensure Step 1 is complete:
- [x] `puzzle_statistics` table created
- [x] `puzzling_type` and `puzzlers_count` columns added to `puzzle_solving_time`
- [x] Migrations run
- [x] `myspeedpuzzling:recalculate-puzzle-statistics` command executed

---

## Phase 1: HIGH IMPACT - Replace Aggregations with `puzzle_statistics` Table

These changes eliminate expensive GROUP BY + COUNT/AVG/MIN aggregations by reading precomputed values.
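
Before switching reads over, it can help to confirm that the precomputed rows agree with a live aggregation. The query below is a minimal sketch, assuming `puzzle_statistics` holds one row per `puzzle_id` and that its values were computed with the same conditions as the old queries. The old queries differ on whether `seconds_to_solve > 0` is applied, so the conditions in the `live` CTE are an assumption to adjust. An empty result means the two sources agree for the columns checked.

```sql
-- Sketch: find puzzles whose precomputed statistics disagree with a live aggregation.
-- Assumption: one puzzle_statistics row per puzzle_id.
WITH live AS (
    SELECT
        pst.puzzle_id,
        COUNT(pst.id) AS solved_times,
        MIN(CASE WHEN pst.team IS NULL THEN pst.seconds_to_solve END) AS fastest_time_solo
    FROM puzzle_solving_time pst
    GROUP BY pst.puzzle_id
)
SELECT
    live.puzzle_id,
    live.solved_times,
    ps.solved_times_count,
    live.fastest_time_solo,
    ps.fastest_time_solo AS stored_fastest_time_solo
FROM live
LEFT JOIN puzzle_statistics ps ON ps.puzzle_id = live.puzzle_id
WHERE ps.puzzle_id IS NULL                                     -- no statistics row at all
   OR ps.solved_times_count IS DISTINCT FROM live.solved_times -- NULL-safe comparison
   OR ps.fastest_time_solo IS DISTINCT FROM live.fastest_time_solo;
```

The same pattern extends to the duo and team columns by adding the matching `AVG`/`MIN` expressions to the CTE.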

### 1.1 SearchPuzzle.php - `byUserInput()`

**File:** `src/Query/SearchPuzzle.php`
**Method:** `byUserInput()` (line 80)

**Changes:**
- Replace `LEFT JOIN puzzle_solving_time pst ON pst.puzzle_id = pb.puzzle_id` with `LEFT JOIN puzzle_statistics ps ON ps.puzzle_id = pb.puzzle_id`
- Replace aggregation columns (lines 163-169):
  ```sql
  -- FROM:
  COUNT(pst.id) AS solved_times,
  AVG(CASE WHEN pst.team IS NULL THEN pst.seconds_to_solve END) AS average_time_solo,
  MIN(CASE WHEN pst.team IS NULL THEN pst.seconds_to_solve END) AS fastest_time_solo,
  AVG(CASE WHEN json_array_length(pst.team->'puzzlers') = 2 THEN pst.seconds_to_solve END) AS average_time_duo,
  MIN(CASE WHEN json_array_length(pst.team->'puzzlers') = 2 THEN pst.seconds_to_solve END) AS fastest_time_duo,
  AVG(CASE WHEN json_array_length(pst.team->'puzzlers') > 2 THEN pst.seconds_to_solve END) AS average_time_team,
  MIN(CASE WHEN json_array_length(pst.team->'puzzlers') > 2 THEN pst.seconds_to_solve END) AS fastest_time_team

  -- TO:
  COALESCE(ps.solved_times_count, 0) AS solved_times,
  ps.average_time_solo,
  ps.fastest_time_solo,
  ps.average_time_duo,
  ps.fastest_time_duo,
  ps.average_time_team,
  ps.fastest_time_team
  ```
- Remove GROUP BY clause (lines 173-184)
- Update ORDER BY to not reference removed columns

**Tests to verify:** Run existing tests for SearchPuzzle

---

### 1.2 GetPuzzleOverview.php - All 3 Methods

**File:** `src/Query/GetPuzzleOverview.php`
**Methods:** `byEan()`, `byId()`, `byTagId()`

**Changes for each method:**
- Replace `LEFT JOIN puzzle_solving_time` with `LEFT JOIN puzzle_statistics ps ON ps.puzzle_id = puzzle.id`
- Replace aggregation columns with direct reads from `ps.*`
- Remove GROUP BY clause

**Example for `byId()` (lines 103-128):**
```sql
-- FROM:
COUNT(puzzle_solving_time.id) AS solved_times,
AVG(CASE WHEN team IS NULL AND seconds_to_solve > 0 THEN seconds_to_solve END) AS average_time_solo,
...

-- TO:
COALESCE(ps.solved_times_count, 0) AS solved_times,
ps.average_time_solo,
ps.fastest_time_solo,
ps.average_time_duo,
ps.fastest_time_duo,
ps.average_time_team,
ps.fastest_time_team
```

**Tests to verify:** Run existing tests for GetPuzzleOverview
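
The `LEFT JOIN` plus `COALESCE` pairing matters for puzzles that nobody has solved yet. The sketch below shows the shape of a `byId()`-style read, assuming `puzzle_statistics` has at most one row per puzzle (which is what makes dropping the GROUP BY safe). The `:puzzleId` bind parameter name is hypothetical, and the real query selects more columns than shown here.

```sql
-- Sketch of a single-puzzle read after the change.
-- :puzzleId is a hypothetical bind parameter name.
SELECT
    puzzle.id AS puzzle_id,
    COALESCE(ps.solved_times_count, 0) AS solved_times, -- 0 when no statistics row exists, as COUNT() returned before
    ps.average_time_solo,                                -- NULL when no statistics row exists, as AVG() returned before
    ps.fastest_time_solo
FROM puzzle
LEFT JOIN puzzle_statistics ps ON ps.puzzle_id = puzzle.id
WHERE puzzle.id = :puzzleId;
```

With an `INNER JOIN` instead, a puzzle without a statistics row would disappear from the result rather than show zero solves.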

---

### 1.3 GetPuzzlesOverview.php - `allApprovedOrAddedByPlayer()`

**File:** `src/Query/GetPuzzlesOverview.php`
**Method:** `allApprovedOrAddedByPlayer()` (line 20)

**Changes:**
- Replace `LEFT JOIN puzzle_solving_time` with `LEFT JOIN puzzle_statistics ps ON ps.puzzle_id = puzzle.id`
- Replace aggregation columns (lines 35-41)
- Remove GROUP BY clause (line 48)

**Tests to verify:** Run existing tests for GetPuzzlesOverview

---

### 1.4 GetMostSolvedPuzzles.php - `top()`

**File:** `src/Query/GetMostSolvedPuzzles.php`
**Method:** `top()` (line 20)

**Changes:**
- Change FROM clause: `FROM puzzle_statistics ps` instead of `FROM puzzle_solving_time`
- Join puzzle: `INNER JOIN puzzle ON puzzle.id = ps.puzzle_id`
- Replace aggregations with direct column reads:
  ```sql
  ps.solved_times_count AS solved_times,
  ps.average_time_solo,
  ps.fastest_time_solo
  ```
- Update GROUP BY to only include puzzle and manufacturer

**Note:** `topInMonth()` method CANNOT be optimized - it needs time-based filtering.

**Tests to verify:** Run existing tests for GetMostSolvedPuzzles
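
As a rough illustration of the `top()` change, the sketch below reads the ranking straight from the precomputed table. The ordering by solve count and the `:limit` bind parameter are assumptions for illustration, and the manufacturer join of the real query is left out because its columns are not shown in this plan.

```sql
-- Sketch of top() reading from the precomputed table.
-- Assumptions: ordering by solve count and the :limit bind parameter are illustrative.
SELECT
    puzzle.id AS puzzle_id,
    ps.solved_times_count AS solved_times,
    ps.average_time_solo,
    ps.fastest_time_solo
FROM puzzle_statistics ps
INNER JOIN puzzle ON puzzle.id = ps.puzzle_id
ORDER BY ps.solved_times_count DESC
LIMIT :limit;
```

Starting from `puzzle_statistics` with an `INNER JOIN` keeps only puzzles that have a statistics row, which suits a "most solved" list.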

---

## Phase 2: MEDIUM IMPACT - Replace `json_array_length()` with `puzzling_type`

These changes replace expensive JSON parsing with indexed column lookups.
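
These replacements are only safe if `puzzling_type` and `puzzlers_count` agree with what the `team` JSON implies for every row. The following check is a sketch: the CASE expressions mirror the ones removed from `GetExportableSolvingTimes.php` in this commit, and the `::text` cast is an assumption that covers both a varchar and an enum column. An empty result means every row is consistent.

```sql
-- Sketch: rows where the new columns disagree with the values derived from the team JSON.
SELECT
    pst.id,
    pst.puzzling_type,
    pst.puzzlers_count
FROM puzzle_solving_time pst
WHERE pst.puzzling_type::text IS DISTINCT FROM
        CASE
            WHEN pst.team IS NULL THEN 'solo'
            WHEN json_array_length(pst.team -> 'puzzlers') = 2 THEN 'duo'
            ELSE 'team'
        END
   OR pst.puzzlers_count IS DISTINCT FROM
        CASE
            WHEN pst.team IS NULL THEN 1
            ELSE json_array_length(pst.team -> 'puzzlers')
        END;
```

`IS DISTINCT FROM` is used instead of `<>` so that rows where a new column is still NULL are reported too.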

### 2.1 GetPuzzleSolvers.php - 4 Methods

**File:** `src/Query/GetPuzzleSolvers.php`

#### Method: `soloByPuzzleId()` (line 24)
- Line 49: Replace `AND puzzle_solving_time.team IS NULL` with `AND puzzle_solving_time.puzzling_type = 'solo'`

#### Method: `duoByPuzzleId()` (line 88)
- Line 125: Replace `AND json_array_length(team -> 'puzzlers') = 2` with `AND pst.puzzling_type = 'duo'`
- Line 123: Can remove `AND pst.team IS NOT NULL` (implied by puzzling_type)

#### Method: `teamByPuzzleId()` (line 164)
- Line 201: Replace `AND json_array_length(team -> 'puzzlers') > 2` with `AND pst.puzzling_type = 'team'`
- Line 199: Can remove `AND pst.team IS NOT NULL` (implied by puzzling_type)

#### Method: `relaxCountsByPuzzleId()` (line 240)
- Lines 248-250: Replace with:
  ```sql
  COUNT(*) FILTER (WHERE puzzling_type = 'solo') AS solo_count,
  COUNT(*) FILTER (WHERE puzzling_type = 'duo') AS duo_count,
  COUNT(*) FILTER (WHERE puzzling_type = 'team') AS team_count
  ```

**Tests to verify:** Run existing tests for GetPuzzleSolvers

---

### 2.2 GetFastestPlayers.php - `perPiecesCount()`

**File:** `src/Query/GetFastestPlayers.php`
**Method:** `perPiecesCount()` (line 21)

**Changes:**
- Line 33: Replace `WHERE pst.team IS NULL` with `WHERE pst.puzzling_type = 'solo'`

**Tests to verify:** Run existing tests for GetFastestPlayers

---

### 2.3 GetFastestPairs.php - `perPiecesCount()`

**File:** `src/Query/GetFastestPairs.php`
**Method:** `perPiecesCount()` (line 21)

**Changes:**
- Line 69: Replace `AND json_array_length(team -> 'puzzlers') = 2` with `AND puzzle_solving_time.puzzling_type = 'duo'`
- Line 67: Can remove `AND puzzle_solving_time.team IS NOT NULL`

**Tests to verify:** Run existing tests for GetFastestPairs

---

### 2.4 GetFastestGroups.php - `perPiecesCount()`

**File:** `src/Query/GetFastestGroups.php`
**Method:** `perPiecesCount()` (line 21)

**Changes:**
- Line 69: Replace `AND json_array_length(team -> 'puzzlers') > 2` with `AND puzzle_solving_time.puzzling_type = 'team'`
- Line 67: Can remove `AND puzzle_solving_time.team IS NOT NULL`

**Tests to verify:** Run existing tests for GetFastestGroups

---

### 2.5 GetPlayerSolvedPuzzles.php - Multiple Methods

**File:** `src/Query/GetPlayerSolvedPuzzles.php`

#### Method: `soloByPlayerId()` (around line 144)
- Replace `team IS NULL` with `puzzling_type = 'solo'`

#### Method: `duoByPlayerId()` (around line 268)
- Replace `json_array_length(team -> 'puzzlers') = 2` with `puzzling_type = 'duo'`

#### Method: `teamByPlayerId()` (around line 395)
- Replace `json_array_length(team -> 'puzzlers') > 2` with `puzzling_type = 'team'`

**Tests to verify:** Run existing tests for GetPlayerSolvedPuzzles

---

### 2.6 GetPlayerStatistics.php - Multiple Methods

**File:** `src/Query/GetPlayerStatistics.php`

#### Method: `solo()` (around line 22)
- Replace `team IS NULL` with `puzzling_type = 'solo'`

#### Method: `duo()` (around line 69)
- Replace `json_array_length(team -> 'puzzlers') = 2` with `puzzling_type = 'duo'`

#### Method: `team()` (around line 122)
- Replace `json_array_length(team -> 'puzzlers') > 2` with `puzzling_type = 'team'`

**Tests to verify:** Run existing tests for GetPlayerStatistics

---

## Phase 3: Additional Optimizations

### 3.1 GetExportableSolvingTimes.php

**File:** `src/Query/GetExportableSolvingTimes.php`
**Method:** `byPlayerId()` (line 24)

**Changes:**
- Lines 43-51: Replace the two CASE expressions for solving type and player count:
  ```sql
  -- FROM:
  CASE
      WHEN pst.team IS NULL THEN 'solo'
      WHEN json_array_length(pst.team -> 'puzzlers') = 2 THEN 'duo'
      ELSE 'team'
  END AS solving_type,
  CASE
      WHEN pst.team IS NULL THEN 1
      ELSE json_array_length(pst.team -> 'puzzlers')
  END AS players_count,

  -- TO:
  pst.puzzling_type AS solving_type,
  pst.puzzlers_count AS players_count,
  ```

**Note:** Keeping the `solving_type` and `players_count` aliases leaves the Results DTO and export format unchanged; renaming the columns would require updating both.

---

### 3.2 GetUnsolvedPuzzles.php

**File:** `src/Query/GetUnsolvedPuzzles.php`

**Changes:**
- Replace complex JSON checks with `puzzling_type` column checks where applicable

---

## Phase 4: Verification & Cleanup

### 4.1 Run All Tests
```bash
docker compose exec web vendor/bin/phpunit --exclude-group panther
```

### 4.2 Run Static Analysis
```bash
docker compose exec web composer run phpstan
docker compose exec web composer run cs-fix
```

### 4.3 Manual Testing Checklist
- [ ] Puzzle search page loads correctly with statistics
- [ ] Puzzle detail page shows correct solo/duo/team times
- [ ] Fastest players leaderboard works
- [ ] Fastest pairs leaderboard works
- [ ] Fastest groups leaderboard works
- [ ] Player profile shows correct statistics
- [ ] Most solved puzzles page works

### 4.4 Performance Verification
Compare query times before and after for:
- Puzzle search with filters
- Puzzle detail page load
- Leaderboard pages
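
One way to make the before/after comparison concrete is PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)`, which reports the actual execution time and the plan chosen. The query below is a simplified sketch modeled on the solo leaderboard filter, not the full production query. The `pieces_count` value of 500 is only an illustrative input.

```sql
-- Sketch: time one of the rewritten filters. Run the old and new variants and compare.
-- EXPLAIN ANALYZE executes the query, so prefer a non-production copy if load is a concern.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM puzzle_solving_time pst
INNER JOIN puzzle p ON p.id = pst.puzzle_id
WHERE pst.puzzling_type = 'solo'      -- old variant: pst.team IS NULL
  AND p.pieces_count = 500            -- illustrative value
  AND pst.seconds_to_solve > 0;
```

Running each variant a few times before comparing reduces the effect of cold caches on the reported timings.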

---

## Files NOT to Modify

These files need row-level data or time-based filtering, so they cannot read from `puzzle_statistics`. Simple `team IS NULL` filters inside them can still switch to `puzzling_type`:

| File | Reason |
|------|--------|
| `GetMostSolvedPuzzles.php:topInMonth()` | Filters by month/year |
| `GetMostActivePlayers.php` | Per-player aggregations |
| `GetStatistics.php` | Global stats (could create separate table later) |
| `GetPlayerChartData.php` | Player-specific, time-filtered |
| `GetCompetitionParticipants.php` | Competition-specific filtering |

---

## Rollback Plan

If issues are discovered:
1. Revert the query changes
2. The old queries will keep working, because the data in `puzzle_statistics` is supplementary
3. Statistics will still update via domain events

---

## Implementation Order

Recommended order to minimize risk:

1. **Phase 2 first** (puzzling_type replacements) - Lower risk, isolated changes
2. **Phase 1 second** (puzzle_statistics joins) - Higher impact, needs careful testing
3. **Phase 3 last** - Optional optimizations
4. **Phase 4** - Verification after each phase

src/Query/GetCompetitionParticipants.php

Lines changed: 1 addition & 1 deletion
@@ -155,7 +155,7 @@ public function getConnectedParticipants(string $competitionId, array $roundsFil
 puzzle ON puzzle.id = puzzle_solving_time.puzzle_id
 WHERE
 puzzle_solving_time.player_id IN (:playerIds)
-AND puzzle_solving_time.team IS NULL
+AND puzzle_solving_time.puzzling_type = 'solo'
 AND puzzle.pieces_count = 500
 SQL;
 

src/Query/GetExportableSolvingTimes.php

Lines changed: 2 additions & 9 deletions
@@ -40,15 +40,8 @@ public function byPlayerId(string $playerId): array
 pst.first_attempt,
 pst.finished_puzzle_photo,
 pst.comment,
-CASE
-WHEN pst.team IS NULL THEN 'solo'
-WHEN json_array_length(pst.team -> 'puzzlers') = 2 THEN 'duo'
-ELSE 'team'
-END AS solving_type,
-CASE
-WHEN pst.team IS NULL THEN 1
-ELSE json_array_length(pst.team -> 'puzzlers')
-END AS players_count,
+pst.puzzling_type AS solving_type,
+pst.puzzlers_count AS players_count,
 (
 SELECT string_agg(
 COALESCE(

src/Query/GetFastestGroups.php

Lines changed: 1 addition & 2 deletions
@@ -64,9 +64,8 @@ public function perPiecesCount(int $piecesCount, int $howManyPlayers, null|Count
 LATERAL json_array_elements(puzzle_solving_time.team -> 'puzzlers') WITH ORDINALITY AS player_elem(player, ordinality)
 LEFT JOIN player p ON p.id = (player_elem.player ->> 'player_id')::UUID
 WHERE puzzle.pieces_count = :piecesCount
-AND puzzle_solving_time.team IS NOT NULL
+AND puzzle_solving_time.puzzling_type = 'team'
 AND seconds_to_solve > 0
-AND json_array_length(team -> 'puzzlers') > 2
 AND puzzle_solving_time.suspicious = false
 GROUP BY puzzle.id, player.id, manufacturer.id, puzzle_solving_time.id, competition.id
 )

src/Query/GetFastestPairs.php

Lines changed: 1 addition & 2 deletions
@@ -64,9 +64,8 @@ public function perPiecesCount(int $piecesCount, int $howManyPlayers, null|Count
 LATERAL json_array_elements(puzzle_solving_time.team -> 'puzzlers') WITH ORDINALITY AS player_elem(player, ordinality)
 LEFT JOIN player p ON p.id = (player_elem.player ->> 'player_id')::UUID
 WHERE puzzle.pieces_count = :piecesCount
-AND puzzle_solving_time.team IS NOT NULL
+AND puzzle_solving_time.puzzling_type = 'duo'
 AND seconds_to_solve > 0
-AND json_array_length(team -> 'puzzlers') = 2
 AND puzzle_solving_time.suspicious = false
 GROUP BY puzzle.id, player.id, manufacturer.id, puzzle_solving_time.id, competition.id
 )

src/Query/GetFastestPlayers.php

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ public function perPiecesCount(int $piecesCount, int $limit, null|CountryCode $c
 FROM puzzle_solving_time pst
 INNER JOIN puzzle p ON p.id = pst.puzzle_id
 INNER JOIN player pl ON pl.id = pst.player_id
-WHERE pst.team IS NULL
+WHERE pst.puzzling_type = 'solo'
 AND p.pieces_count = :piecesCount
 AND pst.seconds_to_solve > 0
 AND pl.is_private = false

src/Query/GetMostActivePlayers.php

Lines changed: 2 additions & 2 deletions
@@ -67,7 +67,7 @@ public function mostActiveSoloPlayers(int $limit): array
 FROM puzzle_solving_time
 INNER JOIN player ON puzzle_solving_time.player_id = player.id
 INNER JOIN puzzle ON puzzle_solving_time.puzzle_id = puzzle.id
-WHERE puzzle_solving_time.team IS NULL
+WHERE puzzle_solving_time.puzzling_type = 'solo'
 GROUP BY player.id
 ORDER BY solved_puzzles_count DESC, total_pieces_count DESC, total_seconds DESC
 LIMIT :limit
@@ -114,7 +114,7 @@ public function mostActiveSoloPlayersInMonth(int $limit, int $month, int $year):
 FROM puzzle_solving_time
 INNER JOIN player ON puzzle_solving_time.player_id = player.id
 INNER JOIN puzzle ON puzzle_solving_time.puzzle_id = puzzle.id
-WHERE puzzle_solving_time.team IS NULL
+WHERE puzzle_solving_time.puzzling_type = 'solo'
 AND EXTRACT(MONTH FROM puzzle_solving_time.tracked_at) = :month
 AND EXTRACT(YEAR FROM puzzle_solving_time.tracked_at) = :year
 GROUP BY player.id
