Commit 678c5f8

data analysis complete revise
1 parent 60db465 commit 678c5f8

50 files changed, +328068 -5223 lines changed

README.md

Lines changed: 96 additions & 46 deletions
@@ -133,84 +133,130 @@ The app handles shutdown and stops the Mongo container cleanly.

 Repeat for every tablet.

-## 7) Data Export and Analysis Procedure
+## 7) Data Migration + Analysis Pipeline (Unified 01..07)

-Use this when you need CSV/JSON outputs for strategy and picklist work.
+This workflow is now config-driven and runs through ordered scripts with CSV handoff:

-### 7.1 Keep backend running
+1. `01_extract_source.py`
+2. `02_clean_normalize.py`
+3. `03_feature_engineering.py`
+4. `04_team_aggregation.py`
+5. `05_picklist_scores.py`
+6. `06_export_app_payloads.py`
+7. `07_seed_fake_data.py` (optional, controlled by config/flags)
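
Reviewer sketch: the ordered-stage idea in the list above could look roughly like the following. The script names come from the README; the function names, the placement of stage 07 at the end of the run, and the stop-on-first-failure behaviour are assumptions for illustration, not the contents of `run_pipeline.py`.

```python
import subprocess
import sys
from pathlib import Path

# Stage script names are copied from the README list above.
STAGES = [
    "01_extract_source.py",
    "02_clean_normalize.py",
    "03_feature_engineering.py",
    "04_team_aggregation.py",
    "05_picklist_scores.py",
    "06_export_app_payloads.py",
]
OPTIONAL_STAGE = "07_seed_fake_data.py"


def plan_stages(run_stage_07: bool = False) -> list:
    """Return the scripts to run, in order (hypothetical helper)."""
    plan = list(STAGES)
    if run_stage_07:
        # Assumption: stage 07 runs last, matching its position in the list.
        plan.append(OPTIONAL_STAGE)
    return plan


def run_stages(script_dir: Path, run_stage_07: bool = False) -> None:
    """Run each stage as its own process; stop at the first failure."""
    for script in plan_stages(run_stage_07):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later stages never read a half-written CSV handoff.
        subprocess.run([sys.executable, str(script_dir / script)], check=True)
```

Running each stage as a separate process keeps the CSV handoff as the only contract between stages, so any single script can also be rerun by hand.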

-In terminal #1 (repo root):
+### 7.1 One-time schema migration (recommended before first 2026 event run)
+
+From repo root:

 ```powershell
-npm run start
+npm run --workspace server migrate-match-schema
 ```

-### 7.2 Run analysis script
+Migration report output:

-In terminal #2:
+- `server/static/match-schema-migration-report.json`
+
+### 7.2 Python setup (once per machine)

 ```powershell
 cd ScoutingApp2026\data-analysis
 python -m venv venv
 .\venv\Scripts\Activate.ps1
 pip install -r requirements.txt
-python export_csv.py
 ```

-### 7.3 Output locations
+### 7.3 Configure pipeline behavior
+
+Edit:

-Primary outputs go to:
+- `data-analysis/pipeline_config.json`

-- `data-analysis/output`
+Main knobs:

-Legacy CSV outputs are also written to:
+- `source.mode`: `mongo` or `fake`
+- `source.mongo_url` / `source.db`
+- `paths.output_dir`
+- `analysis.metrics` (enabled flags, weights, direction)
+- `analysis.timeline_bin_sec`
+- `fake_data.*` (including `run_stage_07` and `seed_mongo`)
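
Reviewer sketch: one plausible way the config knobs above and the CLI flags used elsewhere in this diff (`--source-mode`, `--run-stage-07`, `--seed-mongo`) could combine. The key names follow the README; the default values, the `resolve_config` helper, and the rule that flags override the file are assumptions, not the behaviour of `run_pipeline.py`.

```python
import argparse
import copy

# Key names follow the README; the values are placeholders only.
DEFAULT_CONFIG = {
    "source": {"mode": "mongo", "mongo_url": "mongodb://localhost:27017/", "db": "test"},
    "paths": {"output_dir": "output"},
    "analysis": {"metrics": {}, "timeline_bin_sec": 5},  # 5 is an arbitrary example
    "fake_data": {"run_stage_07": False, "seed_mongo": False},
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Pipeline flag sketch")
    parser.add_argument("--source-mode", choices=["mongo", "fake"], default=None)
    parser.add_argument("--run-stage-07", action="store_true")
    parser.add_argument("--seed-mongo", action="store_true")
    return parser


def resolve_config(config: dict, argv: list) -> dict:
    """Return a copy of config with any CLI flags applied on top (assumed rule)."""
    args = build_parser().parse_args(argv)
    resolved = copy.deepcopy(config)  # never mutate the loaded config
    if args.source_mode is not None:
        resolved["source"]["mode"] = args.source_mode
    if args.run_stage_07:
        resolved["fake_data"]["run_stage_07"] = True
    if args.seed_mongo:
        resolved["fake_data"]["seed_mongo"] = True
    return resolved
```

For example, `resolve_config(DEFAULT_CONFIG, ["--source-mode", "fake", "--run-stage-07"])` switches the source to `fake` and enables stage 07 while leaving `seed_mongo` off.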

-- `data-analysis/match_raw_2026.csv`
-- `data-analysis/super_raw_2026.csv`
-- `data-analysis/pit_2026.csv`
-- `data-analysis/team_agg_2026.csv`
-- `data-analysis/metric_summary_2026.csv`
+### 7.4 Run full pipeline (real Mongo data)

-### 7.4 Optional analysis flags
+Keep server running in terminal #1:

 ```powershell
-python export_csv.py --mongo-url mongodb://localhost:27017/ --db test --output-dir .\output
+npm run start
 ```

-## 8) Generate Fake Data (for Testing Picklist/Recon)
+Then in terminal #2:

-There are two fake-data paths.
+```powershell
+cd ScoutingApp2026\data-analysis
+.\venv\Scripts\Activate.ps1
+python run_pipeline.py --source-mode mongo
+```

-### 8.1 Database fake scouting data (recommended)
+### 7.5 Run full pipeline (fake source, with fake generation stage)

-Populates Mongo collections with synthetic match/pit/leaderboard entries.
+```powershell
+cd ScoutingApp2026\data-analysis
+.\venv\Scripts\Activate.ps1
+python run_pipeline.py --source-mode fake --run-stage-07
+```

-From repo root:
+Optional: seed Mongo during stage 07:

 ```powershell
-npm run --workspace server gen-fake-data
+python run_pipeline.py --source-mode fake --run-stage-07 --seed-mongo
 ```

-Optional environment overrides (PowerShell examples):
+### 7.6 Pipeline outputs
+
+All outputs are written to `data-analysis/output` (or `paths.output_dir`):
+
+- `00_pipeline_report.json`
+- `01_match_raw.csv`, `01_pit_raw.csv`, `01_raw_snapshot.json`
+- `02_match_clean.csv`, `02_pit_clean.csv`, `02_validation_report.csv`
+- `03_match_features.csv`, `03_timeseries_long.csv`, `03_auto_path_points.csv`
+- `04_team_aggregates.csv`
+- `05_picklist_scores.csv`, `05_metric_contributions.csv`
+- `06_picklist_payload.json`, `06_team_profiles.json`
+- `07_seed_report.json` (only when stage 07 runs)
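
Reviewer sketch: the README mentions per-metric enabled flags, weights, and direction, and stage 05 emits both scores and metric contributions. One common way to get from those settings to those two outputs is a weighted sum of min-max normalized metrics. The formula, the config shape, and the metric names `fuel` and `fouls` below are assumptions for illustration; the commit does not show what `05_picklist_scores.py` actually computes.

```python
def score_teams(team_aggregates: dict, metrics: dict):
    """Return (scores, contributions) from per-team metric values.

    team_aggregates: {team: {metric_name: value}}
    metrics: {metric_name: {"enabled": bool, "weight": float, "direction": str}}
    where direction is "higher" or "lower" (hypothetical spelling).
    """
    contributions = {team: {} for team in team_aggregates}
    for name, spec in metrics.items():
        if not spec.get("enabled", True):
            continue  # disabled metrics contribute nothing
        values = [row[name] for row in team_aggregates.values()]
        low, span = min(values), max(values) - min(values)
        for team, row in team_aggregates.items():
            if span == 0:
                norm = 0.0  # every team is identical on this metric
            else:
                norm = (row[name] - low) / span  # scale to 0..1
                if spec["direction"] == "lower":
                    norm = 1.0 - norm  # smaller raw value is better
            contributions[team][name] = spec["weight"] * norm
    scores = {team: sum(parts.values()) for team, parts in contributions.items()}
    return scores, contributions
```

Keeping the per-metric contributions alongside the total makes it possible to explain why one team outranks another, which is presumably the purpose of a separate contributions file.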
+
+Picklist app reads analyzed payload from:
+
+- `data-analysis/output/06_picklist_payload.json`
+- API route: `GET /data/retrieve/analyzed`
+
+### 7.7 Legacy command compatibility
+
+`python export_csv.py` now forwards to `run_pipeline.py` and uses the same config/flags.
+
+## 8) Fake Data Options
+
+### 8.1 Pipeline-native fake data (recommended)
+
+Use stage 07 directly:

 ```powershell
-$env:FAKE_MATCH_COUNT='80'
-$env:FAKE_TEAM_COUNT='40'
-$env:FAKE_SCOUTER_COUNT='16'
-$env:FAKE_CLEAR='true'
-$env:FAKE_INCLUDE_PIT='true'
-$env:FAKE_INCLUDE_LEADERBOARD='true'
-$env:FAKE_INCLUDE_AUTO_PATH='true'
-npm run --workspace server gen-fake-data
+cd data-analysis
+.\venv\Scripts\Activate.ps1
+python 07_seed_fake_data.py
 ```

-### 8.2 Static analysis JSON file
+Or via orchestrator:

-Writes `server/static/output_analysis.json`.
+```powershell
+python run_pipeline.py --source-mode fake --run-stage-07
+```

-From repo root:
+### 8.2 Legacy server fake scripts (optional / dev-only)
+
+These still exist for server-side testing:

 ```powershell
+npm run --workspace server gen-fake-data
 npm run --workspace server gen-fake-json
 ```

@@ -281,7 +327,10 @@ Writes: `client/src/assets/matchSchedule.json`

 ### 10.3 Generate team metadata/colors/avatars

-Requires `server/static/output_analysis.json` (generate with `gen-fake-json` or provide your own).
+Requires either:
+
+- `data-analysis/output/06_team_profiles.json` (preferred; generated by pipeline stage 06), or
+- `server/static/output_analysis.json` (legacy fallback).

 ```powershell
 npm run --workspace server gen-team-info
@@ -328,17 +377,22 @@ npm run start
 # Dev run
 npm run dev

-# Analysis
+# Migration
+npm run --workspace server migrate-match-schema
+
+# Analysis pipeline (real data)
 cd data-analysis
 python -m venv venv
 .\venv\Scripts\Activate.ps1
 pip install -r requirements.txt
-python export_csv.py
+python run_pipeline.py --source-mode mongo

-# Fake data
+# Fake data (pipeline-native)
+python run_pipeline.py --source-mode fake --run-stage-07
+
+# Legacy fake data scripts (optional)
 cd ..
 npm run --workspace server gen-fake-data
-npm run --workspace server gen-fake-json

 # Event utilities
 npm run --workspace server download-teams
@@ -370,7 +424,3 @@ npm run build --workspace database

 - Ensure backend is running (`npm run start`) or Mongo container is up.
 - Check mongo URL (`mongodb://localhost:27017/`).
-
-### `sendExport` script note
-
-- `server/scripts/sendExport.ts` currently connects to `mongodb://0.0.0.0:27107/` (port typo vs `27017`). Update that file before relying on this script.

client/src/apps/match/MatchApp.tsx

Lines changed: 0 additions & 6 deletions
@@ -990,15 +990,11 @@ function MatchApp() {
 robotAbsent,
 autoStartingPosition,
 autoPath: autoPathTrace,
-autoMoved: false,
 shootTimeBySegment,
 passTimeBySegment,
 actionTimeline,
 ballsPerSecondUsed,
 autoFuelScored: 0,
-autoTower: 'None',
-autoFuelWinner: 'unknown',
-shift1ActiveHubIfTie: null,
 teleFuelBySegment: {
 transition: 0,
 shift1: 0,
@@ -1008,14 +1004,12 @@ function MatchApp() {
 endgame: 0,
 },
 teleTower,
-climbTimeBucket: null,
 breakdown,
 driverQuality,
 defenseProvided,
 defenseReceived,
 fouls,
 breaks,
-comments: [],
 freeText,
 };
