@@ -133,84 +133,130 @@ The app handles shutdown and stops the Mongo container cleanly.

Repeat for every tablet.

-## 7) Data Export and Analysis Procedure
+## 7) Data Migration + Analysis Pipeline (Unified 01..07)

-Use this when you need CSV/JSON outputs for strategy and picklist work.
+This workflow is now config-driven and runs through ordered scripts with a CSV handoff between stages:

-### 7.1 Keep backend running
+1. `01_extract_source.py`
+2. `02_clean_normalize.py`
+3. `03_feature_engineering.py`
+4. `04_team_aggregation.py`
+5. `05_picklist_scores.py`
+6. `06_export_app_payloads.py`
+7. `07_seed_fake_data.py` (optional, controlled by config/flags)

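The ordered handoff can be sketched roughly as below. This is a hypothetical runner, not the real `run_pipeline.py`; only the stage filenames come from the list above.

```python
# Hypothetical orchestration sketch; the real run_pipeline.py may differ.
import subprocess
import sys

STAGES = [
    "01_extract_source.py",
    "02_clean_normalize.py",
    "03_feature_engineering.py",
    "04_team_aggregation.py",
    "05_picklist_scores.py",
    "06_export_app_payloads.py",
    "07_seed_fake_data.py",  # optional stage
]

def stages_to_run(run_stage_07=False):
    """Select stages in order; stage 07 only runs when explicitly enabled."""
    return [s for s in STAGES if run_stage_07 or not s.startswith("07")]

def run(run_stage_07=False):
    for script in stages_to_run(run_stage_07):
        # Each stage reads the previous stage's output files and writes its own.
        subprocess.run([sys.executable, script], check=True)
```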
-In terminal #1 (repo root):
+### 7.1 One-time schema migration (recommended before the first 2026 event run)
+
+From repo root:

```powershell
-npm run start
+npm run --workspace server migrate-match-schema
```

-### 7.2 Run analysis script
+The migration writes its report to:

-In terminal #2:
+- `server/static/match-schema-migration-report.json`
+
+### 7.2 Python setup (once per machine)

```powershell
cd ScoutingApp2026\data-analysis
python -m venv venv
.\venv\Scripts\Activate.ps1
pip install -r requirements.txt
-python export_csv.py
```

-### 7.3 Output locations
+### 7.3 Configure pipeline behavior
+
+Edit:

-Primary outputs go to:
+- `data-analysis/pipeline_config.json`

-- `data-analysis/output`
+Main knobs:

-Legacy CSV outputs are also written to:
+- `source.mode`: `mongo` or `fake`
+- `source.mongo_url` / `source.db`
+- `paths.output_dir`
+- `analysis.metrics` (enabled flags, weights, direction)
+- `analysis.timeline_bin_sec`
+- `fake_data.*` (including `run_stage_07` and `seed_mongo`)

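For orientation, a `pipeline_config.json` could look something like the sketch below. The key names follow the knob list above; every value (and the example metric name) is illustrative, not a documented default.

```json
{
  "source": {
    "mode": "mongo",
    "mongo_url": "mongodb://localhost:27017/",
    "db": "test"
  },
  "paths": {
    "output_dir": "output"
  },
  "analysis": {
    "metrics": {
      "auto_points": { "enabled": true, "weight": 1.0, "direction": "higher_is_better" }
    },
    "timeline_bin_sec": 5
  },
  "fake_data": {
    "run_stage_07": false,
    "seed_mongo": false
  }
}
```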
-- `data-analysis/match_raw_2026.csv`
-- `data-analysis/super_raw_2026.csv`
-- `data-analysis/pit_2026.csv`
-- `data-analysis/team_agg_2026.csv`
-- `data-analysis/metric_summary_2026.csv`
+### 7.4 Run the full pipeline (real Mongo data)

-### 7.4 Optional analysis flags
+Keep the server running in terminal #1:

```powershell
-python export_csv.py --mongo-url mongodb://localhost:27017/ --db test --output-dir .\output
+npm run start
```

-## 8) Generate Fake Data (for Testing Picklist/Recon)
+Then in terminal #2:

-There are two fake-data paths.
+```powershell
+cd ScoutingApp2026\data-analysis
+.\venv\Scripts\Activate.ps1
+python run_pipeline.py --source-mode mongo
+```

-### 8.1 Database fake scouting data (recommended)
+### 7.5 Run the full pipeline (fake source, with fake-generation stage)

-Populates Mongo collections with synthetic match/pit/leaderboard entries.
+```powershell
+cd ScoutingApp2026\data-analysis
+.\venv\Scripts\Activate.ps1
+python run_pipeline.py --source-mode fake --run-stage-07
+```

-From repo root:
+Optional: seed Mongo during stage 07:

```powershell
-npm run --workspace server gen-fake-data
+python run_pipeline.py --source-mode fake --run-stage-07 --seed-mongo
```

-Optional environment overrides (PowerShell examples):
+### 7.6 Pipeline outputs
+
+All outputs are written to `data-analysis/output` (or the configured `paths.output_dir`):
+
+- `00_pipeline_report.json`
+- `01_match_raw.csv`, `01_pit_raw.csv`, `01_raw_snapshot.json`
+- `02_match_clean.csv`, `02_pit_clean.csv`, `02_validation_report.csv`
+- `03_match_features.csv`, `03_timeseries_long.csv`, `03_auto_path_points.csv`
+- `04_team_aggregates.csv`
+- `05_picklist_scores.csv`, `05_metric_contributions.csv`
+- `06_picklist_payload.json`, `06_team_profiles.json`
+- `07_seed_report.json` (only when stage 07 runs)
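As a sanity check after a run, you can verify that the expected files landed in the output directory. This is a throwaway helper, not part of the pipeline; only the filenames come from the list above.

```python
from pathlib import Path

# Filenames as listed in section 7.6; stage 07's report is conditional.
EXPECTED_OUTPUTS = [
    "00_pipeline_report.json",
    "01_match_raw.csv", "01_pit_raw.csv", "01_raw_snapshot.json",
    "02_match_clean.csv", "02_pit_clean.csv", "02_validation_report.csv",
    "03_match_features.csv", "03_timeseries_long.csv", "03_auto_path_points.csv",
    "04_team_aggregates.csv",
    "05_picklist_scores.csv", "05_metric_contributions.csv",
    "06_picklist_payload.json", "06_team_profiles.json",
]

def missing_outputs(output_dir, ran_stage_07=False):
    """Return expected output files that are absent from output_dir."""
    expected = EXPECTED_OUTPUTS + (["07_seed_report.json"] if ran_stage_07 else [])
    out = Path(output_dir)
    return [name for name in expected if not (out / name).exists()]
```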
+
+The picklist app reads the analyzed payload from:
+
+- `data-analysis/output/06_picklist_payload.json`
+- API route: `GET /data/retrieve/analyzed`
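A minimal client-side sketch of fetching the analyzed payload from the route above; the base URL is an assumption, so adjust it to wherever the server actually listens.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: replace with your server's address

def analyzed_url(base=BASE_URL):
    """Build the URL for the analyzed-payload route."""
    return f"{base}/data/retrieve/analyzed"

def fetch_analyzed(base=BASE_URL):
    """GET the stage 06 picklist payload from a running server."""
    with urllib.request.urlopen(analyzed_url(base)) as resp:
        return json.load(resp)
```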
+
+### 7.7 Legacy command compatibility
+
+`python export_csv.py` now forwards to `run_pipeline.py` and uses the same config/flags.
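Such a forwarding shim can be as small as the sketch below; this is illustrative, and the real `export_csv.py` may differ.

```python
import sys

def forward(argv):
    """Build the command that re-runs the unified pipeline with the same flags."""
    return [sys.executable, "run_pipeline.py", *argv]

# export_csv.py would then subprocess.run(forward(sys.argv[1:]), check=True)
```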
+
+## 8) Fake Data Options
+
+### 8.1 Pipeline-native fake data (recommended)
+
+Use stage 07 directly:

```powershell
-$env:FAKE_MATCH_COUNT='80'
-$env:FAKE_TEAM_COUNT='40'
-$env:FAKE_SCOUTER_COUNT='16'
-$env:FAKE_CLEAR='true'
-$env:FAKE_INCLUDE_PIT='true'
-$env:FAKE_INCLUDE_LEADERBOARD='true'
-$env:FAKE_INCLUDE_AUTO_PATH='true'
-npm run --workspace server gen-fake-data
+cd data-analysis
+.\venv\Scripts\Activate.ps1
+python 07_seed_fake_data.py
```

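The kind of synthetic generation stage 07 performs can be sketched as below. The field names and score ranges are hypothetical, not the real `07_seed_fake_data.py` schema.

```python
import random

def fake_match_entry(team, match_number, rng=random):
    """One synthetic match-scouting row with made-up metric fields."""
    return {
        "team": team,
        "match": match_number,
        "auto_points": rng.randint(0, 20),     # hypothetical field
        "teleop_points": rng.randint(0, 40),   # hypothetical field
    }

def fake_event(teams, match_count):
    """One entry per team per match, in schedule order."""
    return [
        fake_match_entry(team, m)
        for m in range(1, match_count + 1)
        for team in teams
    ]
```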
-### 8.2 Static analysis JSON file
+Or via the orchestrator:

-Writes `server/static/output_analysis.json`.
+```powershell
+python run_pipeline.py --source-mode fake --run-stage-07
+```

-From repo root:
+### 8.2 Legacy server fake scripts (optional / dev-only)
+
+These still exist for server-side testing:

```powershell
+npm run --workspace server gen-fake-data
npm run --workspace server gen-fake-json
```

@@ -281,7 +327,10 @@ Writes: `client/src/assets/matchSchedule.json`

### 10.3 Generate team metadata/colors/avatars

-Requires `server/static/output_analysis.json` (generate with `gen-fake-json` or provide your own).
+Requires either:
+
+- `data-analysis/output/06_team_profiles.json` (preferred; generated by pipeline stage 06), or
+- `server/static/output_analysis.json` (legacy fallback).
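The preferred-then-fallback selection implied above can be sketched as a small helper; the function itself is hypothetical, only the two paths come from the list.

```python
from pathlib import Path

def team_info_source(repo_root="."):
    """Prefer the stage 06 profiles; fall back to the legacy analysis JSON."""
    preferred = Path(repo_root, "data-analysis", "output", "06_team_profiles.json")
    legacy = Path(repo_root, "server", "static", "output_analysis.json")
    for candidate in (preferred, legacy):
        if candidate.exists():
            return candidate
    raise FileNotFoundError(
        "run pipeline stage 06, or generate the legacy JSON with gen-fake-json"
    )
```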

```powershell
npm run --workspace server gen-team-info
@@ -328,17 +377,22 @@ npm run start
# Dev run
npm run dev

-# Analysis
+# Migration
+npm run --workspace server migrate-match-schema
+
+# Analysis pipeline (real data)
cd data-analysis
python -m venv venv
.\venv\Scripts\Activate.ps1
pip install -r requirements.txt
-python export_csv.py
+python run_pipeline.py --source-mode mongo

-# Fake data
+# Fake data (pipeline-native)
+python run_pipeline.py --source-mode fake --run-stage-07
+
+# Legacy fake-data scripts (optional)
cd ..
npm run --workspace server gen-fake-data
-npm run --workspace server gen-fake-json

# Event utilities
npm run --workspace server download-teams
@@ -370,7 +424,3 @@ npm run build --workspace database

- Ensure the backend is running (`npm run start`) or the Mongo container is up.
- Check the Mongo URL (`mongodb://localhost:27017/`).
-
-### `sendExport` script note
-
-- `server/scripts/sendExport.ts` currently connects to `mongodb://0.0.0.0:27107/` (a port typo vs `27017`); update that file before relying on this script.