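SELECT p.project_name, p.aircraft, p.year, COUNT(f.id) AS n_flights
FROM projects p
-- NOTE: p.id = f.id joins on the flights primary key; if flights stores its
-- project in a foreign-key column, join on that column instead.
JOIN flights f ON p.id = f.id
GROUP BY p.id
ORDER BY n_flights DESC;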
Priority projects:
| Project | Aircraft | Why |
|---|---|---|
| CGWAVES | GV (N677F) | Current baseline |
| GOTHIC/GOTHAAM | GV | Similar aircraft, different science |
| CAESAR | C-130 | Cross-platform transfer test |
| FRAPPE | C-130 | Boundary layer focus |
2.2 Cross-Project Variable Alignment
Find common variables:
# For each project, get variable sets
project_vars = {}
for project_id in [12, 15, 20]:  # Example IDs
    rows = db_conn.query("""
        SELECT vm.variable_name
        FROM variable_metadata vm
        JOIN variable_projects vp ON vm.id = vp.variable_id
        WHERE vp.project_id = %s
    """, (project_id,))
    project_vars[project_id] = set(row['variable_name'] for row in rows)

# Find intersection
common_vars = set.intersection(*project_vars.values())
print(f"Common variables: {len(common_vars)}")
Categorize variables:
Universal: Present in all projects (ATX, THETA, PSXC, MR)
Platform-specific: Present across projects, but value ranges differ by aircraft (e.g., TASX)
Campaign-specific: Only in certain projects (specialized instruments)
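The three categories above can be derived mechanically once the per-project variable sets are available. The following is a minimal sketch, not the project's actual logic: `categorize_variables`, the `platform_ranges` input, and the `rel_tol` threshold are all hypothetical, and the ranges and the `VAR_A` / `VAR_B` names are placeholders rather than measured values. It treats a variable as platform-specific when it appears in every project but its observed range width differs between aircraft by more than `rel_tol`.

```python
from typing import Dict, Set, Tuple

Range = Tuple[float, float]  # (min, max) observed for one variable


def categorize_variables(
    project_vars: Dict[int, Set[str]],
    platform_ranges: Dict[str, Dict[str, Range]],
    rel_tol: float = 0.25,  # hypothetical threshold on relative range difference
) -> Dict[str, Set[str]]:
    """Split variables into universal, platform-specific, and campaign-specific sets."""
    in_all = set.intersection(*project_vars.values())
    in_any = set.union(*project_vars.values())

    platform_specific = set()
    for var in in_all:
        # Width of the observed range on each platform that reports this variable
        spans = [
            hi - lo
            for lo, hi in (r[var] for r in platform_ranges.values() if var in r)
        ]
        if len(spans) > 1 and max(spans) > 0:
            if (max(spans) - min(spans)) / max(spans) > rel_tol:
                platform_specific.add(var)

    return {
        "universal": in_all - platform_specific,
        "platform_specific": platform_specific,
        "campaign_specific": in_any - in_all,
    }


# Placeholder inputs: IDs, ranges, and VAR_A / VAR_B are made up for illustration
project_vars = {
    12: {"ATX", "THETA", "TASX", "VAR_A"},
    15: {"ATX", "THETA", "TASX"},
    20: {"ATX", "THETA", "TASX", "VAR_B"},
}
platform_ranges = {
    "GV": {"ATX": (-70.0, 30.0), "TASX": (0.0, 250.0)},
    "C-130": {"ATX": (-60.0, 30.0), "TASX": (0.0, 150.0)},
}

categories = categorize_variables(project_vars, platform_ranges)
for name, members in categories.items():
    print(f"{name}: {sorted(members)}")
```

In practice the range comparison would more likely use the normalization statistics than raw min/max values, which are sensitive to outliers.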
2.3 Incremental Training Strategy
Option A: Sequential fine-tuning
1. Train on CGWAVES → baseline model
2. Fine-tune on GOTHIC (lower LR) → expanded model
3. Fine-tune on CAESAR → cross-platform model
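The control flow of Option A is a warm-started loop: each stage starts from the previous stage's weights and uses a smaller learning rate. The sketch below shows only that structure, on a toy one-variable linear model trained by plain gradient descent. The model, the synthetic data, the learning rates, and the `train_stage` helper are all stand-ins chosen so the example runs without a deep learning framework; none of them come from the actual training code.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair standing in for a real training window


def mse(w: float, b: float, data: List[Sample]) -> float:
    """Mean squared error of the toy model y ~ w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_stage(w: float, b: float, data: List[Sample],
                lr: float, epochs: int) -> Tuple[float, float]:
    """Full-batch gradient descent on MSE, starting from the given weights."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = (0.0, 0.5, 1.0, 1.5)
# (project, synthetic data, learning rate): later stages use a lower LR
stages = [
    ("CGWAVES", [(x, 2.0 * x + 1.0) for x in xs], 0.1),
    ("GOTHIC", [(x, 2.2 * x + 0.8) for x in xs], 0.02),
    ("CAESAR", [(x, 1.5 * x + 1.5) for x in xs], 0.02),
]

w, b = 0.0, 0.0  # random or zero init only for the first stage
history = []
for name, data, lr in stages:
    before = mse(w, b, data)
    w, b = train_stage(w, b, data, lr, epochs=200)  # warm start from previous stage
    after = mse(w, b, data)
    history.append((name, before, after))
    print(f"{name}: loss {before:.4f} -> {after:.4f} (lr={lr})")
```

One thing this structure does not guard against is forgetting: after the CAESAR stage, performance on CGWAVES should be re-checked before calling the result a cross-platform model.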
Option B: Joint training from scratch
1. Combine all project data into single dataset
2. Add project embedding token (like BERT's [CLS])
3. Train jointly with project-balanced sampling
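Project-balanced sampling in step 3 can be sketched as a two-level draw: pick a project uniformly, then pick a window uniformly within it, so that a large campaign does not dominate each batch. The window counts, the `project_balanced_draws` helper, and the `project_token` mapping below are assumptions for illustration; the integer in `project_token` is what would index a learned project-embedding table in step 2.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def project_balanced_draws(project_sizes: Dict[str, int], n_draws: int,
                           seed: int = 0) -> List[Tuple[str, int]]:
    """Return (project, window_index) pairs with each project equally likely."""
    rng = random.Random(seed)
    projects = sorted(project_sizes)
    draws = []
    for _ in range(n_draws):
        project = rng.choice(projects)                   # uniform over projects
        index = rng.randrange(project_sizes[project])    # uniform within project
        draws.append((project, index))
    return draws


# Hypothetical window counts per project (deliberately imbalanced)
project_sizes = {"CGWAVES": 50_000, "GOTHIC": 8_000, "CAESAR": 2_000}

# Integer ID per project, usable as an index into a project-embedding table
project_token = {name: i for i, name in enumerate(sorted(project_sizes))}

draws = project_balanced_draws(project_sizes, n_draws=3000)
counts = Counter(project for project, _ in draws)
print(counts)
```

The trade-off is that windows from the smallest project are revisited far more often than those from the largest, so overfitting on small campaigns is worth monitoring.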
Option C: Domain adversarial training
1. Add discriminator head predicting project/platform
2. Use gradient reversal to learn project-invariant features
3. Goal: Embeddings useful across projects, not memorizing project-specific patterns
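The gradient reversal step can be written as a set of update rules, following the standard domain-adversarial formulation. The symbols here are assumptions, not names from the project: $\theta_f$ are the shared encoder parameters, $\theta_t$ the main task head, $\theta_d$ the project/platform discriminator head, $L_{\text{task}}$ the main training loss, $L_{\text{dom}}$ the discriminator's classification loss, $\eta$ the learning rate, and $\lambda$ the reversal weight.

```latex
\begin{aligned}
\theta_f &\leftarrow \theta_f - \eta \left(
    \frac{\partial L_{\text{task}}}{\partial \theta_f}
    - \lambda \, \frac{\partial L_{\text{dom}}}{\partial \theta_f} \right) \\
\theta_t &\leftarrow \theta_t - \eta \, \frac{\partial L_{\text{task}}}{\partial \theta_t} \\
\theta_d &\leftarrow \theta_d - \eta \, \frac{\partial L_{\text{dom}}}{\partial \theta_d}
\end{aligned}
```

The sign flip on the $L_{\text{dom}}$ term in the first line is the whole mechanism: the discriminator head is trained to identify the project, while the encoder is pushed in the direction that makes that identification harder.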
2.4 Data Pipeline Updates
Update local preprocessing:
# In normalization stats cell, loop over projects:
for project_id in PROJECT_IDS:
    stats = compute_normalization_stats(db_conn, project_id, common_vars)
    # Save per-project stats
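To illustrate why the statistics are kept per project, here is a small self-contained sketch. It does not reproduce `compute_normalization_stats`; the `stats_for_project` and `normalize` helpers, the sample values, and the JSON layout are all hypothetical. The point it shows is that z-scoring each project with its own mean and standard deviation puts projects with different raw scales onto a comparable one.

```python
import json
import statistics
from typing import Dict, List


def stats_for_project(samples: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Mean and population std per variable; std falls back to 1.0 if constant."""
    out = {}
    for var, values in samples.items():
        std = statistics.pstdev(values)
        out[var] = {"mean": statistics.fmean(values), "std": std if std > 0 else 1.0}
    return out


def normalize(value: float, var: str, stats: Dict[str, Dict[str, float]]) -> float:
    """Z-score a single value using one project's statistics."""
    return (value - stats[var]["mean"]) / stats[var]["std"]


# Made-up samples: same variable, very different raw scale per project
samples_by_project = {
    12: {"ATX": [1.0, 2.0, 3.0]},
    15: {"ATX": [10.0, 20.0, 30.0]},
}

all_stats = {pid: stats_for_project(s) for pid, s in samples_by_project.items()}

z_12 = normalize(3.0, "ATX", all_stats[12])
z_15 = normalize(30.0, "ATX", all_stats[15])
print(f"z-scores: {z_12:.4f} vs {z_15:.4f}")

# One possible way to persist the stats; integer keys become strings in JSON
serialized = json.dumps(all_stats, indent=2)
```

Whether per-project or pooled statistics work better depends on the training option chosen: per-project scaling removes platform offsets, which helps transfer but also hides differences a model might otherwise learn from.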
Update cache export:
# Export windows for each project
for project_id in PROJECT_IDS:
    windows_path = f'windows_{project_id}.npz'
    # ... export logic