Open
Labels
- 1 - refactor: DRY conflict. Refactor required
- 2 - enhancement: A request or update to existing functionality
Description
Migrate AstroNet from TensorFlow/Keras to PyTorch whilst modernising the data pipeline with Polars and Apache Arrow for zero-copy operations.
Tracking Document: .github/issues/100/TODO.md
Key Objectives
- Data Pipeline: Pure Polars + Arrow with zero-copy tensor conversion (5-10x speedup target)
- Model Migration: Convert T2, Tinho, and ATX architectures to PyTorch with numerical verification
- Training: PyTorch Lightning with mixed precision and experiment tracking
- Deployment: ONNX export with quantization (2-5x inference speedup target)
- Testing: >80% coverage with ML-specific tests (behavioural, invariance, convergence)
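The speedup targets above only mean something if they are measured the same way each time. A minimal sketch of a timing helper, assuming the old and new loaders can each be wrapped in a zero-argument callable; `median_runtime`, `speedup`, and `meets_target` are hypothetical names, not part of AstroNet:

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], repeats: int = 5) -> float:
    """Run fn several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate runtime must be positive")
    return baseline_s / candidate_s


def meets_target(baseline_s: float, candidate_s: float, minimum: float = 5.0) -> bool:
    """True when the measured speedup reaches the lower end of the target range."""
    return speedup(baseline_s, candidate_s) >= minimum
```

The median is used rather than the mean so that a single slow run (cold cache, background load) does not distort the comparison. The default `minimum=5.0` mirrors the lower bound of the 5-10x data loading target; the inference target would use `minimum=2.0`.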
Migration Phases
- Phase 0: Bridge layer and migration verification
- Phase 1: Data pipeline modernisation
- Phase 2: Model architecture migration
- Phase 3: Training infrastructure with Lightning
- Phase 4: Hyperparameter optimisation
- Phase 5: Inference and export
- Phase 6: Testing and documentation
- Phase 7: Validation and optimisation
- Phase 8: Production readiness [Optional]
Success Criteria
- All architectures numerically equivalent to TF baseline (within 1%)
- Data loading 5-10x faster (benchmarked)
- Inference 2-5x faster with ONNX + quantization
- Test coverage >80%
- Zero pandas dependencies in core code
- Complete migration guide and ADRs
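The first criterion can be checked mechanically once "within 1%" is pinned down. The sketch below assumes it means an element-wise relative error of at most 0.01 against the TF outputs, which is one reading and not confirmed by the issue; the function names and the sample values are illustrative only:

```python
from typing import Sequence


def max_relative_error(
    reference: Sequence[float],
    candidate: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Largest element-wise relative error of candidate measured against reference."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    return max(
        (abs(c - r) / max(abs(r), eps) for r, c in zip(reference, candidate)),
        default=0.0,
    )


def is_equivalent(
    reference: Sequence[float],
    candidate: Sequence[float],
    rel_tol: float = 0.01,
) -> bool:
    """True when every candidate value is within rel_tol of the reference value."""
    return max_relative_error(reference, candidate) <= rel_tol


# Made-up class probabilities standing in for TF and PyTorch outputs on one input.
tf_probs = [0.70, 0.20, 0.10]
torch_probs = [0.702, 0.199, 0.0995]
print(is_equivalent(tf_probs, torch_probs))
```

In practice the same inputs and copied weights would be fed to both models, with the outputs flattened to plain lists before comparison. Relative error is unstable for values near zero, so an absolute tolerance may be the better choice for small probabilities.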
Related Issues
To be populated as phase-specific issues are created
References
- Migration plan: .github/100/TODO.md
- Current baseline: Tinho log loss ≤0.450, accuracy ≥78.6%
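The baseline figures can serve as a regression gate for the migrated Tinho model. A rough sketch, assuming the intent is that the PyTorch model must not fall behind these numbers and that predictions arrive as per-class probability rows; all names here are hypothetical:

```python
import math
from typing import Sequence

BASELINE_LOG_LOSS = 0.450  # Tinho baseline from this issue (upper bound)
BASELINE_ACCURACY = 0.786  # Tinho baseline from this issue (lower bound)


def log_loss(y_true: Sequence[int], probs: Sequence[Sequence[float]], eps: float = 1e-15) -> float:
    """Mean multi-class log loss; y_true holds class indices, probs holds probability rows."""
    total = 0.0
    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


def accuracy(y_true: Sequence[int], probs: Sequence[Sequence[float]]) -> float:
    """Fraction of rows whose highest-probability class matches the label."""
    hits = sum(
        1
        for label, row in zip(y_true, probs)
        if max(range(len(row)), key=row.__getitem__) == label
    )
    return hits / len(y_true)


def meets_baseline(y_true: Sequence[int], probs: Sequence[Sequence[float]]) -> bool:
    """True when both metrics are at least as good as the recorded baseline."""
    return (
        log_loss(y_true, probs) <= BASELINE_LOG_LOSS
        and accuracy(y_true, probs) >= BASELINE_ACCURACY
    )
```

A check like this only makes sense on the same held-out split that produced the baseline figures; the issue does not say which split that was.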