[META] PyTorch Migration & Modernisation #100

@tallamjr

Description

Migrate AstroNet from TensorFlow/Keras to PyTorch whilst modernising the data pipeline with Polars and Apache Arrow for zero-copy operations.

Tracking Document: .github/issues/100/TODO.md

Key Objectives

  • Data Pipeline: Pure Polars + Arrow with zero-copy tensor conversion (5-10x speedup target)
  • Model Migration: Convert T2, Tinho, and ATX architectures to PyTorch with numerical verification
  • Training: PyTorch Lightning with mixed precision and experiment tracking
  • Deployment: ONNX export with quantization (2-5x inference speedup target)
  • Testing: >80% coverage with ML-specific tests (behavioural, invariance, convergence)

Migration Phases

  • Phase 0: Bridge layer and migration verification
  • Phase 1: Data pipeline modernisation
  • Phase 2: Model architecture migration
  • Phase 3: Training infrastructure with Lightning
  • Phase 4: Hyperparameter optimisation
  • Phase 5: Inference and export
  • Phase 6: Testing and documentation
  • Phase 7: Validation and optimisation
  • Phase 8: Production readiness [Optional]

Success Criteria

  • All architectures numerically equivalent to TF baseline (within 1%)
  • Data loading 5-10x faster (benchmarked)
  • Inference 2-5x faster with ONNX + quantization
  • Test coverage >80%
  • Zero pandas dependencies in core code
  • Complete migration guide and ADRs
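
One possible reading of the "within 1%" criterion is an element-wise relative error between TF and PyTorch outputs for the same inputs. The sketch below assumes that reading; the function names, the `eps` guard, and the sample arrays are illustrative only.

```python
import numpy as np


def max_relative_error(reference, candidate, eps: float = 1e-8) -> float:
    """Largest element-wise relative deviation of candidate from reference."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError("reference and candidate must have the same shape")
    # Guard against division by zero for reference values at or near zero.
    denom = np.maximum(np.abs(reference), eps)
    return float(np.max(np.abs(candidate - reference) / denom))


def within_tolerance(reference, candidate, rtol: float = 0.01) -> bool:
    """True when every element deviates by at most rtol (1% by default)."""
    return max_relative_error(reference, candidate) <= rtol


# Made-up class probabilities standing in for TF and PyTorch outputs.
tf_probs = [[0.7000, 0.2000, 0.1000]]
torch_probs_ok = [[0.7005, 0.2000, 0.0995]]
torch_probs_drifted = [[0.6800, 0.2100, 0.1100]]

ok = within_tolerance(tf_probs, torch_probs_ok)
drifted = within_tolerance(tf_probs, torch_probs_drifted)
```

Relative error is strict for small probabilities, so if the criterion is meant to apply to aggregate metrics such as log loss instead, the same helper can be called on those scalar values.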

Related Issues

To be populated as phase-specific issues are created

References

  • Migration plan: .github/100/TODO.md
  • Current baseline: Tinho log loss ≤0.450, accuracy ≥78.6%
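
The baseline figures above could serve as a regression gate for the migrated Tinho model. The following sketch assumes a plain unweighted multiclass log loss and that the thresholds are meant as pass/fail limits; the project's actual metric definition may differ, and all names here are hypothetical.

```python
import numpy as np

MAX_LOG_LOSS = 0.450  # baseline from this issue
MIN_ACCURACY = 0.786  # baseline from this issue


def log_loss(probs, labels, eps: float = 1e-15) -> float:
    """Unweighted multiclass log loss for integer class labels."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0)
    labels = np.asarray(labels)
    # Pick the predicted probability of the true class for each sample.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))


def accuracy(probs, labels) -> float:
    """Fraction of samples whose arg-max class matches the label."""
    predicted = np.argmax(np.asarray(probs), axis=1)
    return float(np.mean(predicted == np.asarray(labels)))


def meets_baseline(probs, labels) -> bool:
    """True when both metrics are at least as good as the baseline."""
    return (
        log_loss(probs, labels) <= MAX_LOG_LOSS
        and accuracy(probs, labels) >= MIN_ACCURACY
    )


# Made-up predictions for four samples and two classes.
probs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.6, 0.4]]
passes = meets_baseline(probs, [0, 1, 0, 0])
fails = meets_baseline(probs, [1, 0, 1, 1])
```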

Labels

  • 1 - refactor: DRY conflict. Refactor required
  • 2 - enhancement: A request or update to existing functionality
