Integrate Deep Learning Models into Monthly Forecasting Workflow #47
Summary
Integrate deep learning models into the monthly discharge forecasting workflow with a clean, maintainable structure that makes it easy to add new models while preserving compatibility with existing workflows.
Background
The current codebase has legacy deep learning code in `deep_scr/` that needs to be restructured and integrated into the production workflow. The goal is to create a unified interface for both traditional ML models (`SciRegressor`) and deep learning models, supporting both standalone forecasting and meta-learning approaches.
Objectives
- **Restructure Deep Learning Components**: Move `deep_scr/` code into a proper production structure under `forecast_models/deep_models/`
- **Unified Interface**: Create deep learning models that inherit from `BaseForecastModel` with the same interface as `SciRegressor`
- **Multiple Architectures**: Support LSTM, CNN-LSTM, TiDE, TSMixer, and Mamba models
- **Enhanced Dataset**: Generic dataset supporting a multi-input structure with NaN handling
- **Easy Extensibility**: Plugin-style architecture for adding new model types
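The plugin-style extensibility mentioned above could, for example, be realized with a small registry of architecture classes. Everything in this sketch (`MODEL_REGISTRY`, `register_architecture`, `build_architecture`, the `LSTMForecaster` stub) is hypothetical and illustrative, not taken from the existing codebase.

```python
# Hypothetical sketch of a plugin-style model registry. A new architecture
# becomes available to the workflow by decorating its class; no dispatch
# code elsewhere has to change.
MODEL_REGISTRY = {}

def register_architecture(name):
    """Decorator that registers an architecture class under a string key."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

@register_architecture("lstm")
class LSTMForecaster:
    """Stand-in for a real architecture; only stores a hyperparameter."""
    def __init__(self, hidden_size=64):
        self.hidden_size = hidden_size

def build_architecture(name, **kwargs):
    """Look up a registered architecture by name and instantiate it."""
    if name not in MODEL_REGISTRY:
        raise KeyError(f"Unknown architecture '{name}'; known: {sorted(MODEL_REGISTRY)}")
    return MODEL_REGISTRY[name](**kwargs)
```

A configuration file could then select a model purely by its registered name, which is what "models can be easily added through configuration" implies.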
Technical Requirements
Data Structure (at prediction time t):
- `x_past`: (batch, past_time_steps, past_features) - past discharge, P, T, past predictions
- `x_nan_mask`: (batch, past_time_steps, past_features) - binary mask for missing features
- `x_future`: (batch, future_time_steps, future_vars) - weather forecast, temporal features
- `x_now`: (batch, 1, now_vars) - current predictions/errors from other models
- `x_static`: static basin features
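As a concrete illustration of this multi-input structure, one batch could be a dict of arrays with the shapes above. The dimension sizes below are made up for the example; the NaN-handling convention (mask first, then zero-fill) is one plausible choice, not the project's confirmed behavior.

```python
import numpy as np

# Illustrative sizes only: batch=8, 12 past steps, 5 past features,
# 6 future steps, 3 future vars, 4 "now" vars, 7 static features.
batch, t_past, f_past = 8, 12, 5
t_fut, f_fut, f_now, f_static = 6, 3, 4, 7

x_past = np.random.randn(batch, t_past, f_past)
x_past[0, :3, 0] = np.nan                         # e.g. missing early discharge
x_nan_mask = np.isnan(x_past).astype(np.float32)  # 1 where a value is missing
x_past = np.nan_to_num(x_past, nan=0.0)           # zero-fill after masking

sample = {
    "x_past": x_past,                                  # (8, 12, 5)
    "x_nan_mask": x_nan_mask,                          # (8, 12, 5)
    "x_future": np.random.randn(batch, t_fut, f_fut),  # (8, 6, 3)
    "x_now": np.random.randn(batch, 1, f_now),         # (8, 1, 4)
    "x_static": np.random.randn(batch, f_static),      # (8, 7)
}
```

Passing the mask alongside the zero-filled values lets the network learn to distinguish "genuinely zero" from "missing".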
Core Components Needed:
- `DeepRegressor` class inheriting from `BaseForecastModel`
- `DeepMetaLearner` for meta-learning approaches
- Generic dataset classes with configurable NaN handling
- Neural network architectures (LSTM, CNN-LSTM, TiDE, TSMixer, Mamba)
- Loss functions (Quantile, Asymmetric Laplace)
- PyTorch Lightning base modules and training utilities
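Of the loss functions listed, the quantile (pinball) loss is the simplest to sketch. This is the standard definition, shown in plain NumPy for clarity rather than as the project's actual PyTorch implementation.

```python
import numpy as np

def quantile_loss(y_true, y_pred, q):
    """Pinball loss for a single quantile q in (0, 1).

    Under-prediction is penalized with weight q and over-prediction with
    weight (1 - q), so minimizing the loss yields the q-th conditional
    quantile of the target distribution.
    """
    error = y_true - y_pred
    return np.mean(np.maximum(q * error, (q - 1) * error))
```

Training one head per quantile (e.g. q = 0.1, 0.5, 0.9) gives a simple predictive interval around the median discharge forecast.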
Integration Requirements:
- LOOCV implementation (yearly cross-validation)
- Hyperparameter tuning with Optuna
- Model saving/loading functionality
- Operational prediction pipeline
- Compatibility with the existing `FeatureExtractor` and evaluation pipeline
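The LOOCV requirement (yearly cross-validation) amounts to holding out one year at a time. The function below is a minimal sketch of that split logic; the name `yearly_loocv_splits` and the interface are assumptions for illustration, not the existing implementation.

```python
# Leave-one-year-out cross-validation: each distinct year in turn is the
# test fold, and all remaining years form the training fold.
def yearly_loocv_splits(years):
    """Yield (train_years, test_year) pairs, one per distinct year."""
    unique_years = sorted(set(years))
    for test_year in unique_years:
        train_years = [y for y in unique_years if y != test_year]
        yield train_years, test_year
```

An Optuna study would then score each hyperparameter trial by averaging the error over these folds, keeping model selection honest for small multi-year datasets.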
Proposed Structure
monthly_forecasting/
├── forecast_models/
│ ├── deep_models/ # NEW: Deep learning models
│ │ ├── deep_regressor.py # Main forecaster class
│ │ ├── deep_meta_learner.py # Meta-learning variant
│ │ ├── architectures/ # Neural network architectures
│ │ │ ├── lstm_models.py
│ │ │ ├── cnn_lstm_models.py
│ │ │ ├── tide_models.py
│ │ │ ├── tsmixer_models.py
│ │ │ └── mamba_models.py
│ │ ├── losses/ # Loss functions
│ │ └── utils/ # Training utilities
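The `deep_regressor.py` module in the tree above could follow a skeleton like this. The `fit`/`predict` method names are an assumption about the `BaseForecastModel` interface in `monthly_forecasting/forecast_models/base_class.py`, and the base class here is a stand-in, not the real one.

```python
# Hypothetical skeleton for deep_regressor.py.
class BaseForecastModel:            # stand-in for the existing base class
    def fit(self, X, y): ...
    def predict(self, X): ...

class DeepRegressor(BaseForecastModel):
    """Wraps a neural architecture behind the SciRegressor-style interface."""

    def __init__(self, architecture="lstm", **hparams):
        self.architecture = architecture
        self.hparams = hparams
        self.model = None

    def fit(self, X, y):
        # In the real implementation this would build the chosen
        # architecture and train it with PyTorch Lightning.
        self.model = (self.architecture, self.hparams)
        return self

    def predict(self, X):
        if self.model is None:
            raise RuntimeError("Call fit() before predict().")
        return [0.0 for _ in X]     # placeholder predictions
```

Because the wrapper exposes the same surface as `SciRegressor`, the existing evaluation pipeline would not need to distinguish deep models from traditional ones.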
Implementation Plan
Phase 1: Core Infrastructure
- Set up project structure and dependencies
- Implement base deep learning classes
- Create generic dataset classes
- Develop PyTorch Lightning base modules
Phase 2: Model Architectures
- Implement LSTM variants
- Add CNN-LSTM models
- Integrate TiDE architecture
- Add TSMixer and Mamba models
Phase 3: Integration & Testing
- Integrate with existing workflow
- Implement LOOCV and hyperparameter tuning
- Create comprehensive test suite
- Performance benchmarking
Phase 4: Documentation & Examples
- Configuration examples
- Usage documentation
- Architecture extension guide
Success Criteria
- Deep learning models follow the same interface as the existing `SciRegressor`
- Models can be easily added through configuration
- Full integration with existing evaluation pipeline
- Comprehensive test coverage
- Performance comparable to or better than existing models
Dependencies
- PyTorch
- PyTorch Lightning
- Additional deep learning libraries as needed
References
- Detailed plan: `scratchpads/planning/deep-learning-integration-comprehensive-plan.md`
- Existing code: `monthly_forecasting/deep_scr/`
- Current interface: `monthly_forecasting/forecast_models/base_class.py`
🤖 Generated with Claude Code