This document outlines the development roadmap for SciGo, a fast, scikit-learn-compatible machine learning library for Go. Our goal is to reach v1.0.0 with a stable, production-ready API and comprehensive machine learning capabilities.
- v0.x releases: Active development, API may change
- v1.0.0: API stability guarantee, backward compatibility commitment
- Semantic Versioning: releases follow semver.org strictly
- Linear Models
  - LinearRegression (QR decomposition)
  - SGDRegressor / SGDClassifier
  - PassiveAggressive algorithms
- Preprocessing
  - StandardScaler
  - MinMaxScaler
  - OneHotEncoder
- Clustering
  - MiniBatch K-Means
- Tree Models
  - LightGBM inference (Python model compatibility)
- Advanced Features
  - Full scikit-learn API compatibility
  - Online learning / Streaming support
  - gRPC/Protobuf support
  - Model serialization framework
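To make "scikit-learn API compatibility" concrete, here is a minimal sketch of the fit/transform convention expressed in Go, using a standard scaler as the example. The `Transformer` interface, the `StandardScaler` struct, and their method signatures are assumptions made for illustration; they are not taken from SciGo's source and may differ from its actual API.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Transformer is a hypothetical interface mirroring scikit-learn's
// fit/transform convention. It is NOT SciGo's actual API.
type Transformer interface {
	Fit(X [][]float64) error
	Transform(X [][]float64) ([][]float64, error)
}

// StandardScaler (illustrative) rescales each column to zero mean and
// unit variance, using the population standard deviation.
type StandardScaler struct {
	Mean  []float64
	Scale []float64
}

// Fit learns the per-column mean and standard deviation from X.
func (s *StandardScaler) Fit(X [][]float64) error {
	if len(X) == 0 || len(X[0]) == 0 {
		return errors.New("empty input")
	}
	nFeatures := len(X[0])
	n := float64(len(X))
	mean := make([]float64, nFeatures)
	scale := make([]float64, nFeatures)
	for _, row := range X {
		if len(row) != nFeatures {
			return errors.New("rows have differing lengths")
		}
		for j, v := range row {
			mean[j] += v
		}
	}
	for j := range mean {
		mean[j] /= n
	}
	for _, row := range X {
		for j, v := range row {
			d := v - mean[j]
			scale[j] += d * d
		}
	}
	for j := range scale {
		scale[j] = math.Sqrt(scale[j] / n)
		if scale[j] == 0 {
			scale[j] = 1 // constant column: avoid division by zero
		}
	}
	s.Mean, s.Scale = mean, scale
	return nil
}

// Transform applies the learned statistics to X and returns a new matrix.
func (s *StandardScaler) Transform(X [][]float64) ([][]float64, error) {
	if s.Mean == nil {
		return nil, errors.New("scaler is not fitted")
	}
	out := make([][]float64, len(X))
	for i, row := range X {
		if len(row) != len(s.Mean) {
			return nil, errors.New("feature count mismatch")
		}
		out[i] = make([]float64, len(row))
		for j, v := range row {
			out[i][j] = (v - s.Mean[j]) / s.Scale[j]
		}
	}
	return out, nil
}

func main() {
	var t Transformer = &StandardScaler{}
	X := [][]float64{{1, 10}, {2, 10}, {3, 10}}
	if err := t.Fit(X); err != nil {
		panic(err)
	}
	Z, err := t.Transform(X)
	if err != nil {
		panic(err)
	}
	fmt.Println(Z)
}
```

Returning an `error` from `Fit` and `Transform` is one idiomatic Go substitute for the exceptions scikit-learn raises on bad input; other designs are possible.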
- LogisticRegression
  - Binary and multiclass classification
  - L1/L2 regularization
  - Solver options (lbfgs, liblinear, newton-cg)
- DecisionTree
  - DecisionTreeClassifier
  - DecisionTreeRegressor
  - Feature importance calculation
  - Tree visualization support
- RandomForest
  - RandomForestClassifier
  - RandomForestRegressor
  - Out-of-bag score
  - Parallel tree training
- SVM (Support Vector Machines)
  - SVC (Support Vector Classifier)
  - SVR (Support Vector Regressor)
  - Kernel support (linear, rbf, poly, sigmoid)
- XGBoost
  - XGBClassifier
  - XGBRegressor
  - Python model compatibility (.json, .ubj formats)
  - Native Go training implementation
  - GPU acceleration support (optional)
- LightGBM Training
  - Native Go training implementation
  - Full feature parity with Python API
  - Categorical feature support
  - Early stopping
  - Custom objective functions
- Performance Improvements
  - SIMD optimizations
  - Memory pooling
  - Parallel processing enhancements
- Benchmarking Suite
  - Comprehensive performance tests
  - Comparison with scikit-learn
  - Memory profiling
- API Review
  - Final API adjustments
  - Deprecation of experimental features
  - Interface stabilization
- Documentation
  - Complete API documentation
  - Migration guides
  - Best practices guide
- Testing
  - 90%+ test coverage
  - Integration tests
  - Cross-compatibility tests with Python
- API Stability Guarantee
  - Backward compatibility commitment
  - Long-term support (LTS) declaration
  - Deprecation policy establishment
- Production Features
  - Model versioning
  - A/B testing support
  - Monitoring and metrics
- Ecosystem
  - Plugin system
  - Community contributions guide
  - Partner integrations
| Feature | Priority | Complexity | Status |
|---|---|---|---|
| LogisticRegression | High | Medium | Planned (v0.4.0) |
| DecisionTree | High | Medium | Planned (v0.4.0) |
| RandomForest | High | High | Planned (v0.5.0) |
| SVM | Medium | High | Planned (v0.5.0) |
| XGBoost | High | Very High | Planned (v0.6.0) |
| LightGBM Training | High | High | Planned (v0.7.0) |
| Neural Networks | Low | Very High | Post-v1.0.0 |
| Deep Learning | Low | Very High | Post-v1.0.0 |
- Before v1.0.0 (v0.x releases)
  - APIs may change between minor versions
  - Breaking changes will be documented in the CHANGELOG
  - Deprecation warnings are issued for at least one version before removal
- From v1.0.0 onward
  - No breaking changes in minor or patch releases
  - Deprecated features are maintained for at least 3 minor versions
  - Clear migration paths are provided for any future major version
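The deprecation commitments above map onto Go's standard doc-comment convention: a paragraph beginning with `Deprecated:` is recognized by tools such as `go doc` and `staticcheck`. The sketch below shows one way a renamed function could stay available during a deprecation window. The names `Accuracy` and `AccuracyScore` are hypothetical and chosen only for illustration; they are not taken from SciGo.

```go
package main

import "fmt"

// AccuracyScore returns the fraction of predictions that match the true
// labels. It returns 0 for empty or mismatched inputs.
// (Hypothetical replacement API, for illustration only.)
func AccuracyScore(yTrue, yPred []int) float64 {
	if len(yTrue) == 0 || len(yTrue) != len(yPred) {
		return 0
	}
	correct := 0
	for i := range yTrue {
		if yTrue[i] == yPred[i] {
			correct++
		}
	}
	return float64(correct) / float64(len(yTrue))
}

// Accuracy is the old name for AccuracyScore.
//
// Deprecated: use AccuracyScore instead. This wrapper is kept so that
// existing callers continue to compile during the deprecation window.
func Accuracy(yTrue, yPred []int) float64 {
	return AccuracyScore(yTrue, yPred)
}

func main() {
	yTrue := []int{1, 0, 1, 1}
	yPred := []int{1, 0, 0, 1}
	// Old and new entry points return the same value.
	fmt.Println(Accuracy(yTrue, yPred) == AccuracyScore(yTrue, yPred))
}
```

Keeping the old function as a thin wrapper means behavior cannot drift between the two names while both exist.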
We welcome contributions! Priority areas for community help:
- Algorithm Implementations
  - Help implement planned algorithms
  - Optimize existing implementations
- Testing
  - Increase test coverage
  - Add benchmark tests
  - Cross-validation with scikit-learn
- Documentation
  - Improve examples
  - Add tutorials
  - Translate documentation
- Performance
  - SIMD optimizations
  - GPU acceleration
  - Memory optimizations
- Performance: 2-5x faster than scikit-learn for common operations
- Compatibility: 100% API compatibility for implemented features
- Quality: 90%+ test coverage, zero critical bugs
- Adoption: 1000+ GitHub stars, 50+ production deployments
- Community: 20+ active contributors
| Risk | Mitigation Strategy |
|---|---|
| API design flaws discovered late | Extensive testing in v0.x releases |
| Performance regression | Continuous benchmarking CI |
| Breaking scikit-learn compatibility | Automated compatibility tests |
| Slow adoption | Focus on documentation and examples |
| Technical debt | Regular refactoring sprints |
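As an illustration of what an automated compatibility test might look like, the sketch below compares a Go implementation against reference values stored in a JSON fixture, within a tolerance. Everything here is an assumption: the fixture format, the `checkAgainstFixture` helper, and the `minMaxScale` function are hypothetical, and the fixture values are hand-computed stand-ins for numbers that a real test would export from Python.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// fixture is a hypothetical format for reference values exported from Python.
type fixture struct {
	Input    []float64 `json:"input"`
	Expected []float64 `json:"expected"`
	Tol      float64   `json:"tol"`
}

// goldenJSON is a hand-written stand-in for an exported fixture file.
// The expected values are min-max scaling of the input to [0, 1].
const goldenJSON = `{"input": [1, 2, 3], "expected": [0, 0.5, 1], "tol": 1e-9}`

// minMaxScale is the (illustrative) Go implementation under test.
func minMaxScale(xs []float64) []float64 {
	out := make([]float64, len(xs))
	if len(xs) == 0 {
		return out
	}
	lo, hi := xs[0], xs[0]
	for _, x := range xs {
		lo = math.Min(lo, x)
		hi = math.Max(hi, x)
	}
	if hi == lo {
		return out // constant input maps to all zeros
	}
	for i, x := range xs {
		out[i] = (x - lo) / (hi - lo)
	}
	return out
}

// checkAgainstFixture decodes the fixture, runs impl on its input, and
// returns an error describing the first value outside the tolerance.
func checkAgainstFixture(raw string, impl func([]float64) []float64) error {
	var f fixture
	if err := json.Unmarshal([]byte(raw), &f); err != nil {
		return fmt.Errorf("decode fixture: %w", err)
	}
	got := impl(f.Input)
	if len(got) != len(f.Expected) {
		return fmt.Errorf("length mismatch: got %d, want %d", len(got), len(f.Expected))
	}
	for i := range got {
		if math.Abs(got[i]-f.Expected[i]) > f.Tol {
			return fmt.Errorf("index %d: got %g, want %g", i, got[i], f.Expected[i])
		}
	}
	return nil
}

func main() {
	if err := checkAgainstFixture(goldenJSON, minMaxScale); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Comparing within a tolerance rather than for exact equality allows for small floating-point differences between the Go and Python implementations.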
- v0.4.0: ~2-3 weeks (LogisticRegression, DecisionTree)
- v0.5.0: ~2-3 weeks (RandomForest, SVM)
- v0.6.0: ~3-4 weeks (XGBoost)
- v0.7.0: ~2-3 weeks (LightGBM Training)
- v0.8.0: ~1-2 weeks (Performance)
- v0.9.0: ~1-2 weeks (API Finalization)
- v1.0.0: ~2-3 months from now (estimated Q2 2025)
- Project Lead: Yuminosuke Sato
- GitHub: https://github.com/YuminosukeSato/scigo
- Issues: GitHub Issues
- Discussions: GitHub Discussions
This roadmap is subject to change based on community feedback and project priorities. Last updated: 2025-08-07