Best Performance · Polynomial Features · Gradient Descent · Code Coverage

| Pure Implementation | Multiple Algorithms | Advanced Features | Detailed Logs |
|---|---|---|---|
| Built from scratch using only NumPy | Batch, SGD & Mini-Batch GD | Polynomial features & L1 reg | Complete failure-to-success journey |
```mermaid
graph LR
    A[Load Data] --> B[Feature Engineering]
    B --> C[Normalization]
    C --> D[Train Model]
    D --> E{Choose Method}
    E -->|Batch GD| F[R²: 95.84%]
    E -->|Stochastic GD| G[R²: 98.50%]
    E -->|Mini-Batch GD| H[R²: 98.74%]
    F --> I[Evaluate]
    G --> I
    H --> I
    I --> J[Predictions]
    style A fill:#e1f5ff
    style H fill:#90EE90
    style J fill:#FFD700
```
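The Normalization stage in the pipeline above is typically z-score standardization. A minimal NumPy sketch under that assumption (the repository's `data_preprocessing.py` may differ):

```python
import numpy as np

def standardize(X_train, X_test):
    # Fit mean/std on the training split only, then apply to both splits
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    return (X_train - mu) / sigma, (X_test - mu) / sigma
```

Fitting the mean and standard deviation on the training split alone avoids leaking test-set statistics into training.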
- Features
- Quick Start
- Installation
- Usage Examples
- Project Structure
- The Journey
- Performance Metrics
- Mathematical Foundation
- Visualizations
- Tech Stack
- Contributing
- License
```bash
# 1. Clone the repository
git clone https://github.com/willow788/Linear-Regression-model-from-scratch.git
cd Linear-Regression-model-from-scratch

# 2. Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Run the model
python main.py

# That's it! Your model is training!
```

**Docker Quick Start** (click to expand)
```bash
# Build the image
docker build -t linear-regression .

# Run the container
docker run -it -p 8888:8888 linear-regression

# Or use docker-compose
docker-compose up
```

**Basic usage:**

```python
from linear_regression import LinearRegression
from data_preprocessing import load_and_preprocess_data

# Load your data
X_train, X_test, y_train, y_test = load_and_preprocess_data('Advertising.csv')

# Create and train model
model = LinearRegression(
    learn_rate=0.02,
    iter=50000,
    method='batch',
    l1_reg=0.1
)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)
print(f"Model R² Score: {model.evaluate(y_test, predictions):.4f}")
```

**Compare gradient descent methods:**

```python
methods = {
    'Batch GD': {'method': 'batch', 'iter': 50000},
    'Stochastic GD': {'method': 'stochastic', 'iter': 50},
    'Mini-Batch GD': {'method': 'mini-batch', 'iter': 1000, 'batch_size': 16}
}

for name, params in methods.items():
    model = LinearRegression(learn_rate=0.01, **params)
    model.fit(X_train, y_train)
    score = calculate_r2(y_test, model.predict(X_test))
    print(f"{name}: R² = {score:.4f}")
```

**Cross-validation:**

```python
from model_evaluation import cross_validation_score

# Perform 5-fold cross-validation
cv_score = cross_validation_score(X, y, k=5)
print(f"Cross-Validated R² Score: {cv_score:.4f}")
```

**Visualization:**

```python
from visualization import (
    plot_loss_convergence,
    plot_residuals,
    plot_actual_vs_predicted
)

# Plot loss over iterations
plot_loss_convergence(model.loss_history)

# Analyze residuals
plot_residuals(y_test, predictions)

# Compare actual vs predicted
plot_actual_vs_predicted(y_test, predictions)
```

```
Linear-Regression-model-from-scratch/
│
├── Version-1/                      # Initial experiments
│   ├── experiment_log.txt          # The negative R² saga
│   └── Raw jupyter Notebook/
│
├── Version-2/                      # Feature engineering
│   ├── experiment_log.txt
│   └── Raw jupyter Notebook/
│
├── Version-3/                      # Normalization fixes
│   ├── experiment_log.txt
│   └── Raw jupyter Notebook/
│
├── Version-9/                      # Production ready!
│   ├── Raw jupyter Notebook/
│   │   └── sales.ipynb             # Complete analysis
│   └── Python Files/
│       ├── data_preprocessing.py   # Data pipeline
│       ├── linear_regression.py    # Core model
│       ├── model_evaluation.py     # Metrics & CV
│       ├── visualization.py        # Plotting utils
│       ├── main.py                 # Main script
│       └── config.py               # Configuration
│
├── tests/                          # Test suite
│   ├── test_linear_regression.py
│   ├── test_data_preprocessing.py
│   ├── test_model_evaluation.py
│   ├── test_visualization.py
│   ├── test_integration.py
│   └── conftest.py
│
├── outputs/                        # Generated visualizations
│   ├── loss_convergence.png
│   ├── residual_plot.png
│   ├── correlation_matrix.png
│   ├── actual_vs_predicted.png
│   └── feature_importance.png
│
├── Advertising.csv                 # Dataset
├── requirements.txt                # Dependencies
├── requirements-dev.txt            # Dev dependencies
├── Dockerfile                      # Container config
├── docker-compose.yml              # Orchestration
├── Makefile                        # Utility commands
├── README.md                       # You are here!
├── INSTALL.md                      # Installation guide
└── LICENSE                         # MIT License
```
| Version | R² Score | Key Learnings |
|---|---|---|
| Version 1: The Crisis | -18.77 | Problems discovered. Breakthrough: "Failure teaches more than success ever could" |
| Version 2: Engineering | ~0.60 | Feature-engineering improvements |
| Version 3: Refinement | ~0.85 | Normalization progress |
| Version 9: Production | **0.9874** | Final optimizations |
**R² Score Evolution:** -18.77 (V1) → ~0.60 (V2) → ~0.85 (V3) → 0.9874 (V9)
| Method | Test R² | Train R² | RMSE | MAE | Training Time |
|---|---|---|---|---|---|
| Batch GD | 0.9584 | 0.9509 | 0.2249 | 0.1533 | ~45s |
| Stochastic GD | 0.9850 | 0.9848 | 0.1352 | 0.1118 | ~5s |
| Mini-Batch GD | **0.9874** | 0.9860 | 0.1238 | 0.1011 | ~12s |
| Fold | R² Score | Status |
|:---:|:---:|:---:|
| 1 | 0.9870 | ✅ |
| 2 | 0.9860 | ✅ |
| 3 | 0.9925 | ✅ |
| 4 | 0.9867 | ✅ |
| 5 | 0.9690 | ✅ |
| **Mean** | **0.9842** | ✅ |
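The R², RMSE, and MAE figures above can be computed with NumPy alone; the helpers below are an illustrative sketch, not necessarily the project's `model_evaluation` API:

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R² = 1 - SS_res / SS_tot
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    # Root mean squared error
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    # Mean absolute error
    return np.mean(np.abs(y_true - y_pred))
```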
**Linear Regression Equation**

$$\hat{\mathbf{y}} = \mathbf{X}\mathbf{w} + b$$

Where:
- $\hat{\mathbf{y}}$ = predicted values
- $\mathbf{X}$ = feature matrix
- $\mathbf{w}$ = weight vector
- $b$ = bias term

**Loss Function (with L1 Regularization)**

$$J(\mathbf{w}, b) = \frac{1}{2m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2 + \lambda\sum_{j}|w_j|$$

Where:
- $m$ = number of samples
- $\lambda$ = L1 regularization parameter
**Gradient Descent Update Rules** (click to expand)

Weight update:

$$\mathbf{w} \leftarrow \mathbf{w} - \alpha\left(\frac{1}{m}\mathbf{X}^\top(\hat{\mathbf{y}} - \mathbf{y}) + \lambda\,\text{sign}(\mathbf{w})\right)$$

Bias update:

$$b \leftarrow b - \alpha \cdot \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)$$

Parameters:

- $\alpha$ = learning rate
- $\lambda$ = L1 regularization parameter
- $\text{sign}(\mathbf{w})$ = sign function for the L1 penalty
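The update rules can be sketched in NumPy as a single batch step (an illustration of the math above, not the repository's exact `fit` implementation):

```python
import numpy as np

def gd_step(X, y, w, b, alpha=0.01, lam=0.1):
    """One batch gradient-descent step with an L1 penalty."""
    m = X.shape[0]
    y_hat = X @ w + b                               # predictions
    error = y_hat - y
    grad_w = X.T @ error / m + lam * np.sign(w)     # dJ/dw, L1 term included
    grad_b = error.mean()                           # dJ/db
    return w - alpha * grad_w, b - alpha * grad_b
```

Repeating this step drives the loss toward its minimum; the stochastic and mini-batch variants differ only in how many rows of `X` each step sees.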
**Polynomial Feature Expansion** (click to expand)

Original features: TV, Radio, Newspaper

Expanded to 9 features:

| Feature # | Expression | Description |
|---|---|---|
| 1 | TV | Original TV budget |
| 2 | Radio | Original Radio budget |
| 3 | Newspaper | Original Newspaper budget |
| 4 | TV² | Quadratic TV effect |
| 5 | Radio² | Quadratic Radio effect |
| 6 | Newspaper² | Quadratic Newspaper effect |
| 7 | TV × Radio | Interaction effect |
| 8 | TV × Newspaper | Interaction effect |
| 9 | Radio × Newspaper | Interaction effect |
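The expansion takes only a few NumPy lines; `expand_features` and its column order are illustrative, not necessarily what `data_preprocessing.py` uses:

```python
import numpy as np

def expand_features(X):
    """Expand an (n, 3) budget matrix into 9 columns:
    originals, squares, and pairwise interaction products."""
    tv, radio, news = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([
        tv, radio, news,                  # originals
        tv**2, radio**2, news**2,         # quadratic terms
        tv * radio, tv * news, radio * news,  # interactions
    ])
```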
- **Loss Convergence**: smooth convergence to the global minimum
- **Residual Plot**: random scatter indicates a good fit
- **Actual vs Predicted**: points close to the diagonal line
- **Correlation Matrix**: feature relationships visualized
| Attribute | Details |
|---|---|
| Source | Kaggle / UCI ML Repository |
| Samples | 200 observations |
| Features | TV, Radio, Newspaper (advertising budgets in $1000s) |
| Target | Sales (in $1000s of units) |
| Quality | No missing values |
| Correlation | TV (0.78), Radio (0.58), Newspaper (0.23) with Sales |
**Sample Data Preview** (click to expand)

```
     TV  Radio  Newspaper  Sales
0  230.1   37.8       69.2   22.1
1   44.5   39.3       45.1   10.4
2   17.2   45.9       69.3    9.3
3  151.5   41.3       58.5   18.5
4  180.8   10.8       58.4   12.9
```
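The correlations quoted in the dataset table above can be reproduced from the CSV columns with a plain Pearson helper (illustrative; the repository may compute these differently):

```python
import numpy as np

def pearson(x, y):
    # Pearson correlation coefficient between two 1-D arrays
    x, y = x - x.mean(), y - y.mean()
    return (x @ y) / np.sqrt((x @ x) * (y @ y))
```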
- **L2 Regularization (Ridge)**
  - Compare with L1
  - Implement Elastic Net (L1 + L2)
- **Adaptive Learning Rates**
  - Adam optimizer
  - RMSprop
  - Learning rate scheduling
- **Automated Hyperparameter Tuning**
  - Grid Search
  - Random Search
  - Bayesian Optimization
- **Extended Dataset Support**
  - Boston Housing
  - California Housing
  - Custom datasets
- **Web Interface**
  - Interactive predictions
  - Real-time visualization
  - Model playground
- **API Development**
  - REST API with FastAPI
  - Model serving
  - Deployment pipeline
- **Educational Content**
  - Step-by-step tutorials
  - Video explanations
  - Blog posts
```bash
# Installation
make install        # Install production dependencies
make install-dev    # Install dev dependencies

# Testing
make test           # Run all tests
make test-cov       # Run tests with coverage report

# Code Quality
make lint           # Run linters
make format         # Format code with black

# Running
make run            # Run main script
make jupyter        # Start Jupyter notebook

# Docker
make docker-build   # Build Docker image
make docker-run     # Run Docker container

# Cleanup
make clean          # Remove generated files
```
**Found a bug? Have an idea? Want to contribute?**
```bash
# 1. Fork the repository

# 2. Clone your fork
git clone https://github.com/YOUR_USERNAME/Linear-Regression-model-from-scratch.git

# 3. Create a feature branch
git checkout -b feature/AmazingFeature

# 4. Make your changes and commit
git commit -m 'Add some AmazingFeature'

# 5. Push to your branch
git push origin feature/AmazingFeature

# 6. Open a Pull Request
```

Please ensure:

- ✅ Code passes all tests (`pytest`)
- ✅ Code is formatted (`make format`)
- ✅ Documentation is updated
- ✅ Commit messages are descriptive
- Dataset
- Inspiration
- Tools
- Community
**Built with passion and ❤️ by willow788**

*Learning by doing, one gradient descent at a time*




