Best Performance · Polynomial Features · Gradient Descent · Code Coverage

| Pure Implementation | Multiple Algorithms | Advanced Features | Detailed Logs |
|---|---|---|---|
| Built from scratch using only NumPy | Batch, SGD & Mini-Batch GD | Polynomial features & L1 reg | Complete failure-to-success journey |
```mermaid
graph LR
    A[Load Data] --> B[Feature Engineering]
    B --> C[Normalization]
    C --> D[Train Model]
    D --> E{Choose Method}
    E -->|Batch GD| F[R²: 95.84%]
    E -->|Stochastic GD| G[R²: 98.50%]
    E -->|Mini-Batch GD| H[R²: 98.74%]
    F --> I[Evaluate]
    G --> I
    H --> I
    I --> J[Predictions]
    style A fill:#e1f5ff
    style H fill:#90EE90
    style J fill:#FFD700
```
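The Normalization stage in the pipeline above is typically z-score standardization. A minimal NumPy sketch under that assumption (the repository's `data_preprocessing.py` may differ):

```python
import numpy as np

def standardize(X_train, X_test):
    # Fit mean/std on the training split only, then apply to both splits
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    return (X_train - mu) / sigma, (X_test - mu) / sigma
```

Fitting the mean and standard deviation on the training split alone avoids leaking test-set statistics into training.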
- Features
- Quick Start
- Installation
- Usage Examples
- Project Structure
- The Journey
- Performance Metrics
- Mathematical Foundation
- Visualizations
- Tech Stack
- Contributing
- License
```bash
# 1. Clone the repository
git clone https://github.com/willow788/Linear-Regression-model-from-scratch.git
cd Linear-Regression-model-from-scratch

# 2. Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Run the model
python main.py

# That's it! Your model is training!
```

**Docker Quick Start** (click to expand)
```bash
# Build the image
docker build -t linear-regression .

# Run the container
docker run -it -p 8888:8888 linear-regression

# Or use docker-compose
docker-compose up
```

**Basic usage:**

```python
from linear_regression import LinearRegression
from data_preprocessing import load_and_preprocess_data

# Load your data
X_train, X_test, y_train, y_test = load_and_preprocess_data('Advertising.csv')

# Create and train model
model = LinearRegression(
    learn_rate=0.02,
    iter=50000,
    method='batch',
    l1_reg=0.1
)
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)
print(f"Model R² Score: {model.evaluate(y_test, predictions):.4f}")
```

**Compare gradient descent methods:**

```python
methods = {
    'Batch GD': {'method': 'batch', 'iter': 50000},
    'Stochastic GD': {'method': 'stochastic', 'iter': 50},
    'Mini-Batch GD': {'method': 'mini-batch', 'iter': 1000, 'batch_size': 16}
}

for name, params in methods.items():
    model = LinearRegression(learn_rate=0.01, **params)
    model.fit(X_train, y_train)
    score = calculate_r2(y_test, model.predict(X_test))
    print(f"{name}: R² = {score:.4f}")
```

**Cross-validation:**

```python
from model_evaluation import cross_validation_score

# Perform 5-fold cross-validation
cv_score = cross_validation_score(X, y, k=5)
print(f"Cross-Validated R² Score: {cv_score:.4f}")
```

**Visualization:**

```python
from visualization import (
    plot_loss_convergence,
    plot_residuals,
    plot_actual_vs_predicted
)

# Plot loss over iterations
plot_loss_convergence(model.loss_history)

# Analyze residuals
plot_residuals(y_test, predictions)

# Compare actual vs predicted
plot_actual_vs_predicted(y_test, predictions)
```

```
Linear-Regression-model-from-scratch/
│
├── Version-1/                      # Initial experiments
│   ├── experiment_log.txt          # The negative R² saga
│   └── Raw jupyter Notebook/
│
├── Version-2/                      # Feature engineering
│   ├── experiment_log.txt
│   └── Raw jupyter Notebook/
│
├── Version-3/                      # Normalization fixes
│   ├── experiment_log.txt
│   └── Raw jupyter Notebook/
│
├── Version-9/                      # Production ready!
│   ├── Raw jupyter Notebook/
│   │   └── sales.ipynb             # Complete analysis
│   └── Python Files/
│       ├── data_preprocessing.py   # Data pipeline
│       ├── linear_regression.py    # Core model
│       ├── model_evaluation.py     # Metrics & CV
│       ├── visualization.py        # Plotting utils
│       ├── main.py                 # Main script
│       └── config.py               # Configuration
│
├── tests/                          # Test suite
│   ├── test_linear_regression.py
│   ├── test_data_preprocessing.py
│   ├── test_model_evaluation.py
│   ├── test_visualization.py
│   ├── test_integration.py
│   └── conftest.py
│
├── outputs/                        # Generated visualizations
│   ├── loss_convergence.png
│   ├── residual_plot.png
│   ├── correlation_matrix.png
│   ├── actual_vs_predicted.png
│   └── feature_importance.png
│
├── Advertising.csv                 # Dataset
├── requirements.txt                # Dependencies
├── requirements-dev.txt            # Dev dependencies
├── Dockerfile                      # Container config
├── docker-compose.yml              # Orchestration
├── Makefile                        # Utility commands
├── README.md                       # You are here!
├── INSTALL.md                      # Installation guide
└── LICENSE                         # MIT License
```
| Version | R² Score | Key Learnings |
|---|---|---|
| Version 1: The Crisis | -18.77 | Problems discovered. Breakthrough: "Failure teaches more than success ever could" |
| Version 2: Engineering | ~0.60 | Feature-engineering improvements |
| Version 3: Refinement | ~0.85 | Normalization progress |
| Version 9: Production | **0.9874** | Final optimizations |
**R² Score Evolution:** -18.77 (V1) → ~0.60 (V2) → ~0.85 (V3) → 0.9874 (V9)
| Method | Test R² | Train R² | RMSE | MAE | Training Time |
|---|---|---|---|---|---|
| Batch GD | 0.9584 | 0.9509 | 0.2249 | 0.1533 | ~45s |
| Stochastic GD | 0.9850 | 0.9848 | 0.1352 | 0.1118 | ~5s |
| Mini-Batch GD | **0.9874** | 0.9860 | 0.1238 | 0.1011 | ~12s |
| Fold | R² Score | Status |
|:---:|:---:|:---:|
| 1 | 0.9870 | ✅ |
| 2 | 0.9860 | ✅ |
| 3 | 0.9925 | ✅ |
| 4 | 0.9867 | ✅ |
| 5 | 0.9690 | ✅ |
| **Mean** | **0.9842** | ✅ |
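The R², RMSE, and MAE figures above can be computed with NumPy alone; the helpers below are an illustrative sketch, not necessarily the project's `model_evaluation` API:

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R² = 1 - SS_res / SS_tot
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    # Root mean squared error
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    # Mean absolute error
    return np.mean(np.abs(y_true - y_pred))
```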
**Linear Regression Equation**

$$\hat{\mathbf{y}} = \mathbf{X}\mathbf{w} + b$$

Where:
- $\hat{\mathbf{y}}$ = predicted values
- $\mathbf{X}$ = feature matrix
- $\mathbf{w}$ = weight vector
- $b$ = bias term

**Loss Function (with L1 Regularization)**

$$J(\mathbf{w}, b) = \frac{1}{2m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2 + \lambda\sum_{j}|w_j|$$

Where:
- $m$ = number of samples
- $\lambda$ = L1 regularization parameter
**Gradient Descent Update Rules** (click to expand)

Weight update:

$$\mathbf{w} \leftarrow \mathbf{w} - \alpha\left(\frac{1}{m}\mathbf{X}^\top(\hat{\mathbf{y}} - \mathbf{y}) + \lambda\,\text{sign}(\mathbf{w})\right)$$

Bias update:

$$b \leftarrow b - \alpha \cdot \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)$$

Parameters:

- $\alpha$ = learning rate
- $\lambda$ = L1 regularization parameter
- $\text{sign}(\mathbf{w})$ = sign function for the L1 penalty
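The update rules can be sketched in NumPy as a single batch step (an illustration of the math above, not the repository's exact `fit` implementation):

```python
import numpy as np

def gd_step(X, y, w, b, alpha=0.01, lam=0.1):
    """One batch gradient-descent step with an L1 penalty."""
    m = X.shape[0]
    y_hat = X @ w + b                               # predictions
    error = y_hat - y
    grad_w = X.T @ error / m + lam * np.sign(w)     # dJ/dw, L1 term included
    grad_b = error.mean()                           # dJ/db
    return w - alpha * grad_w, b - alpha * grad_b
```

Repeating this step drives the loss toward its minimum; the stochastic and mini-batch variants differ only in how many rows of `X` each step sees.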
**Polynomial Feature Expansion** (click to expand)

Original features: TV, Radio, Newspaper

Expanded to 9 features:

| Feature # | Expression | Description |
|---|---|---|
| 1 | TV | Original TV budget |
| 2 | Radio | Original Radio budget |
| 3 | Newspaper | Original Newspaper budget |
| 4 | TV² | Quadratic TV effect |
| 5 | Radio² | Quadratic Radio effect |
| 6 | Newspaper² | Quadratic Newspaper effect |
| 7 | TV × Radio | Interaction effect |
| 8 | TV × Newspaper | Interaction effect |
| 9 | Radio × Newspaper | Interaction effect |
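The expansion takes only a few NumPy lines; `expand_features` and its column order are illustrative, not necessarily what `data_preprocessing.py` uses:

```python
import numpy as np

def expand_features(X):
    """Expand an (n, 3) budget matrix into 9 columns:
    originals, squares, and pairwise interaction products."""
    tv, radio, news = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([
        tv, radio, news,                  # originals
        tv**2, radio**2, news**2,         # quadratic terms
        tv * radio, tv * news, radio * news,  # interactions
    ])
```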
- **Loss Convergence**: smooth convergence to the global minimum
- **Residual Plot**: random scatter indicates a good fit
- **Actual vs Predicted**: points close to the diagonal line
- **Correlation Matrix**: feature relationships visualized
| Attribute | Details |
|---|---|
| Source | Kaggle / UCI ML Repository |
| Samples | 200 observations |
| Features | TV, Radio, Newspaper (advertising budgets in $1000s) |
| Target | Sales (in $1000s of units) |
| Quality | No missing values |
| Correlation | TV (0.78), Radio (0.58), Newspaper (0.23) with Sales |
**Sample Data Preview** (click to expand)

```
     TV  Radio  Newspaper  Sales
0  230.1   37.8       69.2   22.1
1   44.5   39.3       45.1   10.4
2   17.2   45.9       69.3    9.3
3  151.5   41.3       58.5   18.5
4  180.8   10.8       58.4   12.9
```
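The correlations quoted in the dataset table above can be reproduced from the CSV columns with a plain Pearson helper (illustrative; the repository may compute these differently):

```python
import numpy as np

def pearson(x, y):
    # Pearson correlation coefficient between two 1-D arrays
    x, y = x - x.mean(), y - y.mean()
    return (x @ y) / np.sqrt((x @ x) * (y @ y))
```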
- **L2 Regularization (Ridge)**
  - Compare with L1
  - Implement Elastic Net (L1 + L2)
- **Adaptive Learning Rates**
  - Adam optimizer
  - RMSprop
  - Learning rate scheduling
- **Automated Hyperparameter Tuning**
  - Grid Search
  - Random Search
  - Bayesian Optimization
- **Extended Dataset Support**
  - Boston Housing
  - California Housing
  - Custom datasets
- **Web Interface**
  - Interactive predictions
  - Real-time visualization
  - Model playground
- **API Development**
  - REST API with FastAPI
  - Model serving
  - Deployment pipeline
- **Educational Content**
  - Step-by-step tutorials
  - Video explanations
  - Blog posts
```bash
# Installation
make install        # Install production dependencies
make install-dev    # Install dev dependencies

# Testing
make test           # Run all tests
make test-cov       # Run tests with coverage report

# Code Quality
make lint           # Run linters
make format         # Format code with black

# Running
make run            # Run main script
make jupyter        # Start Jupyter notebook

# Docker
make docker-build   # Build Docker image
make docker-run     # Run Docker container

# Cleanup
make clean          # Remove generated files
```
**Found a bug? Have an idea? Want to contribute?**
```bash
# 1. Fork the repository

# 2. Clone your fork
git clone https://github.com/YOUR_USERNAME/Linear-Regression-model-from-scratch.git

# 3. Create a feature branch
git checkout -b feature/AmazingFeature

# 4. Make your changes and commit
git commit -m 'Add some AmazingFeature'

# 5. Push to your branch
git push origin feature/AmazingFeature

# 6. Open a Pull Request
```

Please ensure:

- ✅ Code passes all tests (`pytest`)
- ✅ Code is formatted (`make format`)
- ✅ Documentation is updated
- ✅ Commit messages are descriptive
- Dataset
- Inspiration
- Tools
- Community
**Built with passion and ❤️ by willow788**

*Learning by doing, one gradient descent at a time*




