
Implementation of logistic regression from scratch, without using external libraries like scikit-learn. Here we write all the code to train and validate the model, and compare the weights and results with the standard sklearn model for verification.

Rohan-Thoma/Logistic-regression-from-scratch-without-using-sklearn


How to implement logistic regression from scratch with L2 regularization without using sklearn



🎯 Objective

To train and test a logistic regression model without using any external machine learning libraries

➡️ Logistic regression is one of the first machine learning algorithms anyone learns about when getting started with AI and ML. Understanding this algorithm inside out is essential for gauging the depth of knowledge of a machine learning engineer or a data scientist.
➡️ Knowing how to implement it from scratch is very useful when deploying machine learning models in production environments, where we cannot depend on external libraries that are not optimized to handle real-world data in its raw state. We also need to know the algorithm inside out to rewrite it in languages such as C++ that are faster than Python.
➡️ In this notebook we will implement logistic regression step by step and compare our results and weights with scikit-learn's to verify the correctness of the solution.

☕ Procedure

🔶 1. Creating a custom dataset

As this is an implementation walkthrough, we will just create a dummy classification dataset for this notebook. After creating the dataset, we will also split it into train and test sets.
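A minimal sketch of this step, assuming sklearn's `make_classification` is used for data generation only (the model itself is still written from scratch); the sample counts and feature counts here are illustrative, not the notebook's exact values:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# dummy binary-classification dataset; parameter values are illustrative
X, y = make_classification(n_samples=5000, n_features=15, n_informative=10,
                           n_redundant=5, random_state=15)

# hold out 25% of the points as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=15)
```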

🔶 2. Training the sklearn model for comparison of results

Here we train the sklearn LR model to get the weights and biases, and save them for later comparison with our custom implementation.

🔶 3. Implement logistic regression with L2 regularization without using sklearn

🏋️‍♂️ 3.1 Initialize the weights

Here we write a function to initialize the weights and the bias term.
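A minimal sketch of such an initializer, assuming zero initialization (small random values would also work); `dim` is the number of features:

```python
import numpy as np

def initialize_weights(dim):
    # start the weight vector and bias at zero; gradient descent
    # will move them toward the optimum during training
    w = np.zeros(dim)
    b = 0.0
    return w, b
```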

💻 3.2 Compute the sigmoid function

$\sigma(z) = \frac{1}{1 + e^{-z}}$
Here we write a function that implements the sigmoid using the formula above.
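A sketch of the sigmoid; the clip on `z` is an added numerical-safety assumption (it prevents overflow in `np.exp` for very large negative inputs) and is not part of the formula itself:

```python
import numpy as np

def sigmoid(z):
    # 1 / (1 + e^{-z}), computed elementwise; clipping z keeps
    # np.exp from overflowing for extreme inputs
    z = np.clip(z, -500, 500)
    return 1.0 / (1.0 + np.exp(-z))
```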

📉 3.3 Compute log-loss

$\text{log loss} = -\frac{1}{n}\sum_{i=1}^{n}\left(y_i\log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)\right)$
Here we write a function to compute the log-loss value with the formula above, using the natural logarithm (as scikit-learn's `log_loss` does).
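A sketch of the log-loss; the `eps` clipping is an added assumption that keeps the logarithms finite when a prediction is exactly 0 or 1:

```python
import numpy as np

def logloss(y_true, y_pred, eps=1e-15):
    # clip predictions away from 0 and 1 so np.log stays finite
    y_pred = np.clip(y_pred, eps, 1 - eps)
    # mean negative log-likelihood over all points (natural log)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```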

🛼 3.4 Compute the gradient wrt 'w' (weights)

$dw^{(t)} = x_n\left(y_n - \sigma\big((w^{(t)})^{T} x_n + b^{(t)}\big)\right) - \frac{\lambda}{N}w^{(t)}$
Here we write a function to compute the gradient with respect to the weights using the formula above.
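The per-point formula above can be sketched as follows; here `alpha` plays the role of $\lambda$ and `N` is the training-set size. Note the formula is the ascent direction of the regularized likelihood, so it is added to the weights during training:

```python
import numpy as np

def gradient_dw(x, y, w, b, alpha, N):
    # sigma(w^T x + b) for a single data point
    sig = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))
    # x * (y - sigma(...)) minus the L2 term (alpha / N) * w
    return x * (y - sig) - (alpha / N) * w
```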

⚽ 3.5 Compute the gradient wrt 'b' (bias)

$db^{(t)} = y_n - \sigma\big((w^{(t)})^{T} x_n + b^{(t)}\big)$
Here we write a function to compute the gradient with respect to the bias term using the formula above.
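The bias gradient is simply the residual of the sigmoid prediction for one point, sketched as:

```python
import numpy as np

def gradient_db(x, y, w, b):
    # sigma(w^T x + b) for a single data point
    sig = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))
    # y - sigma(...), the prediction residual
    return y - sig
```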

🖖 3.6 Prediction function

Here we write a function which takes in the model object and the dataset and outputs the results in the form of class labels for all the given data points.
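A minimal sketch of such a prediction function, assuming the trained weights `w` and bias `b` are passed in directly rather than wrapped in a model object; the 0.5 threshold is the usual default:

```python
import numpy as np

def predict(w, b, X, threshold=0.5):
    # probability of class 1 for every row of X
    probs = 1.0 / (1.0 + np.exp(-(np.dot(X, w) + b)))
    # threshold the probabilities into hard 0/1 class labels
    return (probs >= threshold).astype(int)
```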

🔶 4. Implementing the LR and training the model with the gradient descent algorithm

Here we write a function train(), which is very similar to fit() in sklearn. It takes in the train and test datasets along with the weight and regularization-parameter initializations. It also takes the number of epochs, and updates the weights and biases after each epoch of training with the gradient descent algorithm. We also store the train and test losses for each epoch to plot later for visualization.
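The training loop above can be sketched as follows, stitching the per-point gradient formulas together; the toy dataset and the `eta0`/`alpha`/`epochs` values at the bottom are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logloss(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def train(X_train, y_train, X_test, y_test, epochs=50, alpha=0.0001, eta0=0.0001):
    # zero-initialized weights and bias, updated point by point (SGD)
    w, b = np.zeros(X_train.shape[1]), 0.0
    N = len(X_train)
    train_losses, test_losses = [], []
    for epoch in range(epochs):
        for x, y in zip(X_train, y_train):
            sig = sigmoid(np.dot(w, x) + b)
            # ascend the regularized likelihood: w += eta0 * dw, b += eta0 * db
            w = w + eta0 * (x * (y - sig) - (alpha / N) * w)
            b = b + eta0 * (y - sig)
        # record the log-loss on both splits after each epoch for plotting
        train_losses.append(logloss(y_train, sigmoid(np.dot(X_train, w) + b)))
        test_losses.append(logloss(y_test, sigmoid(np.dot(X_test, w) + b)))
    return w, b, train_losses, test_losses

# toy run on a tiny linearly separable dataset
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1, 1, 0, 0])
w, b, train_losses, test_losses = train(X, y, X, y, epochs=20, alpha=0.0, eta0=0.1)
```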

🔶 5. Comparison between the weights of sklearn and our custom implementation

Here we simply print the values of the weights and biases of both implementations, and then print the differences between the values to show that they are negligible and close to zero.
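The comparison printout can be sketched like this; the weight values below are hypothetical stand-ins (in the notebook they come from `clf.coef_` and the custom `train()` output):

```python
import numpy as np

# hypothetical trained weights standing in for the two models
w_sklearn = np.array([0.912, -0.423, 0.265])
w_custom = np.array([0.910, -0.421, 0.264])

diff = np.abs(w_sklearn - w_custom)
print("sklearn weights:", w_sklearn)
print("custom weights :", w_custom)
print("max abs difference:", diff.max())
```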

🔶 6. Plot of train and test losses vs epochs to check for convergence

Here we take the arrays that we saved while training, which contain the log-loss values of the train and test datasets, and plot them to visualize convergence.
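A sketch of the convergence plot; the decaying dummy arrays stand in for the per-epoch losses saved by `train()`, and the `Agg` backend line is only there so the sketch runs headless (drop it in a notebook):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

# dummy decaying values standing in for the losses saved during training
train_losses = [0.69 * (0.9 ** i) for i in range(50)]
test_losses = [0.70 * (0.9 ** i) for i in range(50)]

plt.plot(train_losses, label="train log-loss")
plt.plot(test_losses, label="test log-loss")
plt.xlabel("epoch")
plt.ylabel("log-loss")
plt.title("Convergence of gradient descent")
plt.legend()
plt.savefig("loss_vs_epochs.png")
```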

😎🤟🏻Some useful reference links:

🔗 Connect with me on 🤝 LinkedIn: https://linkedin.com/in/rohan-vailala-thoma
💼 Check out my other case study blogs 😎🤟🏻: https://medium.com/@rohanvailalathoma
