To train and test a logistic regression model without using any external libraries
➡️ Logistic regression is one of the first machine learning algorithms anyone learns when getting started with AI and ML. Understanding this algorithm inside out is important for gauging the depth of knowledge of a machine learning engineer or data scientist.
➡️ Knowing how to implement it from scratch is very useful when deploying machine learning models to production, where we cannot always depend on external libraries that are not optimized to handle real-world data in its raw state. We also need to know the algorithm inside out to rewrite it in faster languages such as C++.
➡️ In this notebook we will implement logistic regression step by step and compare our results and weights with scikit-learn's to check the credibility of the solution.
As this is an implementation walkthrough, we will simply create a dummy classification dataset for this notebook. After creating the dataset, we will split it into train and test sets.
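A minimal sketch of that cell, assuming scikit-learn's `make_classification` and `train_test_split`; the dataset sizes and random seeds here are illustrative assumptions, not the notebook's originals:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Dummy binary-classification dataset (all parameters are assumptions)
X, y = make_classification(n_samples=50000, n_features=15, n_informative=10,
                           n_redundant=5, random_state=15)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=15)
```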
Here we train the sklearn LR model to get the weights and biases and save them for later comparison with our custom implementation.
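One plausible version of this cell, assuming an SGD-trained logistic regression with log loss and L2 regularization so that it matches the custom gradient-descent updates below; the choice of `SGDClassifier` and all hyperparameters are assumptions:

```python
from sklearn.linear_model import SGDClassifier

# loss='log_loss' makes this logistic regression trained by SGD
# (on scikit-learn < 1.1 the same loss is spelled 'log')
clf = SGDClassifier(loss='log_loss', alpha=0.0001, eta0=0.0001,
                    learning_rate='constant', random_state=15)
clf.fit(X_train, y_train)

sklearn_weights = clf.coef_.ravel()   # saved for the comparison later
sklearn_bias = clf.intercept_[0]
```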
Here we write a function to initialize the weights and the bias term.
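A sketch of the initializer; zero initialization is an assumption, but it is a safe one here because the log-loss objective is convex:

```python
def initialize_weights(dim):
    """Return a zero weight vector of the given dimension and a zero bias."""
    w = np.zeros(dim)
    b = 0.0
    return w, b
```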
Here we write a function to implement the sigmoid function using the formula given above.
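The formula itself is not reproduced in this text, but the standard sigmoid is σ(z) = 1 / (1 + e^(−z)), which would make the cell something like:

```python
def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))
```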
Here we write a function to compute the log-loss value using the formula given above.
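Assuming the usual binary log-loss, −mean(y·log(p) + (1−y)·log(1−p)), a sketch with clipping added so log(0) never occurs:

```python
def logloss(y_true, y_proba):
    # -mean( y*log(p) + (1-y)*log(1-p) ), with probabilities clipped
    eps = 1e-15
    y_proba = np.clip(y_proba, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_proba)
                    + (1.0 - y_true) * np.log(1.0 - y_proba))
```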
Here we write a function to compute the gradient with respect to the weights using the formula given above.
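The per-point gradient for L2-regularized logistic regression is commonly written dw = x·(y − σ(wᵀx + b)) − (λ/N)·w; assuming that convention (an ascent direction, so the update adds it), the cell might be:

```python
def gradient_dw(x, y, w, b, alpha, N):
    # dw = x * (y - sigmoid(w.x + b)) - (alpha / N) * w
    # alpha is the L2 regularization strength, N the train-set size
    return x * (y - sigmoid(np.dot(w, x) + b)) - (alpha / N) * w
```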
Here we write a function to compute the gradient with respect to the bias term using the formula given above.
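Under the same convention, the bias gradient simply drops the regularization term:

```python
def gradient_db(x, y, w, b):
    # db = y - sigmoid(w.x + b)
    return y - sigmoid(np.dot(w, x) + b)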
Here we write a function which takes in the model (the learned weights and bias) and the dataset, and outputs class labels for all the given data points.
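A sketch that thresholds the predicted probability at 0.5:

```python
def predict(w, b, X):
    # Probability of class 1 for every row, then threshold at 0.5
    proba = sigmoid(np.dot(X, w) + b)
    return (proba >= 0.5).astype(int)
```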
Here we write a train() function, very similar to fit() in sklearn. It takes in the train and test datasets along with the weight and regularization-parameter initializations and the number of epochs, and updates the weights and bias after each epoch of training using the gradient descent algorithm. We also store the train and test losses for each epoch to plot later for visualization.
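Putting the pieces together, a minimal sketch of such a train() function; it uses SGD-style per-point updates inside each epoch (an assumption, chosen to match the per-point gradient formulas above), and records the losses once per epoch:

```python
def train(X_train, y_train, X_test, y_test, epochs, alpha, eta0):
    # alpha: L2 regularization strength, eta0: learning rate (assumed names)
    N = len(X_train)
    w, b = initialize_weights(X_train.shape[1])
    train_losses, test_losses = [], []
    for epoch in range(epochs):
        for x, y in zip(X_train, y_train):
            w = w + eta0 * gradient_dw(x, y, w, b, alpha, N)
            b = b + eta0 * gradient_db(x, y, w, b)
        # log-loss on both sets after each epoch, for the convergence plot
        train_losses.append(logloss(y_train, sigmoid(np.dot(X_train, w) + b)))
        test_losses.append(logloss(y_test, sigmoid(np.dot(X_test, w) + b)))
    return w, b, train_losses, test_losses
```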
Here we simply print the values of the weights and biases from both implementations, and then print the differences between them to show that they are negligible and close to zero.
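Continuing with the names introduced in the sketches above (the epoch count and hyperparameters remain placeholders):

```python
w, b, train_losses, test_losses = train(X_train, y_train, X_test, y_test,
                                        epochs=50, alpha=0.0001, eta0=0.0001)

print('custom  weights:', w, ' bias:', b)
print('sklearn weights:', sklearn_weights, ' bias:', sklearn_bias)
print('difference     :', w - sklearn_weights, b - sklearn_bias)
```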
Here we take the arrays saved during training, which contain the log-loss values for the train and test datasets, and plot them to visualize the convergence.
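A matplotlib sketch of that plot, using the loss arrays returned by the train() sketch above:

```python
import matplotlib.pyplot as plt

epoch_axis = range(1, len(train_losses) + 1)
plt.plot(epoch_axis, train_losses, label='train log-loss')
plt.plot(epoch_axis, test_losses, label='test log-loss')
plt.xlabel('epoch')
plt.ylabel('log-loss')
plt.legend()
plt.title('Train vs test log-loss per epoch')
plt.show()
```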
🔗 Connect with me on 🤝 LinkedIn: https://linkedin.com/in/rohan-vailala-thoma
💼 Check out my other case study blogs 😎🤟🏻: https://medium.com/@rohanvailalathoma
