zel-kass/dslr

Data Science x Logistic Regression: A Magical Classifier

Introduction: A Muggle’s Mission at Hogwarts

In this project, inspired by the world of Harry Potter, we took on the role of data scientists tasked with building a machine learning model to replace the malfunctioning Sorting Hat. Using logistic regression, we had to classify students into the four Hogwarts houses based on their academic performance.

Objectives: Learning the Fundamentals of Logistic Regression

This assignment serves as an introduction to classification problems, focusing on:

  • Data Exploration – Understanding, analyzing, and cleaning datasets.
  • Data Visualization – Using histograms, scatter plots, and pair plots to identify patterns.
  • Logistic Regression – Implementing a multi-class classification model using the one-vs-all strategy.
  • Model Training & Prediction – Writing custom code to train a model using gradient descent and make predictions.
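
For reference, the one-vs-all strategy trains one binary classifier per house, and each classifier rests on the standard logistic regression formulas sketched below. The notation is an assumption made here rather than something taken from the repository: θ is the weight vector, m the number of students, x^(i) and y^(i) the i-th feature vector and its 0/1 label, and α the learning rate.

```latex
\begin{align}
h_\theta(x) &= \frac{1}{1 + e^{-\theta^{T} x}} \\
J(\theta) &= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\!\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta\!\left(x^{(i)}\right)\right) \right] \\
\frac{\partial J(\theta)}{\partial \theta_j} &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)} \\
\theta_j &\leftarrow \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}
\end{align}
```

Gradient descent repeats the last update for every weight until the cost J stops decreasing meaningfully.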

Core Tasks: Steps to Build the Sorting Algorithm

1. Data Analysis

Before building a model, understanding the dataset is crucial. We were required to:

  • Examine the dataset’s structure.
  • Compute basic statistical properties (count, mean, std, min, max) without using pre-built functions like Pandas’ describe().
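
As an illustration of the statistics listed above, here is a minimal sketch that computes them by hand for one column. The function name `describe_column` and the use of `None` for missing scores are assumptions for this example, not the repository's actual code. The standard deviation uses the sample formula (dividing by count - 1), which is what pandas' describe() reports.

```python
import math


def describe_column(values):
    """Return count, mean, std, min and max for one column.

    Hypothetical helper: missing scores are assumed to be None and are skipped.
    """
    data = [v for v in values if v is not None]
    count = 0
    total = 0.0
    lowest = highest = None
    # Single pass for count, sum, min and max, without built-in helpers.
    for v in data:
        count += 1
        total += v
        if lowest is None or v < lowest:
            lowest = v
        if highest is None or v > highest:
            highest = v
    if count == 0:
        return {"count": 0, "mean": None, "std": None, "min": None, "max": None}
    mean = total / count
    # Sample standard deviation (divide by count - 1); undefined for one value.
    std = None
    if count > 1:
        squared = 0.0
        for v in data:
            squared += (v - mean) ** 2
        std = math.sqrt(squared / (count - 1))
    return {"count": count, "mean": mean, "std": std, "min": lowest, "max": highest}


stats = describe_column([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0, None])
print(stats)
```

Running the helper once per numeric course column would give a table comparable to describe(), minus the quartiles.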

2. Data Visualization

Visualizing data helps with feature selection and anomaly detection. We had to produce:

  • Histograms to assess score distributions across the four houses.
  • Scatter plots to compare features against each other.
  • Pair plots to analyze relationships between multiple features.
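
To make the histogram step concrete, the sketch below computes the per-house bin counts that a histogram would display for one course. The function name `histogram_counts`, the equal-width binning, and the sample data are assumptions for illustration. The resulting counts can be handed to any plotting library.

```python
def histogram_counts(scores, houses, bins=5):
    """Count, per house, how many scores fall into each equal-width bin.

    Hypothetical helper: scores and houses are parallel lists, None means missing.
    """
    pairs = [(s, h) for s, h in zip(scores, houses) if s is not None]
    if not pairs:
        return {}
    lo = min(s for s, _ in pairs)
    hi = max(s for s, _ in pairs)
    # Fall back to a width of 1.0 when every score is identical.
    width = (hi - lo) / bins or 1.0
    counts = {}
    for s, h in pairs:
        # The maximum score would land one past the end, so clamp it.
        idx = min(int((s - lo) / width), bins - 1)
        counts.setdefault(h, [0] * bins)[idx] += 1
    return counts


scores = [0, 1, 2, 3, 4, 10]
houses = ["Gryffindor", "Gryffindor", "Slytherin", "Slytherin", "Gryffindor", "Slytherin"]
print(histogram_counts(scores, houses))
```

A course whose per-house counts look nearly identical carries little information for sorting, which is the kind of pattern the histograms are meant to expose.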

3. Logistic Regression

The heart of the project is implementing logistic regression for classification. We had to:

  • Train a one-vs-all classifier using gradient descent to optimize weights.
  • Create two programs:
      • logreg_train to train the model and store weights.
      • logreg_predict to classify new students and generate a prediction file.
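
The training and prediction steps above can be sketched as follows. This is a minimal pure-Python illustration, not the repository's logreg_train or logreg_predict: the function names, learning rate, epoch count, and toy data are all assumptions, and features are assumed to be already scaled to a small range.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_binary(X, y, lr=0.1, epochs=1000):
    """Batch gradient descent for one binary classifier (bias stored in w[0])."""
    m, n = len(X), len(X[0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi
            grad[0] += err
            for j in range(n):
                grad[j + 1] += err * xi[j]
        # Average the gradient over all samples, then step against it.
        w = [wj - lr * gj / m for wj, gj in zip(w, grad)]
    return w


def train_one_vs_all(X, labels, lr=0.1, epochs=1000):
    """Train one 'this house vs. every other house' classifier per label."""
    return {
        c: train_binary(X, [1.0 if l == c else 0.0 for l in labels], lr, epochs)
        for c in sorted(set(labels))
    }


def predict(weights, x):
    """Return the label whose classifier assigns the highest probability."""
    def score(w):
        return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return max(weights, key=lambda c: score(weights[c]))


# Toy data: three well-separated clusters standing in for three houses.
X = [[-2.0, -2.0], [-2.0, -1.0], [2.0, -2.0], [2.0, -1.0], [0.0, 2.0], [0.0, 3.0]]
labels = ["A", "A", "B", "B", "C", "C"]
weights = train_one_vs_all(X, labels)
print(predict(weights, [-2.5, -1.5]))
```

In a two-program layout like the one described, the `weights` dictionary is what the training program would save to disk and the prediction program would load.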

Bonus: Enhancing the Model

  • Implementing stochastic gradient descent or other optimization techniques.
  • Expanding statistical analysis with more descriptive features.
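
As a sketch of the first bonus item, stochastic gradient descent updates the weights after every single student instead of after a full pass over the dataset. The function name `sgd_train_binary`, the hyperparameters, and the toy data below are assumptions for illustration, not the repository's implementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sgd_train_binary(X, y, lr=0.1, epochs=200, seed=0):
    """Stochastic gradient descent for one binary classifier (bias in w[0])."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    n = len(X[0])
    w = [0.0] * (n + 1)
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], X[i]))
            err = sigmoid(z) - y[i]
            # Update immediately using this one sample's gradient.
            w[0] -= lr * err
            for j in range(n):
                w[j + 1] -= lr * err * X[i][j]
    return w


X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [0.0, 0.0, 1.0, 1.0]
w = sgd_train_binary(X, y)
print(w)
```

Each update is cheaper but noisier than a batch step, so the cost tends to drop quickly at first and then fluctuate around the minimum.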
