A collection of end-to-end Data Science and Machine Learning projects with explanations, datasets, notebooks, and deployments.

Sanhith30/Data-Science-And-ML-Projects


Data-Science-And-ML-Projects

"End-to-end Data Science and Machine Learning projects portfolio with Python, ML, Deep Learning, NLP, and deployment-ready solutions."

Data Science and ML Projects

Projects

A comprehensive data analysis and machine learning project focused on predicting passenger survival on the Titanic.

This project includes:

  • Data Exploration & Cleaning
  • Feature Engineering
  • Model Building & Evaluation
  • Visualizations & Insights

Algorithms Used:

  • Logistic Regression
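As a rough illustration of how the steps above might fit together, here is a minimal sketch of a cleaning, encoding, and logistic regression pipeline. The column names (`Pclass`, `Sex`, `Age`, `Fare`, `Survived`) follow the public Kaggle Titanic dataset, and the eight rows are made up for the example. The choice of features, the median imputation, and the scaling are assumptions and not necessarily what the notebook does.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny hand-made sample standing in for the real data (illustrative values only).
passengers = pd.DataFrame({
    "Pclass":   [3, 1, 3, 1, 3, 2, 1, 3],
    "Sex":      ["male", "female", "female", "female", "male", "male", "male", "female"],
    "Age":      [22.0, 38.0, 26.0, 35.0, None, 54.0, 40.0, 27.0],
    "Fare":     [7.25, 71.28, 7.92, 53.10, 8.05, 13.00, 51.86, 11.13],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1],
})

numeric_features = ["Pclass", "Age", "Fare"]
categorical_features = ["Sex"]

# Cleaning and feature engineering: fill missing ages with the median,
# scale numeric columns, and one-hot encode the categorical column.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = passengers.drop(columns="Survived")
y = passengers["Survived"]
model.fit(X, y)

predictions = model.predict(X)          # 0 = did not survive, 1 = survived
probabilities = model.predict_proba(X)  # one row per passenger, one column per class
```

Keeping preprocessing inside the pipeline means the same imputation and encoding are applied at prediction time, which avoids leaking statistics from held-out data during evaluation.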

Tools & Libraries:

Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Jupyter


2ND Project

A complete end-to-end Machine Learning project that detects whether a news article is real or fake. It is a text classification project that applies Natural Language Processing (NLP) and Machine Learning, and it covers the dataset, the model, and deployment.


This project includes:

  • Data Exploration & Cleaning
  • Text Preprocessing (tokenization, stopword removal, TF-IDF)
  • Model Building & Evaluation
  • Visualizations & Insights
  • Deployment with Streamlit
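The text preprocessing step listed above (tokenization, stopword removal, TF-IDF) can be sketched in plain Python as follows. The stopword set and the headlines are made up for the example, and the weighting is the textbook `tf * log(N / df)` form. Scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2 normalization by default, so its numbers will differ from this sketch.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption); NLTK's English list is much longer.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "in", "and", "on"}

def preprocess(text):
    """Lowercase the text, split it into alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Made-up headlines, for illustration only.
headlines = [
    "The president signs a new bill",
    "Aliens endorse the president in secret meeting",
    "New study on the economy is published",
]
vectors = tfidf(headlines)
```

Terms that appear in fewer documents receive a larger weight: in the first headline, "signs" outweighs "president" because "president" also occurs in the second headline.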

Algorithms Used:

  • Logistic Regression
  • Passive Aggressive Classifier
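For readers unfamiliar with the Passive Aggressive Classifier, the sketch below shows its basic online update rule in plain Python: the weights stay unchanged when an example is classified with enough margin (passive) and are corrected just enough to fix the hinge loss otherwise (aggressive). The two-feature vectors are hypothetical stand-ins for TF-IDF features, and this is the unregularized variant, whereas scikit-learn's `PassiveAggressiveClassifier` also applies a step-size cap `C`.

```python
def dot(w, x):
    """Dot product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pa_fit(samples, labels, epochs=5):
    """Train a linear classifier with the basic Passive-Aggressive update.

    Labels must be +1 or -1.
    """
    w = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss on this example
            norm_sq = dot(x, x)
            if loss > 0.0 and norm_sq > 0.0:
                # Aggressive step: smallest change to w that removes the loss.
                tau = loss / norm_sq
                w = [wi + tau * y * xi for wi, xi in zip(w, x)]
            # Otherwise passive: the margin is already satisfied, keep w as is.
    return w

def pa_predict(w, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if dot(w, x) >= 0 else -1

# Hypothetical features: [sensational-word score, sourced-claim score].
samples = [[3, 0], [2, 1], [0, 3], [1, 2]]
labels = [1, 1, -1, -1]  # +1 = fake, -1 = real (toy labels)

weights = pa_fit(samples, labels)
predicted = [pa_predict(weights, x) for x in samples]
```

Because each example is processed one at a time, this family of algorithms suits streams of text where new articles keep arriving.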

Tools & Libraries: Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, NLTK, Streamlit, Jupyter


Deployment:

The model is deployed using Streamlit and can be run locally or hosted on platforms like Streamlit Cloud.

3RD Project

A machine learning-powered web application that predicts house rental prices in Hyderabad, India.
Built with Streamlit and Random Forest Regression, this app estimates rental prices from property characteristics.


This project includes:

  • Data Cleaning & Preprocessing
  • Feature Engineering
  • Model Building & Evaluation
  • Cross-Validation & Model Performance Metrics
  • Interactive Web App with Streamlit
  • Data Visualizations & Insights

Algorithms Used:

  • Random Forest
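The model building and cross-validation steps above might look roughly like the following sketch. The data is synthetic and the feature names (`area_sqft`, `bedrooms`, `furnished`) are hypothetical placeholders, since the actual columns of the Hyderabad dataset are not listed here. The hyperparameters are scikit-learn defaults apart from the fixed random seed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the rental data (hypothetical features and prices).
rng = np.random.default_rng(42)
n_samples = 200
area_sqft = rng.uniform(400, 2500, n_samples)
bedrooms = rng.integers(1, 5, n_samples)   # 1 to 4 bedrooms
furnished = rng.integers(0, 2, n_samples)  # 0 = unfurnished, 1 = furnished
rent = (
    8 * area_sqft
    + 3000 * bedrooms
    + 4000 * furnished
    + rng.normal(0, 1500, n_samples)       # noise
)

X = np.column_stack([area_sqft, bedrooms, furnished])

model = RandomForestRegressor(n_estimators=100, random_state=42)

# 5-fold cross-validation, scored with R^2 on each held-out fold.
scores = cross_val_score(model, X, rent, cv=5, scoring="r2")
mean_r2 = scores.mean()

# Fit on all data and estimate rent for one hypothetical property.
model.fit(X, rent)
estimate = model.predict([[1200, 2, 1]])[0]
```

Reporting the spread of the per-fold scores alongside the mean gives a more honest picture of model performance than a single train/test split.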

Tools & Libraries:

Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Plotly, Streamlit


A Natural Language Processing (NLP) and Machine Learning project focused on predicting whether two Quora questions are semantically similar (duplicates) or not.
The project leverages feature engineering, vectorization techniques, and multiple ML algorithms for evaluation.

This project includes:

  • Data Exploration & Cleaning
  • Advanced Feature Engineering (NLP features)
  • Text Preprocessing (stemming, stopword removal, contractions expansion)
  • Vectorization (TF-IDF, TF-IDF-weighted Word2Vec with GloVe)
  • Model Training & Evaluation
  • Visualizations (Word Clouds, t-SNE, PCA)
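To make the "NLP features" item above more concrete, here is a small sketch of hand-crafted features for a question pair. The feature names (`common_words`, `word_share`, `jaccard`, `len_diff`) and the stopword set are illustrative assumptions, not the project's actual feature set.

```python
import re

# Small illustrative stopword list (an assumption, not NLTK's full list).
STOPWORDS = {"how", "do", "i", "what", "is", "the", "to", "a", "an", "of"}

def tokens(question):
    """Lowercase a question and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())

def pair_features(q1, q2):
    """Compute simple overlap features for a pair of questions."""
    words1 = {t for t in tokens(q1) if t not in STOPWORDS}
    words2 = {t for t in tokens(q2) if t not in STOPWORDS}
    common = words1 & words2
    union = words1 | words2
    total = len(words1) + len(words2)
    return {
        # Number of non-stopword tokens the two questions share.
        "common_words": len(common),
        # Shared tokens relative to the combined vocabulary sizes.
        "word_share": len(common) / total if total else 0.0,
        # Jaccard similarity of the two token sets.
        "jaccard": len(common) / len(union) if union else 0.0,
        # Absolute difference in raw token counts.
        "len_diff": abs(len(tokens(q1)) - len(tokens(q2))),
    }

features = pair_features(
    "How do I learn Python quickly?",
    "What is the fastest way to learn Python?",
)
```

Features like these can be concatenated with the vectorized text before training, so that tree-based models such as XGBoost or Random Forest can use them directly.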

Algorithms Used:

  • Logistic Regression
  • SVM
  • XGBoost
  • Decision Tree
  • Random Forest

Tools & Libraries:

Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, NLTK, XGBoost, Jupyter


A Machine Learning classification project to predict whether a given sonar signal corresponds to a Rock or a Mine (Metal Cylinder).
This project leverages multiple ML models, feature visualization, and evaluation techniques.


This project includes:

  • Data Exploration & Cleaning
  • Class Distribution Visualization
  • Feature Distribution Analysis
  • Dimensionality Reduction (PCA)
  • Model Building & Comparison
  • Confusion Matrix, ROC, Precision-Recall Curves
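The evaluation items in the list above (confusion matrix, precision and recall, ROC curve) can be computed by hand as in the sketch below. The labels `"M"` (mine) and `"R"` (rock), the scores, and the 0.5 threshold are made-up example values, and the code assumes both classes are present. In practice `sklearn.metrics` provides equivalent functions.

```python
def confusion_counts(y_true, y_pred, positive="M"):
    """Return (tp, fp, fn, tn), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp), recall = tp / (tp + fn)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def roc_points(y_true, scores, positive="M"):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == positive)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t != positive)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Made-up labels and predicted probabilities of "mine" for six signals.
y_true = ["M", "M", "R", "M", "R", "R"]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
y_pred = ["M" if s >= 0.5 else "R" for s in scores]

counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(counts[0], counts[1], counts[2])
curve = roc_points(y_true, scores)
```

For a mine detector, recall on the mine class is usually the number to watch, since a missed mine is costlier than a false alarm.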

Algorithms Used:

  • Logistic Regression
  • SVM
  • KNN
  • Gradient Boosting
  • MLP Neural Net


Tools & Libraries:

Python, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, XGBoost, Jupyter Notebook


Deployment:

The project is structured for local Jupyter Notebook execution, but can be extended for deployment using Streamlit or Flask if needed.
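As a rough sketch of what the model comparison in this project might look like inside a notebook, the snippet below cross-validates the five listed algorithm families on the same data. It uses a synthetic dataset from `make_classification` as a stand-in for the sonar signals, and all hyperparameters are assumptions chosen to keep the example short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the sonar dataset.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=8, random_state=0
)

# Scale-sensitive models get a StandardScaler in front of them.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "MLP Neural Net": make_pipeline(
        StandardScaler(), MLPClassifier(max_iter=2000, random_state=0)
    ),
}

# Mean 5-fold cross-validated accuracy for each model.
results = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
best_model = max(results, key=results.get)
```

Evaluating every model on the same folds keeps the comparison fair, and the fitted winner could later be wrapped in a Streamlit or Flask app.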


📊 Repo Insights:

Stars, Forks, Issues, License, Last Commit, Repo Size, Languages
