octave-ati/Portfolio

🎓 Portfolio

Welcome to my portfolio, where I list the projects I have completed outside of work.

Don't hesitate to contact me on LinkedIn if you want additional information regarding some of the projects.

🐍 Python

Current level: Advanced

Skills: Machine Learning, Deep Learning, Data analysis, Data cleaning, Data visualization, Natural Language Processing (NLP), Computer Vision (CV).

💻 Machine Learning

| Project Name | Area | Description | Libraries / Packages |
| --- | --- | --- | --- |
| ClassifyOps - a MLOps Text Classification Project | MLOps, CI/CD, Orchestration, DataOps, Machine Learning, Natural Language Processing | Deployment of a text classification model using modern MLOps practices. | Airflow, Airbyte, Feast, BigQuery, DBT, DVC, Pre-commit, nltk, scikit-learn, great-expectations, fastAPI, MLFlow, Optuna |
| Designing a Flight Booking Chatbot | Machine Learning, Natural Language Processing, AI | Development of a responsive chatbot using the Microsoft Bot Framework SDK. | Microsoft BotFramework, Azure LUIS, Azure Monitor, Opencensus, pandas, numpy, matplotlib |
| Image Segmentation Model | Machine Learning, Computer Vision, Image Segmentation | Design of an image segmentation API, deployed with Flask, as part of a self-driving vehicle. | Tensorflow, Keras, Flask, Scikit-Image, Albumentations, OpenCV, squeezenet, pandas, numpy, matplotlib |
| Yelp reviews Image Classification and Sentiment Analysis | Image Classification, Natural Language Processing, Topic Identification, Clustering | Classifying Yelp review images and finding topics of dissatisfaction among negative comments. | Tensorflow, Keras, Scikit-Learn, Vader Sentiment, BERTopic, HDBScan, Spacy, UMAP-learn, OpenCV, category encoders, plotly, pandas, numpy, matplotlib |
| Borrower Scoring Algorithm | Credit Scoring, Machine Learning, Data Preprocessing, Class Imbalance, Hyperparameter Optimization | Creation of a scoring algorithm for a credit company from a dirty and imbalanced dataset. | Scikit-learn, Boruta-python, Scikit-Optimize, Imbalanced-learn, Category-encoders, pandas, numpy, matplotlib, seaborn |
| Twitter Sentiment Analysis | Natural Language Processing, Deep Learning, Sentiment Analysis | Development of three tiers of sentiment analysis tools to classify Twitter posts. | Tensorflow, Keras, BERT, Azure Machine Learning Studio, Scikit-learn, spacy, pandas, numpy, matplotlib |
| E-Commerce Customer Segmentation | Clustering, Hyperparameter Optimization, Data Cleaning | Segmenting a customer database to identify actionable characteristics that can be used in a marketing campaign. | Scikit-learn, geopy, clusteval, yellowbrick, pandas, numpy, matplotlib, seaborn |
| Building an Article Recommendation Application | Recommendation model, Collaborative Filtering, Machine Learning | Creating a recommendation model deployed on a smartphone application for a content creation website. | Tensorflow, Keras, Scikit-learn, Surprise, UMAP-learn, pandas, numpy, matplotlib, seaborn |
| Detecting Counterfeit Notes | Machine Learning, Logistic Regression | Using a database of genuine and fake bank notes to build an algorithm that detects counterfeit notes. | pandas, numpy, scikit-learn, seaborn, scipy, statsmodels, matplotlib |
| Predicting a Child's Income | Machine Learning, Stochastic Gradient Descent, Linear Regression | In this project based on scientific research, I used World Bank data to build a model that predicts a child's income from the parent's income and country-level data (mean income and Gini index). | pandas, sklearn, seaborn, scipy, statsmodels, matplotlib, tqdm |
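
To illustrate the technique named for the Detecting Counterfeit Notes project, here is a minimal sketch of logistic regression trained by batch gradient descent. It is not the project's code: the data is synthetic, the two standardized features (`length` and `margin`) are hypothetical, and plain Python stands in for the scikit-learn and statsmodels tooling listed in the table.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_notes(n_per_class=100, seed=0):
    """Generate made-up, already standardized measurements.

    Label 0 = genuine, label 1 = counterfeit. Both features
    (length, margin) are hypothetical and not taken from the project.
    """
    rng = random.Random(seed)
    centers = {0: (1.0, -1.0), 1: (-1.0, 1.0)}
    X, y = [], []
    for label, (c_length, c_margin) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(c_length, 0.5), rng.gauss(c_margin, 0.5)])
            y.append(label)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log-loss w.r.t. the linear score is (p - y).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (counterfeit) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


if __name__ == "__main__":
    X, y = make_synthetic_notes()
    w, b = fit_logistic(X, y)
    accuracy = sum(predict(w, b, x) == t for x, t in zip(X, y)) / len(y)
    print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here is measured on the training data itself, so it only shows that the optimizer separates the two synthetic clusters. A real evaluation would hold out a test set.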

📊 Data Analysis

| Project Name | Area | Description | Libraries / Packages |
| --- | --- | --- | --- |
| Analyzing Food Data | Data Cleaning, Data Analysis, Data Visualization | Analyzing open source food data to identify health trends. | pandas, numpy, plotly, scipy, seaborn, matplotlib |
| Paris Tree Data Analysis | Data Analysis, Data Visualization | Analyzing Paris tree data to identify trends about trees. | pandas, numpy, scikit-learn, scipy, seaborn, matplotlib |
| Analyzing Sales Data (link to notebook) | Data Analysis, Data Visualization | Analyzing company sales data to highlight areas for improvement and identify potential new markets. | pandas, numpy, sklearn, seaborn, scipy, matplotlib |
| Conducting a Market Analysis (link to notebook) | Hierarchical Clustering, Principal Component Analysis (PCA) | Using data gathered from the FAO website and clustering methods to identify candidate countries for expanding a chicken export company. | pandas, numpy, scikit-learn, seaborn, scipy, statsmodels, matplotlib |
| Conducting a Public Health Study (link to notebook) | Data Analysis, Data Visualization | In this project, I used FAO data to analyze the widespread malnutrition epidemic and communicated my analysis in a PowerPoint presentation. | pandas, numpy, matplotlib |

💾 SQL

Current level: Advanced

Databases used: MySQL, PostgreSQL, BigQuery, Apache.

Disclaimer: Most of my SQL experience comes from my work as a Data Analyst for the French Navy. For security reasons, I cannot disclose the code for these projects.

| Project Name | Description | SQL Level |
| --- | --- | --- |
| Cyclistic Bike Share Analysis | This is my version of the capstone project for Google's Data Analytics Course, where I use open source data to provide insights about a bike sharing company. SQL was used to extract the most relevant data from a 2GB dataset for further analysis with R. | Intermediate |
| Improving a Media Strategy | This was the first project of my Bachelor's Degree. It showcases my ability to write beginner-level SQL queries, and its goal is to increase the audience of a media company. | Beginner |
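
As an illustration of the kind of extraction described for the Cyclistic project, the sketch below filters out invalid rows and aggregates a trip table down to a small summary that could then be analyzed elsewhere. It is a hedged example, not the project's query: the `trips` table, its columns, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the real 2GB dataset.

```python
import sqlite3

# Hypothetical query: drop trips with non-positive durations,
# then summarize the remaining trips per rider type.
SUMMARY_QUERY = """
SELECT rider_type,
       COUNT(*)          AS n_rides,
       AVG(duration_min) AS avg_duration_min
FROM trips
WHERE duration_min > 0
GROUP BY rider_type
ORDER BY rider_type
"""


def build_demo_db():
    """Create an in-memory database with a few made-up trips."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE trips (
            ride_id       TEXT PRIMARY KEY,
            rider_type    TEXT,
            start_station TEXT,
            duration_min  REAL
        )
        """
    )
    rows = [
        ("r1", "member", "Station A", 12.0),
        ("r2", "casual", "Station A", 35.0),
        ("r3", "member", "Station B", 8.0),
        ("r4", "casual", "Station B", 41.0),
        ("r5", "member", "Station A", 10.0),
        ("r6", "casual", "Station A", -3.0),  # invalid duration, filtered out
    ]
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?, ?)", rows)
    return conn


def summarize(conn):
    """Run the summary query and return a list of result tuples."""
    return conn.execute(SUMMARY_QUERY).fetchall()


if __name__ == "__main__":
    for rider_type, n_rides, avg_duration in summarize(build_demo_db()):
        print(rider_type, n_rides, avg_duration)
```

Doing the filtering and aggregation in SQL keeps the export small, so the downstream analysis tool only has to load the summary rather than every raw trip.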

💹 Tableau

Level: Intermediate

| Project Name (Link to Tableau Public) | Description |
| --- | --- |
| Dental Pharma KPI Dashboard (link to Tableau Public dashboard) | The goal of this project was to create a dashboard for a pharmaceutical company displaying 3 KPIs: costs, delays and deliverables delivered. One of the requirements was interactive thresholds that allow the leaders of the company to identify the countries not meeting the KPI thresholds. |
| Cyclistic Bike Share Analysis Viz (link to Tableau Public dashboard) | This is my version of the capstone project for Google's Data Analytics Course, where I use open source data to provide insights about a bike sharing company. Tableau was used during this project to produce heatmaps of the most frequently used stations. |

R

Level: Beginner

Packages used: Tidyverse (dplyr, ggplot2), Seaborn, bigrquery

| Project Name | Area | Description | Packages |
| --- | --- | --- | --- |
| Cyclistic Bike Share Analysis | Data Analysis, Data visualization | This is my version of the capstone project for Google's Data Analytics Course, where I use open source data to provide insights about a bike sharing company. R was used to perform data analysis during this project. | tidyverse, bigrquery, wk, dplyr, seaborn, ggplot2 |

About

This repository stores my Portfolio as a Data Scientist, Machine Learning Engineer and Data Analyst.
