Rishi625/README.md

Rushikesh Ghatage

Machine Learning Engineer • MLOps • NLP • LLM Systems

LinkedIn Email GitHub


About Me

Currently working as a Machine Learning Engineer at Ada IQ, Inc. — building production-grade ML/NLP pipelines, from entity resolution over 7M+ records to RL-guided knowledge graphs and multi-modal evaluation models deployed on SageMaker.

  • Architecting end-to-end ML systems — data pipelines, model training, serving, and monitoring
  • Experienced with LLM fine-tuning, agentic AI, knowledge graphs, and multi-modal deep learning
  • Strong focus on productionizing ML: Docker, Kubernetes, Terraform, CI/CD, AWS infrastructure

Work Experience

Ada IQ, Inc. — Machine Learning Engineer Co-op | Boston, MA | Jan 2025 – Ongoing

Built and productionized the end-to-end ML pipeline for a Fortune 500 apparel client.

  • Productionized entity resolution (Glue/PySpark; adaptive salting + trigram/Jaccard) over 7M+ entities; reduced duplicates by 20%
  • Architected an RL-guided knowledge graph (Neo4j) for product concept generation, using PPO with a graph-traversal agent
  • Built multi-modal evaluation model (ResNet-50 + DistilBERT) with self-attention fusion; deployed on SageMaker
  • Automated infrastructure with Terraform + GitHub Actions orchestrating AWS Glue, Lambda, and Step Functions
  • Productionized BERTopic with Bayesian HPO; shipped SQL-ready topic features in Athena joined with ABSA + sentiment
  • Built dual-mode ABSA pipeline (guided + unguided) using Gemini with Function Calling on 4M+ comments
  • Identified critical performance drivers from 1M+ multi-channel comments using ABSA and statistical tests
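
To illustrate the trigram/Jaccard matching named in the entity resolution bullet, here is a minimal, standard-library sketch of character-trigram Jaccard similarity between two entity names. The function names, the padding scheme, and the normalization are assumptions made for illustration; the production Glue/PySpark job and its adaptive salting are not shown here.

```python
def trigrams(text: str) -> set:
    """Return the set of character trigrams of a lowercased, whitespace-normalized string."""
    normalized = " ".join(text.lower().split())
    # Pad so that short strings and word boundaries still produce trigrams
    # (an illustrative choice, not necessarily what the real pipeline does).
    padded = f"  {normalized} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|, defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def trigram_similarity(left: str, right: str) -> float:
    """Similarity in [0, 1] between two names based on shared character trigrams."""
    return jaccard(trigrams(left), trigrams(right))
```

In a deduplication setting, pairs whose similarity exceeds a tuned threshold would typically be treated as candidate duplicates; the threshold itself depends on the data.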

Quantiphi Inc. — QA Automation Engineer | Mumbai, India | Oct 2022 – Aug 2023

  • Automated data extraction/validation in Python/SQL for Workday rollout; built KPI dashboards improving efficiency by 25%

Infosys Ltd. — System Engineer (SDET) | Pune, India | Nov 2020 – Oct 2022

  • Developed end-to-end test automation with Java, Selenium, Cucumber (BDD); reduced test cycle time by 40%

Tech Stack

Languages

Python SQL PySpark NoSQL

ML / Deep Learning / NLP

PyTorch HuggingFace TensorFlow scikit-learn FAISS

LLMs & AI Agents

Gemini LangChain vLLM LoRA LangFlow

MLOps & Cloud

AWS GCP Docker Kubernetes Terraform MLflow GitHub Actions

Data & Backend

Neo4j Elasticsearch FastAPI Streamlit Prometheus Tableau


Featured Projects

LLM-Finetune-Pipeline


Production ML pipeline for Llama 3.2 fine-tuning with LoRA/QLoRA, FastAPI + vLLM serving, Docker, Kubernetes, MLflow tracking, and CI/CD. End-to-end from data preprocessing to deployed inference.

PyTorch PEFT vLLM FastAPI Docker K8s MLflow
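
As a rough illustration of the idea behind LoRA, the sketch below computes the effective weight `W + (alpha / r) * B @ A` with plain Python lists. It is not code from the repository: the helper names are hypothetical, and a real pipeline would use PyTorch and PEFT rather than nested lists.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A).

    w: frozen base weight, shape (out, in)
    a: trainable down-projection, shape (r, in)
    b: trainable up-projection, shape (out, r)
    """
    r = len(a)  # the LoRA rank is the number of rows of A
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# With B initialized to zeros, the adapted layer starts out identical to the base layer.
base = [[1.0, 0.0], [0.0, 1.0]]
unchanged = lora_effective_weight(base, a=[[1.0, 2.0]], b=[[0.0], [0.0]], alpha=2.0)
```

Only `A` and `B` are trained, which is `r * (in + out)` parameters instead of `in * out`; the saving becomes significant when `r` is much smaller than the layer dimensions.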

AIAgents — Multi-Agent Code Fixer


Multi-agent AI system using Gemini Flash API that autonomously diagnoses and fixes code bugs through an iterative observe-plan-act-verify loop with Planner and Reviewer agents.

Gemini API Agentic AI Multi-Agent JSON Contracts Git Checkpointing
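
The observe-plan-act-verify loop can be sketched roughly as follows. This is a hypothetical skeleton, not the repository's code: the `fix_loop` signature, the `Attempt` record, and the toy stand-in functions are all assumptions, and the real system would call Gemini-backed Planner and Reviewer agents where the stubs are.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    iteration: int
    plan: str
    passed: bool


def fix_loop(
    source: str,
    observe: Callable[[str], str],
    plan: Callable[[str, str, List[Attempt]], Tuple[str, str]],
    act: Callable[[str, Tuple[str, str]], str],
    verify: Callable[[str], bool],
    max_iters: int = 3,
) -> Tuple[str, List[Attempt]]:
    """Iterate observe -> plan -> act -> verify until verification passes or the budget runs out."""
    history: List[Attempt] = []
    for i in range(1, max_iters + 1):
        report = observe(source)              # observe: collect failure information
        edit = plan(source, report, history)  # plan: propose an (old, new) text edit
        candidate = act(source, edit)         # act: apply the edit to a copy
        passed = verify(candidate)            # verify: accept or reject the candidate
        history.append(Attempt(i, f"{edit[0]!r} -> {edit[1]!r}", passed))
        if passed:
            return candidate, history
        # On failure the original source is kept, similar in spirit to a checkpoint rollback.
    return source, history


# Toy stand-ins for the agents (illustrative only).
def run_check(src: str) -> bool:
    namespace: dict = {}
    exec(src, namespace)
    return namespace["add"](2, 3) == 5


def toy_observe(src: str) -> str:
    return "tests pass" if run_check(src) else "add(2, 3) != 5"


def toy_plan(src: str, report: str, history: List[Attempt]) -> Tuple[str, str]:
    return ("a - b", "a + b")


def toy_act(src: str, edit: Tuple[str, str]) -> str:
    return src.replace(edit[0], edit[1])


fixed, log = fix_loop("def add(a, b):\n    return a - b\n", toy_observe, toy_plan, toy_act, run_check)
```

Passing the attempt history to the planner lets a real planner avoid repeating an edit that the reviewer already rejected.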

MCP ML Monitor


MCP-based AI agent for real-time ML model monitoring & drift detection (KS-Test, PSI, Wasserstein) with LangFlow visual pipeline, LLM root cause analysis, and K8s integration.

MCP Drift Detection LangFlow Prometheus Kubernetes
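
Of the drift statistics listed above, the Population Stability Index (PSI) is simple enough to sketch with the standard library. The binning strategy (equal-width bins over the baseline range), the `eps` floor, and the function name are assumptions made for illustration and may differ from what the project implements.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline sample and a new sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clip out-of-range values into the edge bins
            counts[idx] += 1
        # Floor each fraction at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((a_i - e_i) * math.log(a_i / e_i) for e_i, a_i in zip(e, a))


baseline = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in baseline]
no_drift = psi(baseline, baseline)
drift = psi(baseline, shifted)
```

A common rule of thumb treats a PSI above roughly 0.2 as a meaningful shift, though appropriate thresholds depend on the feature and the bin count.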

VentureFit — Company Similarity


Company similarity framework using FAISS, Sentence Transformers, TF-IDF, and BM25, with Gemini-powered summaries; deployed via FastAPI + Streamlit.

FAISS BERT BM25 FastAPI Streamlit NLP
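
As an illustration of the lexical side of this kind of similarity search, the sketch below scores documents against a query with the standard Okapi BM25 formula. It uses whitespace tokenization and the common defaults `k1=1.5` and `b=0.75`, all of which are assumptions; the project may rely on a library implementation and combines BM25 with embedding-based retrieval.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query using Okapi BM25."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n or 1.0
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


companies = [
    "solar energy storage startup",
    "online grocery delivery platform",
    "battery storage for solar farms",
]
scores = bm25_scores("solar storage", companies)
```

Documents that share no query terms score zero, and among matching documents the length normalization favors shorter ones when term frequencies are equal.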

CAIRO — Market Hypothesis Validator


Automated go-to-market (GTM) validation pipeline that uses an LLM framework to generate personas and rank investors. Won 2nd place at the MIT AGI House × AI21 Labs hackathon.

LLMs Web Scraping NLP Hackathon Winner

Carbon Credit Trading Platform


Full-stack Carbon Credit Exchange built from scratch — MySQL/MongoDB database design, 11+ advanced SQL query patterns, interactive Streamlit dashboard with Plotly analytics.

MySQL SQL MongoDB Streamlit Plotly Database Design


Education

🎓 Northeastern University, Boston, MA — MS, Data Analytics Engineering

Machine Learning, NLP, MLOps, Deep Learning, Data Mining

🎓 NMIMS (Mukesh Patel School), Mumbai, India — BTech, Computer Engineering

Data Structures, Algorithms, AI, Databases


Open to opportunities in ML Engineering, Data Science, MLOps, NLP, and LLM Infrastructure roles.
📍 Boston, MA  |  📧 ghatage.r@northeastern.edu

Pinned

  1. gibran96/course-registration-chatbot (Public)

    An LLM-based chatbot that helps students decide which courses to register for.

    Jupyter Notebook · 4 stars · 2 forks

  2. sundai-club/text-video-edit (Public)

    TypeScript · 1 star · 1 fork

  3. VentureFit-Company-Similarity-Thesis-Fit (Public)

    A comprehensive system for analyzing and comparing document similarity using multiple approaches including TF-IDF, BM25, and BERT embeddings. The system includes both a FastAPI backend service and …

    Python

  4. AIAgents (Public)

    Multi-agent AI system using Gemini Flash API that autonomously diagnoses and fixes code bugs through an iterative observe-plan-act-verify loop with Planner and Reviewer agents, structured JSON con…

    Python

  5. mcp-ml-monitor (Public)

    MCP Agent for ML Model Monitoring & Drift Detection: statistical drift analysis (KS-Test, PSI, Wasserstein), performance tracking, alerting system, LangFlow visual pipeline, and integrations with …

    Python

  6. LLM-Finetune-Pipeline (Public)

    Production-grade ML pipeline for Llama 3.2 fine-tuning with LoRA/QLoRA, FastAPI serving, Docker, Kubernetes, MLflow tracking, and CI/CD

    Python