Aaryan2304/README.md

Hi there, I'm Aaryan Kurade 👋


Detection • Segmentation • Tracking • Model Optimization • Deployment

LinkedIn Portfolio Email Medium Twitter


💼 About Me

Computer Vision Engineer building production ML systems across detection, segmentation, tracking, and geospatial applications. I specialize in deploying optimized CV pipelines (YOLO, RT-DETR, SAM) for sports analytics, satellite imagery, and industrial inspection with real-time inference constraints.

Core Expertise:

  • 🎯 Multi-object tracking (ByteTrack, DeepSORT)
  • πŸš€ Model optimization (ONNX, INT8/INT4 quantization, TensorRT)
  • ⚑ Real-time inference on edge devices
  • πŸ› οΈ Scalable deployment (FastAPI + Docker + Prometheus)
  • πŸ“Š Data annotation (1000+ images via Roboflow, CVAT, QGIS)

πŸ”§ Tech Stack

Languages & Frameworks

Computer Vision & Deep Learning

MLOps & Deployment

Data Annotation & Geospatial


🚀 Featured Projects

Production-ready tracking and analytics across volleyball, football, and basketball

Key Achievements:

  • ⚑ 100 FPS ball detection on CPU (Intel i7) using custom ONNX seq-9 model
  • 🎯 87.3% MOTA player tracking with ByteTrack + Kalman filtering
  • 🎨 Zero-shot team classification via SigLIP embeddings + KMeans
  • πŸ“Š Real-time inference (30-100 FPS) with hybrid CPU/GPU architecture

Tech: YOLO RT-DETR ByteTrack SigLIP ONNX Homography FastAPI


Semantic fashion search across 100K DeepFashion images

Key Achievements:

  • 🚀 <100ms P95 latency on RTX 3050 GPU
  • 🗄️ FAISS vector index for 512-dim CLIP embeddings
  • 💾 60% memory efficiency via PyTorch optimizations
  • ⚡ Dynamic batching (8-32 images) with async FastAPI backend

Tech: CLIP FAISS FastAPI Streamlit Docker PyTorch
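
To illustrate what embedding-based retrieval computes, here is a minimal sketch of brute-force cosine-similarity search in plain Python. It is not the project's code and does not use FAISS; the item names and 4-dimensional vectors are hypothetical stand-ins for real 512-dim CLIP embeddings, and a vector index would replace the linear scan with a faster lookup.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query, index, k=3):
    """Return the k (item_id, score) pairs most similar to the query."""
    q = normalize(query)
    scored = []
    for item_id, vec in index.items():
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((score, item_id))
    scored.sort(reverse=True)  # highest cosine similarity first
    return [(item_id, score) for score, item_id in scored[:k]]


# Hypothetical toy "embeddings"; real CLIP vectors would have 512 dimensions.
toy_index = {
    "red_dress": [0.9, 0.1, 0.0, 0.1],
    "blue_jeans": [0.0, 0.8, 0.6, 0.0],
    "crimson_gown": [0.8, 0.2, 0.1, 0.1],
}

results = top_k([1.0, 0.1, 0.0, 0.1], toy_index, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

The query vector points mostly along the first dimension, so the two items with similar direction rank ahead of the unrelated one.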


Multi-agent GeoAI system for natural language satellite imagery analysis

Key Features:

  • 🤖 Multi-agent orchestration using LangGraph (Phi-3-mini, Moondream VLM, SAM3)
  • 🛰️ Google Earth Engine integration (10K+ datasets) with 24-hour caching
  • 🗺️ ChromaDB vector search for similar region discovery
  • ⚡ Target: <5s latency on RTX 3050 (1.5GB VRAM)

Tech: LangGraph Moondream SAM3 ChromaDB Google Earth Engine Folium


Real-time anomaly detection for surveillance systems

Key Achievements:

  • 🎯 92.47% precision | 83.78% recall | 0.7438 AUC (UCSD Ped2)
  • πŸ”„ Non-blocking pipeline with FastAPI + ThreadPoolExecutor
  • πŸ“Š Prometheus monitoring for GPU utilization and inference latency
  • πŸš€ Deployed on Render with Streamlit dashboard

Tech: Autoencoder FastAPI Prometheus Docker Streamlit


💼 Professional Experience

🏢 Arakoo.ai | Software Engineer Intern | Aug 2025 - Oct 2025

Speech AI & LLM Optimization

  • Built real-time ASR pipeline with Voice Activity Detection, reducing false triggers by 40%
  • Deployed an async FastAPI speaker diarization service handling 50+ concurrent audio streams
  • Implemented prompt caching strategies cutting LLM inference costs by $0.02/minute

🏒 Utopia Optovision Pvt. Ltd. | Machine Learning Intern | Jan 2024 - Jan 2025

Industrial Computer Vision

  • Developed YOLOv8 + PaddleOCR pipeline achieving 15% accuracy improvement and 61% relative CER reduction (18% → 7%)
  • Benchmarked custom CNNs, R-CNN, and VLMs, selecting YOLO+OCR hybrid for <100ms latency
  • Optimized inference for resource-constrained CCTV hardware via ONNX export
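
For readers unfamiliar with the metric, the sketch below shows how character error rate (CER) is commonly defined (edit distance divided by reference length) and how an 18% to 7% drop works out to roughly a 61% relative reduction. This is an illustrative example, not the pipeline's evaluation code; the sample strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(before, after):
    """Fraction by which a metric dropped relative to its starting value."""
    if before <= 0:
        raise ValueError("'before' must be positive")
    return (before - after) / before


# Hypothetical OCR output with one substituted character out of ten.
print(cer("MH12AB1234", "MH12A81234"))           # 0.1

# The figures quoted above: CER going from 18% to 7%.
print(f"{relative_reduction(0.18, 0.07):.1%}")   # 61.1%
```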

🎓 Education & Certifications

🎓 MIT World Peace University | B.Tech, Electronics & Communication Engineering - AI/ML | 2021 - 2025

📜 Certifications:

  • ✅ AI Agents Fundamentals - HuggingFace
  • ✅ Google Cloud Computing Foundations - NPTEL
  • ✅ Computer Vision Bootcamp - OpenCV

🌟 Open Source Contributions

Active contributor to:

  • Roboflow - YOLO implementations and optimizations
  • HuggingFace - Model documentation and transformers
  • Computer vision libraries and tools


📫 Let's Connect

Building production CV systems | Open to collaboration on sports analytics & geospatial AI projects

Portfolio Email

Pinned

  1. visual-search-engine (Public)

    An AI-powered visual search engine that finds visually similar fashion items from a dataset of over 100,000 images with sub-100ms latency. The system uses CLIP embeddings for semantic understanding…

    Python 2

  2. sports-ai (Public)

    Computer vision and sports

    1

  3. cctv-video-anomaly-detection (Public)

    A production-grade, deep-learning-based anomaly detection system for CCTV surveillance footage. This project uses a PyTorch-based Convolutional Autoencoder to achieve high precision in identifying …

    Python 5

  4. Text-Extraction-Project (Public)

    Extract text from images using YOLO and PaddleOCR

    Jupyter Notebook 1

  5. geospatial-ai (Public)

  6. Aaryan2304 (Public)