nakulBageja/Gemini_Agentic_Shopping


DealLens AI 🛍️

Talk to your shopping assistant like never before! DealLens AI is a real-time conversational shopping companion powered by the Gemini 2.0 Flash Live API.

Think "Alexa for shopping" but smarter – speak naturally about products you're considering, and get instant voice responses with visual deal comparisons. DealLens transforms how you discover better prices and make purchasing decisions.


✨ Key Features

🎀 Natural Voice Conversations: Low-latency, interruptible shopping discussions
πŸ›’ Real-time Price Discovery: "I see AirPods for Β£249 at Apple Store" β†’ Get instant alternatives
πŸ‘οΈ Visual Deal Cards: See price comparisons and savings at a glance
πŸ”Š Seamless Audio: Crystal clear responses with no breaking or distortion
↩️ Smart Interruptions: Change your mind mid-conversation, just like talking to a human
πŸ“± Clean UI: Icon-only interface with visual state indicators


Problem

When shopping in-store, users often wonder if they can find the same product cheaper elsewhere. Manually checking websites is time-consuming and breaks the shopping experience.


Solution

  • Users speak to the agent:

    "I see a PS5 for £500 at Sony store. Can you check for cheaper options?"

  • The agent parses intent and searches deals.
  • Agent responds via voice and visual cards:

    "Amazon has it for £469 and Argos for £479. You could save £31."
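
To make the flow above concrete, here is a minimal sketch of the "search deals, then report the saving" step. It is not the project's actual code: the record fields (`product`, `retailer`, `price`) and the helper names `find_cheaper_deals` and `summarize` are assumptions made for illustration, and the sample data simply mirrors the PS5 example.

```python
import json

# Hypothetical sample data; the real deal database schema is not shown in this README.
SAMPLE_DEALS_JSON = """
[
  {"product": "PS5", "retailer": "Amazon", "price": 469.0},
  {"product": "PS5", "retailer": "Argos", "price": 479.0},
  {"product": "PS5", "retailer": "Sony store", "price": 500.0}
]
"""


def find_cheaper_deals(deals, product, seen_price):
    """Return deals for `product` priced below `seen_price`, cheapest first."""
    matches = [
        d for d in deals
        if d["product"].lower() == product.lower() and d["price"] < seen_price
    ]
    return sorted(matches, key=lambda d: d["price"])


def summarize(deals, product, seen_price):
    """Build a spoken-style summary of cheaper offers and the best saving."""
    cheaper = find_cheaper_deals(deals, product, seen_price)
    if not cheaper:
        return f"No cheaper {product} deals found."
    first, rest = cheaper[0], cheaper[1:]
    sentence = f"{first['retailer']} has it for £{first['price']:.0f}"
    sentence += "".join(f" and {d['retailer']} for £{d['price']:.0f}" for d in rest)
    saving = seen_price - first["price"]
    return f"{sentence}. You could save £{saving:.0f}."


if __name__ == "__main__":
    deals = json.loads(SAMPLE_DEALS_JSON)
    print(summarize(deals, "PS5", 500.0))
```

The saving is computed against the cheapest match only, which is one reasonable reading of "You could save £31" in the example.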

Planned enhancements include camera-based product recognition.


🔧 Technical Architecture

Technologies: Python FastAPI, WebSocket, Gemini Live API, Web Audio API, Vanilla JS
Audio Processing: Unified 24kHz sample rate, seamless chunk scheduling, persistent microphone streams
Data Storage: JSON-based deal database for rapid prototyping
Category: Live Agents with multimodal Voice + Visual output
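
The "seamless chunk scheduling" idea can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: `ChunkScheduler` and its fields are hypothetical names, and the logic is written in Python for consistency with the backend even though playback itself happens in the browser. There, the analogous values would be `AudioContext.currentTime` and the `when` argument of `AudioBufferSourceNode.start()`.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 24_000  # unified sample rate stated in this README


@dataclass
class ChunkScheduler:
    """Tracks when the next audio chunk should start so playback has no gaps."""

    next_start: float = 0.0  # seconds on the playback clock

    def schedule(self, num_samples: int, now: float) -> float:
        """Return the start time for a chunk containing `num_samples` samples."""
        # Never schedule in the past; otherwise queue right after the previous chunk.
        start = max(now, self.next_start)
        self.next_start = start + num_samples / SAMPLE_RATE_HZ
        return start


if __name__ == "__main__":
    scheduler = ChunkScheduler()
    # Three 0.2 s chunks (4800 samples each) arriving faster than they play.
    for arrival in (0.0, 0.05, 0.1):
        print(scheduler.schedule(4800, now=arrival))
```

Because each chunk starts exactly where the previous one ends, chunks that arrive early are queued back to back, and a chunk that arrives late (after an underrun) simply starts at the current time.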

Architecture Overview

Figure: DealLens AI Architecture. Complete system architecture showing real-time voice processing, Gemini Live API integration, and deal search capabilities.

Recent Technical Achievements:

  • 🎵 Voice Breaking Eliminated: Seamless audio scheduling prevents gaps between response chunks
  • ⚡ Faster Response: 3-second silence detection (down from 10 s, a 70% shorter wait)
  • 🎤 No Permission Re-requests: Persistent microphone streams improve UX
  • 📊 Web Audio API Compliant: Power-of-2 buffer sizes (8192) for optimal performance
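
As an illustration of the 3-second silence detection mentioned above, here is a small sketch that measures chunk loudness and reports when silence has lasted long enough. It assumes 16-bit little-endian mono PCM at 24 kHz. The `RMS_THRESHOLD` value and the `SilenceDetector` class are hypothetical, and this README does not state whether the real detection runs in the browser or on the backend.

```python
import math
import struct

SAMPLE_RATE_HZ = 24_000   # unified sample rate stated in this README
SILENCE_TIMEOUT_S = 3.0   # silence window stated in this README
RMS_THRESHOLD = 500.0     # hypothetical loudness threshold for 16-bit samples


def rms_pcm16(chunk: bytes) -> float:
    """Root-mean-square amplitude of little-endian signed 16-bit mono PCM."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


class SilenceDetector:
    """Accumulates silent time and resets whenever a loud chunk arrives."""

    def __init__(self, timeout_s=SILENCE_TIMEOUT_S, threshold=RMS_THRESHOLD):
        self.timeout_s = timeout_s
        self.threshold = threshold
        self.silent_s = 0.0

    def feed(self, chunk: bytes) -> bool:
        """Feed one chunk; return True once silence has lasted `timeout_s`."""
        duration_s = (len(chunk) // 2) / SAMPLE_RATE_HZ
        if rms_pcm16(chunk) < self.threshold:
            self.silent_s += duration_s
        else:
            self.silent_s = 0.0
        return self.silent_s >= self.timeout_s
```

A caller would feed each captured microphone chunk to `feed()` and treat a `True` result as the end of the user's turn.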

🚀 Try DealLens AI Now

🌐 Live Demo (Cloud Deployment)

Production Backend: https://deallens-backend-553067044467.us-central1.run.app

Quick Start:

  1. Download the frontend files and serve them locally
  2. The frontend automatically connects to the cloud backend
  3. Click the microphone button, grant permissions, and start talking!

The production backend runs on Google Cloud Run with enterprise-grade reliability.

💻 Local Development Setup

Step 1: Install Dependencies

cd app/backend
pip install -r requirements.txt

Step 2: Get Gemini API Key

  1. Visit Google AI Studio
  2. Create a free API key
  3. Create an app/backend/.env file containing:
GEMINI_API_KEY=your_key_here
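
For reference, here is a stdlib-only sketch of how a backend could read this key and fail fast when it is missing. The project itself may load the file differently (for example with a dotenv library); the helper names `load_env_file` and `get_gemini_api_key` are assumptions for illustration only.

```python
import os


def load_env_file(path: str) -> dict:
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(env_path: str = "app/backend/.env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY")
    if not key and os.path.exists(env_path):
        key = load_env_file(env_path).get("GEMINI_API_KEY")
    if not key or key == "your_key_here":
        raise RuntimeError("GEMINI_API_KEY is missing or still the placeholder")
    return key
```

Rejecting the `your_key_here` placeholder up front gives a clearer error than a failed API call later.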

Step 3: Start Local Backend

cd app/backend
python main.py

Step 4: Open Frontend

Open: http://localhost:8000/static/index.html

The frontend automatically detects whether to use the local or the cloud backend.
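
How the detection works is not described here, so the following is only a guess at one possible rule, written in Python for consistency with the backend (the real frontend is vanilla JS). It assumes the page uses the local backend only when it is served from `localhost:8000`, and the cloud backend otherwise. The `/ws` path and the function name `backend_ws_url` are hypothetical.

```python
CLOUD_BACKEND_HOST = "deallens-backend-553067044467.us-central1.run.app"  # from this README
LOCAL_BACKEND_HOST = "localhost:8000"  # from this README
WS_PATH = "/ws"  # hypothetical WebSocket path; the real route is not documented here


def backend_ws_url(page_host: str) -> str:
    """Pick a WebSocket URL from the host (hostname:port) that served the page."""
    if page_host in (LOCAL_BACKEND_HOST, "127.0.0.1:8000"):
        # Page is served by the local backend itself, so talk to it directly.
        return f"ws://{LOCAL_BACKEND_HOST}{WS_PATH}"
    # Any other origin (for example a separate static file server) uses the cloud backend.
    return f"wss://{CLOUD_BACKEND_HOST}{WS_PATH}"
```

Under this rule, frontend files served from any other local static server would still connect to the cloud backend, which matches the Quick Start description.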

Step 5: Test Connection

  1. Check "Connected!" status appears
  2. Click the microphone button
  3. Grant microphone permissions when prompted
  4. Say: "I found AirPods for £249, can you find cheaper?"

☁️ Cloud Deployment

The backend is deployed on Google Cloud Run for global accessibility:

  • URL: <>
  • Region: us-central1 (Iowa, USA)
  • Scaling: Auto-scaling 0-100 instances
  • Security: API keys in Secret Manager
  • Monitoring: Cloud Run metrics and logging

See Server Documentation for the full deployment guide.


📖 Documentation

Complete Technical Documentation


💬 Example Conversations

Price Comparison:

🗣️ "I found iPhone 15 Pro for £999 at Currys, can you find it cheaper?"
🤖 "Amazon has it for £949 and Very for £969. You could save £50 with Amazon!"

Product Discovery:

🗣️ "What's the best deal on gaming headsets under £100?"
🤖 "Great question! I found the SteelSeries Arctis 7 for £89 at Game, down from £159!"

Smart Interruptions:

🗣️ "Actually, I meant wireless headsets instead"
🤖 "Got it! For wireless, the Sony WH-1000XM4 is £279 at John Lewis..."


How It Works

  1. User speaks into the microphone on the web page
  2. Gemini Live API handles transcription & intent parsing in real-time
  3. Deal Search Tools query the product database for price comparisons
  4. User receives synchronized voice response + visual deal cards
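
Step 3 can be pictured as a small tool-dispatch layer on the backend. The sketch below is hypothetical throughout: the tool name `search_deals`, its parameters, the in-memory sample data, and `handle_tool_call` are all assumptions. The declaration dict only follows the general JSON-schema style commonly used for function calling; the exact declaration this project passes to the Gemini Live API is not shown here.

```python
import json

# Hypothetical tool declaration in a JSON-schema style.
SEARCH_DEALS_TOOL = {
    "name": "search_deals",
    "description": "Find cheaper offers for a product the shopper has seen.",
    "parameters": {
        "type": "object",
        "properties": {
            "product": {"type": "string"},
            "seen_price": {"type": "number"},
        },
        "required": ["product"],
    },
}

# Hypothetical sample data mirroring the iPhone example in this README.
SAMPLE_DEALS = [
    {"product": "iPhone 15 Pro", "retailer": "Amazon", "price": 949},
    {"product": "iPhone 15 Pro", "retailer": "Very", "price": 969},
    {"product": "iPhone 15 Pro", "retailer": "Currys", "price": 999},
]


def search_deals(product, seen_price=None):
    """Return matching deals (cheapest first) plus the best saving, if any."""
    matches = [d for d in SAMPLE_DEALS if d["product"].lower() == product.lower()]
    if seen_price is not None:
        matches = [d for d in matches if d["price"] < seen_price]
    matches.sort(key=lambda d: d["price"])
    best_saving = None
    if matches and seen_price is not None:
        best_saving = seen_price - matches[0]["price"]
    return {"product": product, "deals": matches, "best_saving": best_saving}


TOOL_HANDLERS = {"search_deals": search_deals}


def handle_tool_call(name, args):
    """Route a model-issued tool call to its Python handler."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    return handler(**args)


if __name__ == "__main__":
    result = handle_tool_call("search_deals", {"product": "iPhone 15 Pro", "seen_price": 999})
    print(json.dumps(result, indent=2))  # same payload could feed the visual deal cards
```

Returning a plain JSON-serializable dict means one result can serve both the model's spoken answer and the deal cards shown in the UI.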

❓ Troubleshooting

Audio Issues

  • No microphone access: Ensure HTTPS is used (or localhost). Check browser permissions in Settings.
  • Voice breaking/distortion: Clear browser cache and reload. Try Chrome/Edge for best compatibility.
  • Microphone not working: Test microphone with other apps. Check device isn't muted.

Connection Issues

  • "Disconnected" status: Ensure backend is running on localhost:8000. Check terminal for errors.
  • WebSocket errors: Try refreshing the page. Check firewall isn't blocking port 8000.
  • API errors: Verify Gemini API key is valid and has Live API access enabled.
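
A quick way to check the first two items is to test whether anything is listening on port 8000. This is a generic sketch using only the Python standard library; the function name `is_port_open` is an assumption, and it only confirms that a TCP connection succeeds, not that the WebSocket endpoint itself is healthy.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("backend reachable" if is_port_open() else "nothing listening on localhost:8000")
```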

Browser Compatibility

  • Recommended: Chrome, Edge (full Web Audio API support)
  • Limitations: Safari may have microphone permission issues
  • Mobile: Works best on mobile Chrome/Safari with HTTPS

Common Error Messages

  • "Microphone permission denied": Grant permissions in browser settings
  • "Not connected to backend": Start the Python backend server
  • "Audio processing failed": Check Web Audio API support in browser console

🛣️ Roadmap

Phase 1 ✅ Voice-first shopping assistant with seamless audio
Phase 2 🔄 Vision-enabled product recognition via camera
Phase 3 📋 Enhanced deal database with real-time pricing APIs
Phase 4 🤝 Multi-retailer integrations and purchase capabilities


Demo Link

https://youtu.be/pR8CzluWsj8


📜 License

This project is licensed under the MIT License. See the LICENSE file for details.

🀝 Contributing

DealLens AI is developed by Nakul Bageja to explore practical applications of conversational AI in shopping. Contributions, suggestions, and feedback are welcome via Issues or Pull Requests.

Disclaimer: Product prices and availability are for demonstration purposes only and use sample data. This is a proof of concept showcasing Gemini Live API capabilities, created during the Gemini Live Agent hackathon.


Ready to revolutionize your shopping experience? 🛍️ Start talking to DealLens AI today!

About

This project was made for the Gemini Live Agent Challenge.
