📊 Gemini Multimodal Visual QnA Chatbot App

An interactive Streamlit app powered by Google Gemini 1.5 Flash for visual question answering. Upload any image (e.g., charts, infographics, financial reports) and ask natural language questions to receive structured responses from Gemini.


🔗 Live App

👉 Launch App


🚀 Features

  • 📷 Upload image files (JPEG, PNG)
  • 💬 Ask any question about the image (e.g., "Summarize in 5 bullet points")
  • 🤖 Get AI-powered insights via Google Gemini
  • 🌐 Deployable on Streamlit Community Cloud
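
The upload-and-ask flow listed above comes down to sending one image and one question to a multimodal model. The following is a minimal sketch of that step, not the app's actual code. It assumes a model object exposing `generate_content` in the style of the `google-generativeai` SDK; `ask_about_image` and `ALLOWED_TYPES` are hypothetical names introduced for illustration.

```python
from typing import Any

# Hypothetical: MIME types matching the JPEG/PNG uploads listed above.
ALLOWED_TYPES = {"image/jpeg", "image/png"}


def ask_about_image(model: Any, image_bytes: bytes, mime_type: str, question: str) -> str:
    """Send one image plus one question to a Gemini-style model; return its text."""
    if mime_type not in ALLOWED_TYPES:
        raise ValueError(f"Unsupported image type: {mime_type}")
    question = question.strip()
    if not question:
        raise ValueError("Question must not be empty")
    # Assumption: the model accepts a list mixing plain text with an inline
    # image given as a dict holding "mime_type" and raw "data" bytes.
    parts = [question, {"mime_type": mime_type, "data": image_bytes}]
    response = model.generate_content(parts)
    return response.text
```

With the real SDK, `model` would typically be `genai.GenerativeModel("gemini-1.5-flash")`, and `image_bytes` / `mime_type` could come from the file returned by Streamlit's `st.file_uploader` (via `.getvalue()` and `.type`). Passing the model in as a parameter keeps the function easy to exercise with a stub.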

📸 App Demo

1. Uploading an Image

App UI

2. Gemini's Response

Response Demo


🛠️ How to Run Locally

1. Clone the repository

git clone https://github.com/your-username/Gemini-Multimodal-App.git
cd Gemini-Multimodal-App

2. Install dependencies

pip install -r requirements.txt

3. Run the app

streamlit run Gemini_visual_qna_app.py

4. API Key

This app uses the Gemini 1.5 Flash model from Google’s Gemini API.
To run the app yourself, store your API key in Streamlit Secrets (when running locally, in .streamlit/secrets.toml):

GEMINI_API_KEY = "your-google-api-key"

🔒 The key is accessed securely via st.secrets["GEMINI_API_KEY"].
No key input is required in the app UI.
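
As a rough illustration of the lookup described above, the sketch below reads the key from any mapping-like secrets object and fails with a readable message if it is missing or still set to the placeholder. The helper name `get_gemini_key` is hypothetical and not taken from the app's source.

```python
from typing import Any, Mapping

PLACEHOLDER = "your-google-api-key"  # the sample value shown above


def get_gemini_key(secrets: Mapping[str, Any], name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from a secrets mapping, or raise a clear error."""
    try:
        key = str(secrets[name]).strip()
    except KeyError:
        raise RuntimeError(f"{name} is not set in Streamlit Secrets") from None
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"{name} is empty or still the placeholder value")
    return key
```

Inside the app this could be used as `genai.configure(api_key=get_gemini_key(st.secrets))`, assuming `streamlit` is imported as `st` and `google.generativeai` as `genai`. Because the function only needs a mapping, a plain dict works for local experiments.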


📚 Technologies Used

  • Python
  • Streamlit
  • Google Gemini API (Gemini 1.5 Flash)


📄 License

MIT License © 2025 Sanjana Shah


👤 Author

Sanjana Shah
✨ Machine Learning & Generative AI Enthusiast
📫 Connect on LinkedIn | GitHub: @shahsanjanav


⭐ If you like this project, consider starring it on GitHub!