CodenesShuvankar/VisualQ-A

Visual Question Answering (VQA) Project

This project is a Visual Question Answering (VQA) system built with Streamlit and a pre-trained BLIP (Bootstrapping Language-Image Pre-training) model from Salesforce. Users upload an image, ask a question about it, and receive an AI-generated answer. The model combines its understanding of the image's visual content with the natural-language question to produce that answer.

Diagram:

Model Diagram

Steps in the Project Flow:

  1. User Interface (Streamlit Frontend): The user uploads an image and enters a question through a web interface created using Streamlit.
  2. Image Upload: Once the image is uploaded, it is displayed on the webpage and converted to RGB format using the Python Imaging Library (PIL).
  3. Question Input: The user inputs a question related to the uploaded image. For example, "What is in the image?" or "How many people are there?"
  4. Processor (Preprocessing): The uploaded image and the input question are passed into the BLIP processor, which tokenizes the text and converts the image into a format suitable for model inference.
  5. Model (BLIP for Question Answering): The BLIP model processes both the image and the question to generate a suitable response. If a GPU is available, the model runs on it for faster processing.
  6. Answer Generation: The model generates an answer to the question using its pre-trained language-image reasoning capabilities. It then decodes the result to provide a human-readable output.
  7. Answer Display: The generated answer is displayed back to the user on the Streamlit interface.
  8. Re-run Option: The user can choose to "Re-run" the app to upload a new image or ask a different question.
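
The inference part of the flow above (steps 2 and 4 to 6) can be sketched roughly as follows. This is a minimal sketch, not the repository's actual code: the function names `pick_device`, `load_blip`, and `answer_question` are hypothetical, and the checkpoint name `Salesforce/blip-vqa-base` is an assumption, since the README does not say which BLIP checkpoint is used. The heavy imports are deferred into `load_blip` so the rest of the sketch can be exercised without them.

```python
def pick_device(cuda_available: bool) -> str:
    """Step 5: prefer the GPU when one is available, otherwise use the CPU."""
    return "cuda" if cuda_available else "cpu"


def load_blip(model_name: str = "Salesforce/blip-vqa-base"):
    """Load the BLIP processor and VQA model.

    The checkpoint name is an assumption; swap in whichever one the app uses.
    """
    # Imported here so the helpers below work without torch/transformers.
    import torch
    from transformers import BlipForQuestionAnswering, BlipProcessor

    device = pick_device(torch.cuda.is_available())
    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForQuestionAnswering.from_pretrained(model_name).to(device)
    return processor, model, device


def answer_question(image, question, processor, model, device="cpu"):
    """Steps 2 and 4-6: convert to RGB, preprocess, generate, decode."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")

    # Step 2: normalise the uploaded image to RGB (a PIL image is assumed).
    rgb_image = image.convert("RGB")

    # Step 4: the processor tokenizes the question and turns the image
    # into tensors, which are then moved to the chosen device.
    inputs = processor(rgb_image, question, return_tensors="pt").to(device)

    # Steps 5-6: generate answer token ids and decode them into text.
    output_ids = model.generate(**inputs)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

In a Streamlit script, the file returned by `st.file_uploader` could be opened with `PIL.Image.open`, the question read with `st.text_input`, and the result of `answer_question` shown with `st.write`. Wrapping `load_blip` in `st.cache_resource` is one way to avoid reloading the model on every rerun.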

User Interface

Output
