My project involves creating a Visual Question Answering (VQA) system using Streamlit and a pre-trained BLIP (Bootstrapping Language-Image Pre-training) model from Salesforce. The system lets users upload an image, ask a question about it, and receive an AI-generated answer. Because the model jointly understands the image's visual content and the natural-language question, it can answer free-form queries about a picture.
- User Interface (Streamlit Frontend): The user uploads an image and enters a question through a web interface built with Streamlit (the full app wiring is sketched after this list).
- Image Upload: Once the image is uploaded, it is displayed on the page and converted to RGB using the Python Imaging Library (PIL/Pillow), since BLIP expects a three-channel image and uploads may be grayscale or RGBA.
- Question Input: The user inputs a question related to the uploaded image. For example, "What is in the image?" or "How many people are there?"
- Processor (Preprocessing): The uploaded image and the question are passed to the BLIP processor, which tokenizes the text and resizes and normalizes the image into the pixel tensors the model expects (see the inference sketch after this list).
- Model (BLIP for Question Answering): The BLIP model conditions on both the image and the question to generate an answer. If a GPU is available, the model is moved to it for faster inference.
- Answer Generation: The model autoregressively generates the answer as a sequence of token IDs, which the processor then decodes into a human-readable string.
- Answer Display: The generated answer is displayed back to the user on the Streamlit interface.
- Re-run Option: Because Streamlit re-executes the script on every widget interaction, the user can upload a new image or type a different question at any time; the built-in "Rerun" option restarts the app from scratch.
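The core inference step (preprocessing, generation, decoding) can be shown in a few lines. This is a minimal sketch assuming the `Salesforce/blip-vqa-base` checkpoint from the Hugging Face transformers library; the exact checkpoint name and the example file path are assumptions, and any BLIP VQA checkpoint works the same way:

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

MODEL_ID = "Salesforce/blip-vqa-base"  # assumed checkpoint; swap in your own

processor = BlipProcessor.from_pretrained(MODEL_ID)
model = BlipForQuestionAnswering.from_pretrained(MODEL_ID)

# Use the GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("example.jpg").convert("RGB")  # BLIP expects 3-channel RGB
question = "How many people are there?"

# The processor tokenizes the question and turns the image into pixel tensors.
inputs = processor(image, question, return_tensors="pt").to(device)

# generate() produces answer token IDs; the processor decodes them to text.
out = model.generate(**inputs)
answer = processor.decode(out[0], skip_special_tokens=True)
print(answer)
```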
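And here is one way the Streamlit front end could tie the steps together. Again a sketch rather than the project's exact code: the widget labels and layout are assumptions, and model loading is cached so it happens once per session rather than on every rerun:

```python
import streamlit as st
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

MODEL_ID = "Salesforce/blip-vqa-base"  # assumed checkpoint

@st.cache_resource  # cache across Streamlit reruns so the model loads only once
def load_blip():
    processor = BlipProcessor.from_pretrained(MODEL_ID)
    model = BlipForQuestionAnswering.from_pretrained(MODEL_ID)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return processor, model.to(device), device

processor, model, device = load_blip()

st.title("Visual Question Answering with BLIP")

uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
question = st.text_input("Ask a question about the image")

if uploaded is not None:
    # Display the upload and normalize it to RGB for the model.
    image = Image.open(uploaded).convert("RGB")
    st.image(image, caption="Uploaded image")

    if question:
        inputs = processor(image, question, return_tensors="pt").to(device)
        out = model.generate(**inputs)
        answer = processor.decode(out[0], skip_special_tokens=True)
        st.write(f"**Answer:** {answer}")
```

Since Streamlit re-executes this script on every interaction, the re-run behavior in the last step comes for free; `@st.cache_resource` keeps the expensive model load out of that loop.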