The `run_pipeline.py` script showcases how to use the Transformers pipeline API to run the visual question answering task on HPUs.
```bash
PT_HPU_LAZY_MODE=1 python3 run_pipeline.py \
    --model_name_or_path Salesforce/blip-vqa-capfilt-large \
    --image_path "https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg" \
    --question "how many dogs are in the picture?" \
    --use_hpu_graphs \
    --bf16
```

The `run_openclip_vqa.py` script can be used to run zero-shot image classification with OpenCLIP Hugging Face models.
The requirements for `run_openclip_vqa.py` can be installed with `openclip_requirements.txt` as follows:

```bash
pip install -r openclip_requirements.txt
```

By default, the script runs the sample outlined in the BiomedCLIP-PubMedBERT_256-vit_base_patch16_224 notebook. Other OpenCLIP models can also be run by specifying the model, the classifier labels, and the image URL(s), like so:
```bash
PT_HPU_LAZY_MODE=1 python run_openclip_vqa.py \
    --model_name_or_path laion/CLIP-ViT-g-14-laion2B-s12B-b42K \
    --labels "a dog" "a cat" \
    --image_path "http://images.cocodataset.org/val2017/000000039769.jpg" \
    --use_hpu_graphs \
    --bf16
```
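For intuition, the zero-shot classification that CLIP-style models perform boils down to embedding the image and each candidate label into a shared space, then taking a softmax over the cosine similarities. The sketch below illustrates only that scoring step; the random embeddings are placeholders standing in for the real encoder outputs, and the `logit_scale` value is an assumption (CLIP models learn this temperature during training).

```python
import numpy as np

rng = np.random.default_rng(0)
labels = ["a dog", "a cat"]

# Placeholder embeddings: in the real script these come from the OpenCLIP
# image and text encoders (an assumption for illustration only).
image_emb = rng.normal(size=512)
text_embs = rng.normal(size=(len(labels), 512))

# Normalize so the dot product below is a cosine similarity.
image_emb /= np.linalg.norm(image_emb)
text_embs /= np.linalg.norm(text_embs, axis=1, keepdims=True)

# CLIP-style models scale similarities by a learned temperature before the
# softmax; 100.0 here is just a typical stand-in value.
logit_scale = 100.0
logits = logit_scale * (text_embs @ image_emb)

# Softmax over the labels yields the zero-shot classification probabilities.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

for label, p in zip(labels, probs):
    print(f"{label}: {p:.3f}")
```

The label passed via `--labels` with the highest probability is the model's zero-shot prediction for the image.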