-
Notifications
You must be signed in to change notification settings - Fork 0
image captions using blip. #204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| COPY requirements.txt / | ||
| RUN pip install -U pip | ||
| RUN pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's cleaner to keep requirements in requirements files. See how this is done in https://github.com/aperture-data/workflows/blob/main/base/docker/scripts/embeddings/requirements_cpu.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks ! This is actually a TODO I care about. Actually, I was wondering on how to do this conditionally. Like if I want to specify CPU, cuda or metal at the top level and have images that optimizes for the architecture I am developing on. It'll be a major win. I still havent zeroed in, but as you can see this is WIP. Do you have any ideas for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look around and I don't see a good solution that just does the right thing on any architecture. Forcing CPU versions comes closest. To do better, I think we would have to build and deploy multiple versions of the docker image with the device as a build argument.
apps/caption-image/app/images.py
Outdated
| query = [{ | ||
| "FindImage": { | ||
| "constraints": { | ||
| self.done_property: ["==", None] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be slightly more robust to say ["!=", True].
| processor = AutoProcessor.from_pretrained("Salesforce/blip-image-captioning-base") | ||
| model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find any documentation that says these are thread-safe.
apps/caption-image/app/images.py
Outdated
| "UpdateImage": { | ||
| "ref": i + 1, | ||
| "properties": { | ||
| self.done_property: captions[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. So not really a done property then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea. should have changed after copying images from embedding extarction. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the properties name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we distinguish between "not yet tried to generate a caption" and "I tried but couldn't do it"?
apps/caption-image/Dockerfile
Outdated
| RUN pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu | ||
| RUN pip install --no-cache-dir -r /requirements.txt | ||
|
|
||
| COPY app/weights.py /app/weights.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is intended for cache warmup. It might be better to make that explicit in the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good. will cahnge.
| @@ -0,0 +1,11 @@ | |||
| import platform | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these files really go at the top level?
Adds auto generation of image captions using BLIP.
https://huggingface.co/docs/transformers/main/en/model_doc/blip#transformers.BlipForConditionalGeneration
TODO:
Add tests: Adding a validation at build time with a basic script.