CLIPper

CLIPper is a reimplementation of CLIPseg, optimized for multi-class segmentation. The existing implementations, both in the original CLIPseg repo and on Huggingface, require an image for each text input. If you want to segment multiple classes in the same image, the image has to be encoded again for every text input.

CLIPper fixes this by encoding the image only once, then encoding each text input and running the decoder once per text input. The image encoder accounts for the bulk of the inference time, so running it once gives large speedups as you add classes, as shown in the plot.
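
To make the difference concrete, here is a minimal sketch that counts encoder calls under both strategies. It is not the CLIPper or CLIPseg API: `ToySegmenter`, `segment_per_prompt`, and `segment_shared_image` are hypothetical names, and the "encoders" are trivial stand-ins so the example runs without any model weights.

```python
# Toy stand-ins only: none of these names come from CLIPper or CLIPseg.
from dataclasses import dataclass


@dataclass
class ToySegmenter:
    """Counts encoder calls so the two strategies can be compared."""
    image_encoder_calls: int = 0
    text_encoder_calls: int = 0

    def encode_image(self, image):
        # Stand-in for the expensive vision backbone.
        self.image_encoder_calls += 1
        return [float(p) for p in image]

    def encode_text(self, prompt):
        # Stand-in for the comparatively cheap text encoder.
        self.text_encoder_calls += 1
        return float(len(prompt))

    def decode(self, image_features, text_embedding):
        # Stand-in decoder: one "mask" per (image, prompt) pair.
        return [f * text_embedding for f in image_features]


def segment_per_prompt(model, image, prompts):
    """Baseline: the image is re-encoded for every prompt."""
    return [
        model.decode(model.encode_image(image), model.encode_text(p))
        for p in prompts
    ]


def segment_shared_image(model, image, prompts):
    """Encode the image once, then reuse its features for every prompt."""
    image_features = model.encode_image(image)
    return [
        model.decode(image_features, model.encode_text(p))
        for p in prompts
    ]


image = [0.1, 0.2, 0.3]
prompts = ["cat", "dog", "grass"]

baseline = ToySegmenter()
shared = ToySegmenter()
masks_baseline = segment_per_prompt(baseline, image, prompts)
masks_shared = segment_shared_image(shared, image, prompts)
```

Both functions produce the same masks. The only difference is that the baseline calls the image encoder once per prompt, while the shared version calls it once in total, so the saving grows with the number of classes.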

I do hope to commit this back to Huggingface, if they'll take it. I'm also working on a cpp implementation.

cpp

To build the model in C++, use:

cd clipper
cmake -S . -B build
cd build
cmake --build .
sudo cmake --install . --prefix /usr/local
