Can we adapt open_clip to be able to train text/text contrastive models?
And beyond that, maybe text/text/image models?
use cases:
- train pure contrastive text models, either on multilingual pairs or monolingual ones
- use similarity between text pairs, and between text and images, to get better text understanding while keeping good text/image understanding
options:
- a single tower shared by text1 and text2
- two separate towers
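Either option would feed the same loss: the symmetric InfoNCE objective CLIP already uses, just applied to paired text embeddings instead of image/text pairs. A minimal dependency-free sketch of that loss (a real patch would reuse open_clip's existing ClipLoss; the function and helper names here are hypothetical):

```python
# Sketch: symmetric (two-direction) InfoNCE over paired text embeddings.
# emb_a[i] and emb_b[i] are a positive pair; all other in-batch
# cross-combinations act as negatives. Pure Python for illustration only;
# in open_clip this would be done with torch tensors via ClipLoss.
import math

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def symmetric_infonce(emb_a, emb_b, temperature=0.07):
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    # cosine-similarity logits, scaled by temperature
    logits = [[sum(x * y for x, y in zip(u, w)) / temperature for w in b]
              for u in a]

    def cross_entropy_rows(mat):
        # mean cross-entropy where the target of row i is column i
        loss = 0.0
        for i, row in enumerate(mat):
            m = max(row)
            logsumexp = m + math.log(sum(math.exp(x - m) for x in row))
            loss += logsumexp - row[i]
        return loss / len(mat)

    # average the a->b and b->a directions, as CLIP does
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(transposed))
```

With a single tower, emb_a and emb_b would come from the same encoder; with two towers, from separate ones. The loss itself is unchanged either way, which is one argument that the feature need not complicate the training loop much.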
It would be nice to find a way to do this without making the code overly complicated.
It goes in the direction of supporting more modality combinations in open_clip.
A motivation is that, surprisingly, there are few good text/text models, even though the community around this is quite active.
a related idea is image/image contrastive training, as inspired by https://arxiv.org/abs/2212.08045
reference of private models to beat: