# Jina CLIP

This repository hosts the Jina CLIP implementation. The model uses:
* the EVA-02 architecture as the vision tower
* Jina BERT with Flash Attention as the text tower

To use the Jina CLIP model, the following packages are required:
* `torch`
* `timm`
* `transformers`
* `einops`
* `xformers` to use memory-efficient attention
* `flash-attn` to use flash attention
* `apex` to use fused layer normalization