Jina CLIP
Core implementation of Jina CLIP. The model uses (a loading sketch follows this list):
- the EVA-02 architecture for the vision tower
- the Jina XLM RoBERTa with Flash Attention model for the text tower
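For orientation, here is a minimal loading sketch using Hugging Face transformers. The checkpoint name jinaai/jina-clip-v1 and the encode_text/encode_image helpers are taken from the public model card and should be treated as assumptions, not guarantees of this repository's API; the image URL is a placeholder.

```python
# Minimal sketch, assuming the jinaai/jina-clip-v1 checkpoint;
# trust_remote_code=True pulls in this implementation.
from transformers import AutoModel

model = AutoModel.from_pretrained("jinaai/jina-clip-v1", trust_remote_code=True)

# encode_text / encode_image are the convenience helpers documented on the
# model card; each returns one embedding per input.
text_embeddings = model.encode_text(["a photo of a cat"])
image_embeddings = model.encode_image(["https://example.com/cat.jpg"])  # placeholder URL
```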
Models that use this implementation
Requirements
To use the Jina CLIP source code, the following packages are required:
- torch
- timm
- transformers
- einops
- xformers (to use x-attention)
- flash-attn (to use flash attention)
- apex (to use fused layer normalization)
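The first four packages are hard requirements (e.g. `pip install torch timm transformers einops`); the last three enable optional acceleration paths. Below is a minimal sketch, using only the standard library, for checking which optional backends are importable in the current environment (note that the flash-attn package installs the module flash_attn).

```python
# Sketch: probe which optional acceleration packages are importable.
# Maps module names to the feature they enable, per the list above.
import importlib.util

OPTIONAL_BACKENDS = {
    "xformers": "x-attention",
    "flash_attn": "flash attention",  # installed via the flash-attn package
    "apex": "fused layer normalization",
}

for module, feature in OPTIONAL_BACKENDS.items():
    status = "available" if importlib.util.find_spec(module) else "missing"
    print(f"{module}: {feature} -> {status}")
```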