visheratin committed on
Commit 87c2778
1 Parent(s): 0341e0d

Update README.md

Files changed (1): README.md +11 -1
README.md CHANGED
@@ -4,7 +4,17 @@ datasets:
 - visheratin/laion-coco-nllb
 ---

-The code to run the model:
+## Model Summary
+
+NLLB-CLIP is a model that combines a text encoder from the [NLLB model](https://huggingface.co/facebook/nllb-200-distilled-600M) and an image encoder from the
+standard [CLIP](https://huggingface.co/openai/clip-vit-base-patch32). This allows us to extend the model's capabilities
+to the 201 languages of Flores-200. NLLB-CLIP sets the state of the art on the [Crossmodal-3600](https://google.github.io/crossmodal-3600/) dataset by performing very
+well on low-resource languages. You can find more details about the model in the [paper](https://arxiv.org/abs/2309.01859).
+
+## How to use
+
+The model [repo](https://huggingface.co/visheratin/nllb-clip-base/tree/main) contains the model code files that allow NLLB-CLIP to be used like any other model from the hub.
+The interface is also compatible with CLIP models. Example code is below:

 ```
 from transformers import AutoTokenizer, CLIPProcessor
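
The diff hunk ends right after the first import, so the README's full example is not visible here. Below is a minimal, hypothetical sketch of CLIP-style usage under stated assumptions: that the repo's custom model code can be loaded with `trust_remote_code=True`, and that the forward pass mirrors CLIP's interface (`input_ids`, `attention_mask`, `pixel_values` in, `logits_per_image` out). The image URL and candidate captions are placeholders; the actual example in the commit may differ.

```
# Hypothetical sketch, not the commit's own example (which is cut off by the diff hunk).
# Assumes the repo's custom code is loadable via trust_remote_code=True and that the
# forward pass is CLIP-compatible; check the code files in the repo before relying on it.
import requests
from PIL import Image
from transformers import AutoModel, AutoTokenizer, CLIPProcessor

# Text side uses the NLLB tokenizer; image side reuses the standard CLIP processor.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model = AutoModel.from_pretrained("visheratin/nllb-clip-base", trust_remote_code=True)

# Placeholder image and candidate captions.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
image_inputs = processor(images=image, return_tensors="pt")
text_inputs = tokenizer(
    ["a photo of a cat", "a photo of a dog"],
    padding="longest",
    return_tensors="pt",
)

# CLIP-style forward pass: joint image/text embeddings and similarity logits.
outputs = model(
    input_ids=text_inputs.input_ids,
    attention_mask=text_inputs.attention_mask,
    pixel_values=image_inputs.pixel_values,
)
print(outputs.logits_per_image.softmax(dim=-1))  # per-image scores over the captions
```

For non-English captions, the NLLB tokenizer's `src_lang` attribute selects the language code (for example, `tokenizer.src_lang = "deu_Latn"`), which is what makes the Flores-200 language coverage usable from the text side.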