---
license: cc-by-nc-4.0
---

# Conditional ViT - B/16 - Text

*Introduced in **Weakly-Supervised Conditional Embedding for Referred Visual Search**, Lepage et al. 2023*

[`Paper`](https://arxiv.org/abs/2306.02928) | [`Training Data`](https://huggingface.co/datasets/Slep/LAION-RVS-Fashion) | [`Training Code`](https://github.com/Simon-Lepage/CondViT-LRVSF) | [`Demo`](https://huggingface.co/spaces/Slep/CondViT-LRVSF-Demo)

## General Information

This model is fine-tuned from CLIP ViT-B/16 on LAION-RVS-Fashion (LRVSF) at 224×224 resolution. The conditioning text is preprocessed by a frozen [Sentence T5-XL](https://huggingface.co/sentence-transformers/sentence-t5-xl).

Research use only.

## How to Use

```python
from PIL import Image
import requests
import torch
from transformers import AutoProcessor, AutoModel

# Load the model and its processor from the Hugging Face Hub
model = AutoModel.from_pretrained("Slep/CondViT-B16-txt")
processor = AutoProcessor.from_pretrained("Slep/CondViT-B16-txt")

# Example image and conditioning text
url = "https://huggingface.co/datasets/Slep/LAION-RVS-Fashion/resolve/main/assets/108856.0.jpg"
img = Image.open(requests.get(url, stream=True).raw)
txt = "a brown bag"

# The processor prepares both modalities; the model returns a raw embedding
inputs = processor(images=[img], texts=[txt])
raw_embedding = model(**inputs)

# L2-normalize the embedding before computing similarities
normalized_embedding = torch.nn.functional.normalize(raw_embedding, dim=-1)
```
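Once embeddings are L2-normalized as above, retrieval against a gallery reduces to a dot product (cosine similarity). A minimal sketch of that step, using plain Python and toy vectors as hypothetical stand-ins for real model outputs:

```python
import math

def normalize(v):
    # L2-normalize a vector, mirroring torch.nn.functional.normalize(..., dim=-1)
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_sim(a, b):
    # for L2-normalized vectors, cosine similarity is just the dot product
    return sum(x * y for x, y in zip(a, b))

# toy stand-ins for a query embedding and two gallery embeddings
query = normalize([0.2, 0.9, 0.1])
gallery = [normalize([0.1, 1.0, 0.0]), normalize([0.9, 0.1, 0.2])]

# rank gallery items by similarity to the query; the best match has the highest score
scores = [cosine_sim(query, g) for g in gallery]
best = max(range(len(gallery)), key=lambda i: scores[i])
```

With real embeddings, the same ranking is typically done in one batched matrix multiplication between the normalized query and gallery tensors.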