SanghyukChun committed
Commit 1d74372
1 Parent(s): f83b2bc

Update README.md

Files changed (1)
  1. README.md +44 -3
README.md CHANGED
@@ -4,6 +4,47 @@ tags:
  - model_hub_mixin
  ---

- This model has been pushed to the Hub using ****:
- - Repo: [More Information Needed]
- - Docs: [More Information Needed]
+ ### Official implementation of the PCME++ model pre-trained on CC3M, CC12M, and RedCaps
+
+ Zero-shot ImageNet-1k top-1 accuracy: 34.642% (slightly better than the 34.22% reported in the paper)
+
+ - Paper: https://openreview.net/forum?id=ft1mr3WlGM
+ - GitHub: https://github.com/naver-ai/pcmepp
+
+ ```python
+ import requests
+ from PIL import Image
+
+ import torch
+ from transformers import CLIPProcessor
+
+ # Check hf_models code here: https://github.com/naver-ai/pcmepp/tree/main/hf_models
+ from hf_models import HfPCMEPPModel, tokenize
+
+
+ processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")
+ model = HfPCMEPPModel.from_pretrained("SanghyukChun/PCMEPP-ViT-B-16-CC3M-12M-RedCaps")
+
+
+ url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+ image = Image.open(requests.get(url, stream=True).raw)
+ inputs = processor(images=image, return_tensors="pt", padding=True)
+ texts = ["a photo of a cat", "a photo of a dog"]
+ texts = tokenize(texts)
+
+ outputs = model(images=inputs["pixel_values"], texts=texts)
+ print("Logits:", outputs["image_features"] @ outputs["text_features"].T)
+ print("Image uncertainty: ", torch.exp(outputs["image_stds"]).mean(dim=-1))
+ print("Text uncertainty: ", torch.exp(outputs["text_stds"]).mean(dim=-1))
+ ```
+
+ ```
+ @inproceedings{
+ chun2024pcmepp,
+ title={Improved Probabilistic Image-Text Representations},
+ author={Sanghyuk Chun},
+ booktitle={The Twelfth International Conference on Learning Representations},
+ year={2024},
+ url={https://openreview.net/forum?id=ft1mr3WlGM}
+ }
+ ```
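
To make the zero-shot ImageNet-1k figure above concrete, here is a minimal zero-shot classification sketch. It is not part of the committed README: it reuses `model`, `inputs`, and `tokenize` from the usage snippet in the diff, the label set and the "a photo of a ..." prompt template are illustrative assumptions, and taking a softmax over raw feature inner products is a simplification of the full evaluation protocol.

```python
# Hedged sketch, not from the model card: reuses `model`, `inputs`, and
# `tokenize` from the README snippet above. The labels and prompt template
# are illustrative; the official ImageNet-1k evaluation uses its own prompts.
import torch

class_names = ["cat", "dog", "remote control"]  # hypothetical label set
prompts = tokenize([f"a photo of a {name}" for name in class_names])

with torch.no_grad():
    outputs = model(images=inputs["pixel_values"], texts=prompts)

# One row of image-to-text similarities; the highest-scoring prompt wins.
logits = outputs["image_features"] @ outputs["text_features"].T  # (1, num_classes)
probs = logits.softmax(dim=-1)
print("Predicted label:", class_names[probs.argmax(dim=-1).item()])
print("Probabilities:", probs.squeeze(0).tolist())
```

A full ImageNet-1k run would loop this over the validation set with prompts for all 1,000 classes and count top-1 matches; the uncertainty outputs are not used in this particular sketch.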
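
The last two print statements in the committed snippet reduce the model's `image_stds` / `text_stds` outputs to a single uncertainty value per input (exponentiate, then average over the embedding dimension). As a small, hedged illustration (the captions below are my own examples, not from the model card), one way to probe these values is to compare a specific caption against a deliberately vague one; the PCME++ paper motivates the uncertainty as a measure of input ambiguity, so vaguer captions would be expected to score higher, although the exact behaviour depends on the checkpoint.

```python
# Hedged illustration: compares text uncertainties for a specific vs. a vague
# caption, reusing `model`, `inputs`, and `tokenize` from the README snippet.
import torch

captions = ["a photo of a striped cat sleeping on a couch", "a photo"]
caption_tokens = tokenize(captions)

with torch.no_grad():
    out = model(images=inputs["pixel_values"], texts=caption_tokens)

# One scalar per caption, mirroring the README's uncertainty computation.
text_uncertainty = torch.exp(out["text_stds"]).mean(dim=-1)
for caption, unc in zip(captions, text_uncertainty.tolist()):
    print(f"{caption!r}: uncertainty = {unc:.4f}")
```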