Update README.md
README.md
CHANGED
@@ -63,11 +63,11 @@ from PIL import Image
63 |   text = 'a small red panda in a zoo'
64 |   image = Image.open('red_panda.jpg')
65 |
66 | - image_data =
67 | - text_data =
68 |
69 | - image_embedding = model.encode_image(image_data)
70 | - text_embedding = model.encode_text(text_data)
71 |   score, joint_embedding = model.encode_multimodal(
72 |       image_features=image_features,
73 |       text_features=text_features,
@@ -76,23 +76,6 @@ score, joint_embedding = model.encode_multimodal(
76 |   )
77 |   ```
78 |
79 | - To get features:
80 | -
81 | - ```python
82 | - image_features, image_embedding = model.encode_image(image_data, return_features=True)
83 | - text_features, text_embedding = model.encode_text(text_data, return_features=True)
84 | - ```
85 | -
86 | - These features can later be used to produce joint multimodal encodings faster, as the first layers of the transformer can be skipped:
87 | -
88 | - ```python
89 | - joint_embedding = model.encode_multimodal(
90 | -     image_features=image_features,
91 | -     text_features=text_features,
92 | -     attention_mask=text_data['attention_mask']
93 | - )
94 | - ```
95 | -
96 |   There are two options to calculate semantic compatibility between an image and a text: [Cosine Similarity](#cosine-similarity) and [Matching Score](#matching-score).
97 |
98 |   ### Cosine Similarity
@@ -63,11 +63,11 @@ from PIL import Image
63 |   text = 'a small red panda in a zoo'
64 |   image = Image.open('red_panda.jpg')
65 |
66 | + image_data = processor.preprocess_image(image)
67 | + text_data = processor.preprocess_text(text)
68 |
69 | + image_features, image_embedding = model.encode_image(image_data, return_features=True)
70 | + text_features, text_embedding = model.encode_text(text_data, return_features=True)
71 |   score, joint_embedding = model.encode_multimodal(
72 |       image_features=image_features,
73 |       text_features=text_features,
@@ -76,23 +76,6 @@ score, joint_embedding = model.encode_multimodal(
76 |   )
77 |   ```
78 |
79 |   There are two options to calculate semantic compatibility between an image and a text: [Cosine Similarity](#cosine-similarity) and [Matching Score](#matching-score).
80 |
81 |   ### Cosine Similarity
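The diff context ends at the Cosine Similarity heading without showing how the score is computed. As a rough sketch only, and not necessarily how the README does it, cosine similarity between an image embedding and a text embedding could look like the following. The function name `cosine_similarity` and the stand-in vectors are hypothetical, and the sketch assumes each embedding has already been converted to a flat sequence of floats.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for image_embedding and text_embedding;
# real embeddings would come from model.encode_image / model.encode_text.
image_vec = [1.0, 0.0, 1.0]
text_vec = [1.0, 0.0, 1.0]

same = cosine_similarity(image_vec, text_vec)           # identical direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # unrelated direction
print(same, orthogonal)
```

Values near 1 indicate that the two embeddings point in nearly the same direction, values near 0 indicate little alignment, and negative values indicate opposing directions.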