kimihailv committed
Commit 6eb861c
1 Parent(s): c785708

Update README.md

  1. README.md +4 -21
README.md CHANGED
@@ -63,11 +63,11 @@ from PIL import Image
 text = 'a small red panda in a zoo'
 image = Image.open('red_panda.jpg')
 
-image_data = model.preprocess_image(image)
-text_data = model.preprocess_text(text)
+image_data = processor.preprocess_image(image)
+text_data = processor.preprocess_text(text)
 
-image_embedding = model.encode_image(image_data)
-text_embedding = model.encode_text(text_data)
+image_features, image_embedding = model.encode_image(image_data, return_features=True)
+text_features, text_embedding = model.encode_text(text_data, return_features=True)
 score, joint_embedding = model.encode_multimodal(
     image_features=image_features,
     text_features=text_features,
@@ -76,23 +76,6 @@ score, joint_embedding = model.encode_multimodal(
 )
 ```
 
-To get features:
-
-```python
-image_features, image_embedding = model.encode_image(image_data, return_features=True)
-text_features, text_embedding = model.encode_text(text_data, return_features=True)
-```
-
-These features can later be used to produce joint multimodal encodings faster, as the first layers of the transformer can be skipped:
-
-```python
-joint_embedding = model.encode_multimodal(
-    image_features=image_features,
-    text_features=text_features,
-    attention_mask=text_data['attention_mask']
-)
-```
-
 There are two options to calculate semantic compatibility between an image and a text: [Cosine Similarity](#cosine-similarity) and [Matching Score](#matching-score).
 
 ### Cosine Similarity
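
For context on the Cosine Similarity option named in the retained README line, here is a minimal sketch of the computation. It assumes the embeddings returned by `encode_image` and `encode_text` can be treated as plain, equal-length float vectors, which the diff does not confirm. The helper `cosine_similarity` and the toy vectors are hypothetical stand-ins, not part of the library.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy values standing in for image_embedding and text_embedding.
image_embedding = [0.2, 0.1, 0.7]
text_embedding = [0.4, 0.2, 1.4]  # same direction, twice the length

similarity = cosine_similarity(image_embedding, text_embedding)
```

Because cosine similarity ignores vector length, the two toy vectors above score as maximally similar even though their magnitudes differ. With real tensors, a library routine such as a framework's built-in cosine similarity would normally replace this hand-written helper.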