jncraton committed on
Commit
c50031e
1 Parent(s): 2579575

Upload folder using huggingface_hub

Files changed (7)
  1. README.md +2702 -0
  2. config.json +7 -0
  3. model.bin +3 -0
  4. special_tokens_map.json +37 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +57 -0
  7. vocabulary.json +0 -0
README.md ADDED
@@ -0,0 +1,2702 @@
+ ---
+ language:
+ - en
+ library_name: sentence-transformers
+ license: mit
+ pipeline_tag: sentence-similarity
+ tags:
+ - feature-extraction
+ - mteb
+ - sentence-similarity
+ - sentence-transformers
+
+ model-index:
+ - name: GIST-small-Embedding-v0
+   results:
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_counterfactual
+       name: MTEB AmazonCounterfactualClassification (en)
+       config: en
+       split: test
+       revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+     metrics:
+     - type: accuracy
+       value: 75.26865671641791
+     - type: ap
+       value: 38.25623793370476
+     - type: f1
+       value: 69.26434651320257
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_polarity
+       name: MTEB AmazonPolarityClassification
+       config: default
+       split: test
+       revision: e2d317d38cd51312af73b3d32a06d1a08b442046
+     metrics:
+     - type: accuracy
+       value: 93.232225
+     - type: ap
+       value: 89.97936072879344
+     - type: f1
+       value: 93.22122653806187
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (en)
+       config: en
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 49.715999999999994
+     - type: f1
+       value: 49.169789920136076
+   - task:
+       type: Retrieval
+     dataset:
+       type: arguana
+       name: MTEB ArguAna
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 34.922
+     - type: map_at_10
+       value: 50.524
+     - type: map_at_100
+       value: 51.247
+     - type: map_at_1000
+       value: 51.249
+     - type: map_at_3
+       value: 45.887
+     - type: map_at_5
+       value: 48.592999999999996
+     - type: mrr_at_1
+       value: 34.922
+     - type: mrr_at_10
+       value: 50.382000000000005
+     - type: mrr_at_100
+       value: 51.104000000000006
+     - type: mrr_at_1000
+       value: 51.105999999999995
+     - type: mrr_at_3
+       value: 45.733000000000004
+     - type: mrr_at_5
+       value: 48.428
+     - type: ndcg_at_1
+       value: 34.922
+     - type: ndcg_at_10
+       value: 59.12
+     - type: ndcg_at_100
+       value: 62.083999999999996
+     - type: ndcg_at_1000
+       value: 62.137
+     - type: ndcg_at_3
+       value: 49.616
+     - type: ndcg_at_5
+       value: 54.501
+     - type: precision_at_1
+       value: 34.922
+     - type: precision_at_10
+       value: 8.649
+     - type: precision_at_100
+       value: 0.991
+     - type: precision_at_1000
+       value: 0.1
+     - type: precision_at_3
+       value: 20.152
+     - type: precision_at_5
+       value: 14.466999999999999
+     - type: recall_at_1
+       value: 34.922
+     - type: recall_at_10
+       value: 86.48599999999999
+     - type: recall_at_100
+       value: 99.14699999999999
+     - type: recall_at_1000
+       value: 99.57300000000001
+     - type: recall_at_3
+       value: 60.455000000000005
+     - type: recall_at_5
+       value: 72.333
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/arxiv-clustering-p2p
+       name: MTEB ArxivClusteringP2P
+       config: default
+       split: test
+       revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
+     metrics:
+     - type: v_measure
+       value: 47.623282347623714
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/arxiv-clustering-s2s
+       name: MTEB ArxivClusteringS2S
+       config: default
+       split: test
+       revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
+     metrics:
+     - type: v_measure
+       value: 39.86487843524932
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/askubuntudupquestions-reranking
+       name: MTEB AskUbuntuDupQuestions
+       config: default
+       split: test
+       revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
+     metrics:
+     - type: map
+       value: 62.3290291318171
+     - type: mrr
+       value: 75.2379853141626
+   - task:
+       type: STS
+     dataset:
+       type: mteb/biosses-sts
+       name: MTEB BIOSSES
+       config: default
+       split: test
+       revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
+     metrics:
+     - type: cos_sim_pearson
+       value: 88.52002953574285
+     - type: cos_sim_spearman
+       value: 86.98752423842483
+     - type: euclidean_pearson
+       value: 86.89442688314197
+     - type: euclidean_spearman
+       value: 86.88631711307471
+     - type: manhattan_pearson
+       value: 87.03723618507175
+     - type: manhattan_spearman
+       value: 86.76041062975224
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/banking77
+       name: MTEB Banking77Classification
+       config: default
+       split: test
+       revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
+     metrics:
+     - type: accuracy
+       value: 86.64935064935065
+     - type: f1
+       value: 86.61903824934998
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/biorxiv-clustering-p2p
+       name: MTEB BiorxivClusteringP2P
+       config: default
+       split: test
+       revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
+     metrics:
+     - type: v_measure
+       value: 39.21904455377494
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/biorxiv-clustering-s2s
+       name: MTEB BiorxivClusteringS2S
+       config: default
+       split: test
+       revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
+     metrics:
+     - type: v_measure
+       value: 35.43342755570654
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackAndroidRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 31.843
+     - type: map_at_10
+       value: 43.379
+     - type: map_at_100
+       value: 44.946999999999996
+     - type: map_at_1000
+       value: 45.078
+     - type: map_at_3
+       value: 39.598
+     - type: map_at_5
+       value: 41.746
+     - type: mrr_at_1
+       value: 39.199
+     - type: mrr_at_10
+       value: 49.672
+     - type: mrr_at_100
+       value: 50.321000000000005
+     - type: mrr_at_1000
+       value: 50.365
+     - type: mrr_at_3
+       value: 46.805
+     - type: mrr_at_5
+       value: 48.579
+     - type: ndcg_at_1
+       value: 39.199
+     - type: ndcg_at_10
+       value: 50.163999999999994
+     - type: ndcg_at_100
+       value: 55.418
+     - type: ndcg_at_1000
+       value: 57.353
+     - type: ndcg_at_3
+       value: 44.716
+     - type: ndcg_at_5
+       value: 47.268
+     - type: precision_at_1
+       value: 39.199
+     - type: precision_at_10
+       value: 9.757
+     - type: precision_at_100
+       value: 1.552
+     - type: precision_at_1000
+       value: 0.20500000000000002
+     - type: precision_at_3
+       value: 21.602
+     - type: precision_at_5
+       value: 15.479000000000001
+     - type: recall_at_1
+       value: 31.843
+     - type: recall_at_10
+       value: 62.743
+     - type: recall_at_100
+       value: 84.78099999999999
+     - type: recall_at_1000
+       value: 96.86099999999999
+     - type: recall_at_3
+       value: 46.927
+     - type: recall_at_5
+       value: 54.355
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackEnglishRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 29.321
+     - type: map_at_10
+       value: 39.062999999999995
+     - type: map_at_100
+       value: 40.403
+     - type: map_at_1000
+       value: 40.534
+     - type: map_at_3
+       value: 36.367
+     - type: map_at_5
+       value: 37.756
+     - type: mrr_at_1
+       value: 35.987
+     - type: mrr_at_10
+       value: 44.708999999999996
+     - type: mrr_at_100
+       value: 45.394
+     - type: mrr_at_1000
+       value: 45.436
+     - type: mrr_at_3
+       value: 42.463
+     - type: mrr_at_5
+       value: 43.663000000000004
+     - type: ndcg_at_1
+       value: 35.987
+     - type: ndcg_at_10
+       value: 44.585
+     - type: ndcg_at_100
+       value: 49.297999999999995
+     - type: ndcg_at_1000
+       value: 51.315
+     - type: ndcg_at_3
+       value: 40.569
+     - type: ndcg_at_5
+       value: 42.197
+     - type: precision_at_1
+       value: 35.987
+     - type: precision_at_10
+       value: 8.369
+     - type: precision_at_100
+       value: 1.366
+     - type: precision_at_1000
+       value: 0.184
+     - type: precision_at_3
+       value: 19.427
+     - type: precision_at_5
+       value: 13.58
+     - type: recall_at_1
+       value: 29.321
+     - type: recall_at_10
+       value: 54.333
+     - type: recall_at_100
+       value: 74.178
+     - type: recall_at_1000
+       value: 86.732
+     - type: recall_at_3
+       value: 42.46
+     - type: recall_at_5
+       value: 47.089999999999996
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackGamingRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 38.811
+     - type: map_at_10
+       value: 51.114000000000004
+     - type: map_at_100
+       value: 52.22
+     - type: map_at_1000
+       value: 52.275000000000006
+     - type: map_at_3
+       value: 47.644999999999996
+     - type: map_at_5
+       value: 49.675000000000004
+     - type: mrr_at_1
+       value: 44.389
+     - type: mrr_at_10
+       value: 54.459
+     - type: mrr_at_100
+       value: 55.208999999999996
+     - type: mrr_at_1000
+       value: 55.239000000000004
+     - type: mrr_at_3
+       value: 51.954
+     - type: mrr_at_5
+       value: 53.571999999999996
+     - type: ndcg_at_1
+       value: 44.389
+     - type: ndcg_at_10
+       value: 56.979
+     - type: ndcg_at_100
+       value: 61.266
+     - type: ndcg_at_1000
+       value: 62.315
+     - type: ndcg_at_3
+       value: 51.342
+     - type: ndcg_at_5
+       value: 54.33
+     - type: precision_at_1
+       value: 44.389
+     - type: precision_at_10
+       value: 9.26
+     - type: precision_at_100
+       value: 1.226
+     - type: precision_at_1000
+       value: 0.136
+     - type: precision_at_3
+       value: 22.926
+     - type: precision_at_5
+       value: 15.987000000000002
+     - type: recall_at_1
+       value: 38.811
+     - type: recall_at_10
+       value: 70.841
+     - type: recall_at_100
+       value: 89.218
+     - type: recall_at_1000
+       value: 96.482
+     - type: recall_at_3
+       value: 56.123999999999995
+     - type: recall_at_5
+       value: 63.322
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackGisRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 25.378
+     - type: map_at_10
+       value: 34.311
+     - type: map_at_100
+       value: 35.399
+     - type: map_at_1000
+       value: 35.482
+     - type: map_at_3
+       value: 31.917
+     - type: map_at_5
+       value: 33.275
+     - type: mrr_at_1
+       value: 27.683999999999997
+     - type: mrr_at_10
+       value: 36.575
+     - type: mrr_at_100
+       value: 37.492
+     - type: mrr_at_1000
+       value: 37.556
+     - type: mrr_at_3
+       value: 34.35
+     - type: mrr_at_5
+       value: 35.525
+     - type: ndcg_at_1
+       value: 27.683999999999997
+     - type: ndcg_at_10
+       value: 39.247
+     - type: ndcg_at_100
+       value: 44.424
+     - type: ndcg_at_1000
+       value: 46.478
+     - type: ndcg_at_3
+       value: 34.684
+     - type: ndcg_at_5
+       value: 36.886
+     - type: precision_at_1
+       value: 27.683999999999997
+     - type: precision_at_10
+       value: 5.989
+     - type: precision_at_100
+       value: 0.899
+     - type: precision_at_1000
+       value: 0.11199999999999999
+     - type: precision_at_3
+       value: 14.84
+     - type: precision_at_5
+       value: 10.215
+     - type: recall_at_1
+       value: 25.378
+     - type: recall_at_10
+       value: 52.195
+     - type: recall_at_100
+       value: 75.764
+     - type: recall_at_1000
+       value: 91.012
+     - type: recall_at_3
+       value: 39.885999999999996
+     - type: recall_at_5
+       value: 45.279
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackMathematicaRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 17.326
+     - type: map_at_10
+       value: 25.247000000000003
+     - type: map_at_100
+       value: 26.473000000000003
+     - type: map_at_1000
+       value: 26.579000000000004
+     - type: map_at_3
+       value: 22.466
+     - type: map_at_5
+       value: 24.113
+     - type: mrr_at_1
+       value: 21.393
+     - type: mrr_at_10
+       value: 30.187
+     - type: mrr_at_100
+       value: 31.089
+     - type: mrr_at_1000
+       value: 31.15
+     - type: mrr_at_3
+       value: 27.279999999999998
+     - type: mrr_at_5
+       value: 29.127
+     - type: ndcg_at_1
+       value: 21.393
+     - type: ndcg_at_10
+       value: 30.668
+     - type: ndcg_at_100
+       value: 36.543
+     - type: ndcg_at_1000
+       value: 39.181
+     - type: ndcg_at_3
+       value: 25.552000000000003
+     - type: ndcg_at_5
+       value: 28.176000000000002
+     - type: precision_at_1
+       value: 21.393
+     - type: precision_at_10
+       value: 5.784000000000001
+     - type: precision_at_100
+       value: 1.001
+     - type: precision_at_1000
+       value: 0.136
+     - type: precision_at_3
+       value: 12.231
+     - type: precision_at_5
+       value: 9.179
+     - type: recall_at_1
+       value: 17.326
+     - type: recall_at_10
+       value: 42.415000000000006
+     - type: recall_at_100
+       value: 68.605
+     - type: recall_at_1000
+       value: 87.694
+     - type: recall_at_3
+       value: 28.343
+     - type: recall_at_5
+       value: 35.086
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackPhysicsRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 29.069
+     - type: map_at_10
+       value: 40.027
+     - type: map_at_100
+       value: 41.308
+     - type: map_at_1000
+       value: 41.412
+     - type: map_at_3
+       value: 36.864000000000004
+     - type: map_at_5
+       value: 38.641999999999996
+     - type: mrr_at_1
+       value: 35.707
+     - type: mrr_at_10
+       value: 45.527
+     - type: mrr_at_100
+       value: 46.348
+     - type: mrr_at_1000
+       value: 46.392
+     - type: mrr_at_3
+       value: 43.086
+     - type: mrr_at_5
+       value: 44.645
+     - type: ndcg_at_1
+       value: 35.707
+     - type: ndcg_at_10
+       value: 46.117000000000004
+     - type: ndcg_at_100
+       value: 51.468
+     - type: ndcg_at_1000
+       value: 53.412000000000006
+     - type: ndcg_at_3
+       value: 41.224
+     - type: ndcg_at_5
+       value: 43.637
+     - type: precision_at_1
+       value: 35.707
+     - type: precision_at_10
+       value: 8.459999999999999
+     - type: precision_at_100
+       value: 1.2970000000000002
+     - type: precision_at_1000
+       value: 0.165
+     - type: precision_at_3
+       value: 19.731
+     - type: precision_at_5
+       value: 14.013
+     - type: recall_at_1
+       value: 29.069
+     - type: recall_at_10
+       value: 58.343999999999994
+     - type: recall_at_100
+       value: 81.296
+     - type: recall_at_1000
+       value: 93.974
+     - type: recall_at_3
+       value: 44.7
+     - type: recall_at_5
+       value: 50.88700000000001
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackProgrammersRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 23.905
+     - type: map_at_10
+       value: 33.983000000000004
+     - type: map_at_100
+       value: 35.372
+     - type: map_at_1000
+       value: 35.487
+     - type: map_at_3
+       value: 30.902
+     - type: map_at_5
+       value: 32.505
+     - type: mrr_at_1
+       value: 29.794999999999998
+     - type: mrr_at_10
+       value: 39.28
+     - type: mrr_at_100
+       value: 40.215
+     - type: mrr_at_1000
+       value: 40.276
+     - type: mrr_at_3
+       value: 36.701
+     - type: mrr_at_5
+       value: 38.105
+     - type: ndcg_at_1
+       value: 29.794999999999998
+     - type: ndcg_at_10
+       value: 40.041
+     - type: ndcg_at_100
+       value: 45.884
+     - type: ndcg_at_1000
+       value: 48.271
+     - type: ndcg_at_3
+       value: 34.931
+     - type: ndcg_at_5
+       value: 37.044
+     - type: precision_at_1
+       value: 29.794999999999998
+     - type: precision_at_10
+       value: 7.546
+     - type: precision_at_100
+       value: 1.216
+     - type: precision_at_1000
+       value: 0.158
+     - type: precision_at_3
+       value: 16.933
+     - type: precision_at_5
+       value: 12.1
+     - type: recall_at_1
+       value: 23.905
+     - type: recall_at_10
+       value: 52.945
+     - type: recall_at_100
+       value: 77.551
+     - type: recall_at_1000
+       value: 93.793
+     - type: recall_at_3
+       value: 38.364
+     - type: recall_at_5
+       value: 44.044
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 25.24441666666667
+     - type: map_at_10
+       value: 34.4595
+     - type: map_at_100
+       value: 35.699999999999996
+     - type: map_at_1000
+       value: 35.8155
+     - type: map_at_3
+       value: 31.608333333333338
+     - type: map_at_5
+       value: 33.189416666666666
+     - type: mrr_at_1
+       value: 29.825250000000004
+     - type: mrr_at_10
+       value: 38.60875
+     - type: mrr_at_100
+       value: 39.46575
+     - type: mrr_at_1000
+       value: 39.52458333333333
+     - type: mrr_at_3
+       value: 36.145166666666675
+     - type: mrr_at_5
+       value: 37.57625
+     - type: ndcg_at_1
+       value: 29.825250000000004
+     - type: ndcg_at_10
+       value: 39.88741666666667
+     - type: ndcg_at_100
+       value: 45.17966666666667
+     - type: ndcg_at_1000
+       value: 47.440583333333336
+     - type: ndcg_at_3
+       value: 35.04591666666666
+     - type: ndcg_at_5
+       value: 37.32025
+     - type: precision_at_1
+       value: 29.825250000000004
+     - type: precision_at_10
+       value: 7.07225
+     - type: precision_at_100
+       value: 1.1462499999999998
+     - type: precision_at_1000
+       value: 0.15325
+     - type: precision_at_3
+       value: 16.18375
+     - type: precision_at_5
+       value: 11.526833333333334
+     - type: recall_at_1
+       value: 25.24441666666667
+     - type: recall_at_10
+       value: 51.744916666666676
+     - type: recall_at_100
+       value: 75.04574999999998
+     - type: recall_at_1000
+       value: 90.65558333333334
+     - type: recall_at_3
+       value: 38.28349999999999
+     - type: recall_at_5
+       value: 44.16591666666667
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackStatsRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 24.237000000000002
+     - type: map_at_10
+       value: 30.667
+     - type: map_at_100
+       value: 31.592
+     - type: map_at_1000
+       value: 31.688
+     - type: map_at_3
+       value: 28.810999999999996
+     - type: map_at_5
+       value: 29.788999999999998
+     - type: mrr_at_1
+       value: 26.840000000000003
+     - type: mrr_at_10
+       value: 33.305
+     - type: mrr_at_100
+       value: 34.089000000000006
+     - type: mrr_at_1000
+       value: 34.159
+     - type: mrr_at_3
+       value: 31.518
+     - type: mrr_at_5
+       value: 32.469
+     - type: ndcg_at_1
+       value: 26.840000000000003
+     - type: ndcg_at_10
+       value: 34.541
+     - type: ndcg_at_100
+       value: 39.206
+     - type: ndcg_at_1000
+       value: 41.592
+     - type: ndcg_at_3
+       value: 31.005
+     - type: ndcg_at_5
+       value: 32.554
+     - type: precision_at_1
+       value: 26.840000000000003
+     - type: precision_at_10
+       value: 5.3069999999999995
+     - type: precision_at_100
+       value: 0.8340000000000001
+     - type: precision_at_1000
+       value: 0.11199999999999999
+     - type: precision_at_3
+       value: 13.292000000000002
+     - type: precision_at_5
+       value: 9.049
+     - type: recall_at_1
+       value: 24.237000000000002
+     - type: recall_at_10
+       value: 43.862
+     - type: recall_at_100
+       value: 65.352
+     - type: recall_at_1000
+       value: 82.704
+     - type: recall_at_3
+       value: 34.009
+     - type: recall_at_5
+       value: 37.878
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackTexRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 16.482
+     - type: map_at_10
+       value: 23.249
+     - type: map_at_100
+       value: 24.388
+     - type: map_at_1000
+       value: 24.519
+     - type: map_at_3
+       value: 20.971
+     - type: map_at_5
+       value: 22.192
+     - type: mrr_at_1
+       value: 19.993
+     - type: mrr_at_10
+       value: 26.985
+     - type: mrr_at_100
+       value: 27.975
+     - type: mrr_at_1000
+       value: 28.052
+     - type: mrr_at_3
+       value: 24.954
+     - type: mrr_at_5
+       value: 26.070999999999998
+     - type: ndcg_at_1
+       value: 19.993
+     - type: ndcg_at_10
+       value: 27.656
+     - type: ndcg_at_100
+       value: 33.256
+     - type: ndcg_at_1000
+       value: 36.275
+     - type: ndcg_at_3
+       value: 23.644000000000002
+     - type: ndcg_at_5
+       value: 25.466
+     - type: precision_at_1
+       value: 19.993
+     - type: precision_at_10
+       value: 5.093
+     - type: precision_at_100
+       value: 0.932
+     - type: precision_at_1000
+       value: 0.13699999999999998
+     - type: precision_at_3
+       value: 11.149000000000001
+     - type: precision_at_5
+       value: 8.149000000000001
+     - type: recall_at_1
+       value: 16.482
+     - type: recall_at_10
+       value: 37.141999999999996
+     - type: recall_at_100
+       value: 62.696
+     - type: recall_at_1000
+       value: 84.333
+     - type: recall_at_3
+       value: 26.031
+     - type: recall_at_5
+       value: 30.660999999999998
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackUnixRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 24.887999999999998
+     - type: map_at_10
+       value: 34.101
+     - type: map_at_100
+       value: 35.27
+     - type: map_at_1000
+       value: 35.370000000000005
+     - type: map_at_3
+       value: 31.283
+     - type: map_at_5
+       value: 32.72
+     - type: mrr_at_1
+       value: 29.011
+     - type: mrr_at_10
+       value: 38.004
+     - type: mrr_at_100
+       value: 38.879000000000005
+     - type: mrr_at_1000
+       value: 38.938
+     - type: mrr_at_3
+       value: 35.571999999999996
+     - type: mrr_at_5
+       value: 36.789
+     - type: ndcg_at_1
+       value: 29.011
+     - type: ndcg_at_10
+       value: 39.586
+     - type: ndcg_at_100
+       value: 44.939
+     - type: ndcg_at_1000
+       value: 47.236
+     - type: ndcg_at_3
+       value: 34.4
+     - type: ndcg_at_5
+       value: 36.519
+     - type: precision_at_1
+       value: 29.011
+     - type: precision_at_10
+       value: 6.763
+     - type: precision_at_100
+       value: 1.059
+     - type: precision_at_1000
+       value: 0.13699999999999998
+     - type: precision_at_3
+       value: 15.609
+     - type: precision_at_5
+       value: 10.896
+     - type: recall_at_1
+       value: 24.887999999999998
+     - type: recall_at_10
+       value: 52.42
+     - type: recall_at_100
+       value: 75.803
+     - type: recall_at_1000
+       value: 91.725
+     - type: recall_at_3
+       value: 38.080999999999996
+     - type: recall_at_5
+       value: 43.47
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackWebmastersRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 23.953
+     - type: map_at_10
+       value: 32.649
+     - type: map_at_100
+       value: 34.181
+     - type: map_at_1000
+       value: 34.398
+     - type: map_at_3
+       value: 29.567
+     - type: map_at_5
+       value: 31.263
+     - type: mrr_at_1
+       value: 29.051
+     - type: mrr_at_10
+       value: 37.419999999999995
+     - type: mrr_at_100
+       value: 38.396
+     - type: mrr_at_1000
+       value: 38.458
+     - type: mrr_at_3
+       value: 34.782999999999994
+     - type: mrr_at_5
+       value: 36.254999999999995
+     - type: ndcg_at_1
+       value: 29.051
+     - type: ndcg_at_10
+       value: 38.595
+     - type: ndcg_at_100
+       value: 44.6
+     - type: ndcg_at_1000
+       value: 47.158
+     - type: ndcg_at_3
+       value: 33.56
+     - type: ndcg_at_5
+       value: 35.870000000000005
+     - type: precision_at_1
+       value: 29.051
+     - type: precision_at_10
+       value: 7.53
+     - type: precision_at_100
+       value: 1.538
+     - type: precision_at_1000
+       value: 0.24
+     - type: precision_at_3
+       value: 15.744
+     - type: precision_at_5
+       value: 11.542
+     - type: recall_at_1
+       value: 23.953
+     - type: recall_at_10
+       value: 50.08200000000001
+     - type: recall_at_100
+       value: 77.364
+     - type: recall_at_1000
+       value: 93.57799999999999
+     - type: recall_at_3
+       value: 35.432
+     - type: recall_at_5
+       value: 41.875
+   - task:
+       type: Retrieval
+     dataset:
+       type: BeIR/cqadupstack
+       name: MTEB CQADupstackWordpressRetrieval
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map_at_1
+       value: 17.72
+     - type: map_at_10
+       value: 25.724000000000004
+     - type: map_at_100
+       value: 26.846999999999998
+     - type: map_at_1000
+       value: 26.964
+     - type: map_at_3
+       value: 22.909
+     - type: map_at_5
+       value: 24.596999999999998
+     - type: mrr_at_1
+       value: 18.854000000000003
+     - type: mrr_at_10
+       value: 27.182000000000002
+     - type: mrr_at_100
+       value: 28.182000000000002
+     - type: mrr_at_1000
+       value: 28.274
+     - type: mrr_at_3
+       value: 24.276
+     - type: mrr_at_5
+       value: 26.115
+     - type: ndcg_at_1
+       value: 18.854000000000003
+     - type: ndcg_at_10
+       value: 30.470000000000002
+     - type: ndcg_at_100
+       value: 35.854
+     - type: ndcg_at_1000
+       value: 38.701
+     - type: ndcg_at_3
+       value: 24.924
+     - type: ndcg_at_5
+       value: 27.895999999999997
+     - type: precision_at_1
+       value: 18.854000000000003
+     - type: precision_at_10
+       value: 5.009
+     - type: precision_at_100
+       value: 0.835
+     - type: precision_at_1000
+       value: 0.117
+     - type: precision_at_3
+       value: 10.721
+     - type: precision_at_5
+       value: 8.133
+     - type: recall_at_1
+       value: 17.72
+     - type: recall_at_10
+       value: 43.617
+     - type: recall_at_100
+       value: 67.941
+     - type: recall_at_1000
+       value: 88.979
+     - type: recall_at_3
+       value: 29.044999999999998
+     - type: recall_at_5
+       value: 36.044
1116
+ - task:
1117
+ type: Retrieval
1118
+ dataset:
1119
+ type: climate-fever
1120
+ name: MTEB ClimateFEVER
1121
+ config: default
1122
+ split: test
1123
+ revision: None
1124
+ metrics:
1125
+ - type: map_at_1
1126
+ value: 13.427
1127
+ - type: map_at_10
1128
+ value: 22.935
1129
+ - type: map_at_100
1130
+ value: 24.808
1131
+ - type: map_at_1000
1132
+ value: 24.994
1133
+ - type: map_at_3
1134
+ value: 19.533
1135
+ - type: map_at_5
1136
+ value: 21.261
1137
+ - type: mrr_at_1
1138
+ value: 30.945
1139
+ - type: mrr_at_10
1140
+ value: 43.242000000000004
1141
+ - type: mrr_at_100
1142
+ value: 44.013999999999996
1143
+ - type: mrr_at_1000
1144
+ value: 44.048
1145
+ - type: mrr_at_3
1146
+ value: 40.109
1147
+ - type: mrr_at_5
1148
+ value: 42.059999999999995
1149
+ - type: ndcg_at_1
1150
+ value: 30.945
1151
+ - type: ndcg_at_10
1152
+ value: 31.828
1153
+ - type: ndcg_at_100
1154
+ value: 38.801
1155
+ - type: ndcg_at_1000
1156
+ value: 42.126999999999995
1157
+ - type: ndcg_at_3
1158
+ value: 26.922
1159
+ - type: ndcg_at_5
1160
+ value: 28.483999999999998
1161
+ - type: precision_at_1
1162
+ value: 30.945
1163
+ - type: precision_at_10
1164
+ value: 9.844
1165
+ - type: precision_at_100
1166
+ value: 1.7309999999999999
1167
+ - type: precision_at_1000
1168
+ value: 0.23500000000000001
1169
+ - type: precision_at_3
1170
+ value: 20.477999999999998
1171
+ - type: precision_at_5
1172
+ value: 15.27
1173
+ - type: recall_at_1
1174
+ value: 13.427
1175
+ - type: recall_at_10
1176
+ value: 37.141000000000005
1177
+ - type: recall_at_100
1178
+ value: 61.007
1179
+ - type: recall_at_1000
1180
+ value: 79.742
1181
+ - type: recall_at_3
1182
+ value: 24.431
1183
+ - type: recall_at_5
1184
+ value: 29.725
1185
+ - task:
1186
+ type: Retrieval
1187
+ dataset:
1188
+ type: dbpedia-entity
1189
+ name: MTEB DBPedia
1190
+ config: default
1191
+ split: test
1192
+ revision: None
1193
+ metrics:
1194
+ - type: map_at_1
1195
+ value: 9.122
1196
+ - type: map_at_10
1197
+ value: 18.799
1198
+ - type: map_at_100
1199
+ value: 25.724999999999998
1200
+ - type: map_at_1000
1201
+ value: 27.205000000000002
1202
+ - type: map_at_3
1203
+ value: 14.194999999999999
1204
+ - type: map_at_5
1205
+ value: 16.225
1206
+ - type: mrr_at_1
1207
+ value: 68.0
1208
+ - type: mrr_at_10
1209
+ value: 76.035
1210
+ - type: mrr_at_100
1211
+ value: 76.292
1212
+ - type: mrr_at_1000
1213
+ value: 76.297
1214
+ - type: mrr_at_3
1215
+ value: 74.458
1216
+ - type: mrr_at_5
1217
+ value: 75.558
1218
+ - type: ndcg_at_1
1219
+ value: 56.00000000000001
1220
+ - type: ndcg_at_10
1221
+ value: 39.761
1222
+ - type: ndcg_at_100
1223
+ value: 43.736999999999995
1224
+ - type: ndcg_at_1000
1225
+ value: 51.146
1226
+ - type: ndcg_at_3
1227
+ value: 45.921
1228
+ - type: ndcg_at_5
1229
+ value: 42.756
1230
+ - type: precision_at_1
1231
+ value: 68.0
1232
+ - type: precision_at_10
1233
+ value: 30.275000000000002
1234
+ - type: precision_at_100
1235
+ value: 9.343
1236
+ - type: precision_at_1000
1237
+ value: 1.8270000000000002
1238
+ - type: precision_at_3
1239
+ value: 49.167
1240
+ - type: precision_at_5
1241
+ value: 40.699999999999996
1242
+ - type: recall_at_1
1243
+ value: 9.122
1244
+ - type: recall_at_10
1245
+ value: 23.669999999999998
1246
+ - type: recall_at_100
1247
+ value: 48.719
1248
+ - type: recall_at_1000
1249
+ value: 72.033
1250
+ - type: recall_at_3
1251
+ value: 15.498999999999999
1252
+ - type: recall_at_5
1253
+ value: 18.657
1254
+ - task:
1255
+ type: Classification
1256
+ dataset:
1257
+ type: mteb/emotion
1258
+ name: MTEB EmotionClassification
1259
+ config: default
1260
+ split: test
1261
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1262
+ metrics:
1263
+ - type: accuracy
1264
+ value: 55.885000000000005
1265
+ - type: f1
1266
+ value: 50.70726446938571
1267
+ - task:
1268
+ type: Retrieval
1269
+ dataset:
1270
+ type: fever
1271
+ name: MTEB FEVER
1272
+ config: default
1273
+ split: test
1274
+ revision: None
1275
+ metrics:
1276
+ - type: map_at_1
1277
+ value: 75.709
1278
+ - type: map_at_10
1279
+ value: 83.345
1280
+ - type: map_at_100
1281
+ value: 83.557
1282
+ - type: map_at_1000
1283
+ value: 83.572
1284
+ - type: map_at_3
1285
+ value: 82.425
1286
+ - type: map_at_5
1287
+ value: 83.013
1288
+ - type: mrr_at_1
1289
+ value: 81.593
1290
+ - type: mrr_at_10
1291
+ value: 88.331
1292
+ - type: mrr_at_100
1293
+ value: 88.408
1294
+ - type: mrr_at_1000
1295
+ value: 88.41
1296
+ - type: mrr_at_3
1297
+ value: 87.714
1298
+ - type: mrr_at_5
1299
+ value: 88.122
1300
+ - type: ndcg_at_1
1301
+ value: 81.593
1302
+ - type: ndcg_at_10
1303
+ value: 86.925
1304
+ - type: ndcg_at_100
1305
+ value: 87.67
1306
+ - type: ndcg_at_1000
1307
+ value: 87.924
1308
+ - type: ndcg_at_3
1309
+ value: 85.5
1310
+ - type: ndcg_at_5
1311
+ value: 86.283
1312
+ - type: precision_at_1
1313
+ value: 81.593
1314
+ - type: precision_at_10
1315
+ value: 10.264
1316
+ - type: precision_at_100
1317
+ value: 1.084
1318
+ - type: precision_at_1000
1319
+ value: 0.11199999999999999
1320
+ - type: precision_at_3
1321
+ value: 32.388
1322
+ - type: precision_at_5
1323
+ value: 19.991
1324
+ - type: recall_at_1
1325
+ value: 75.709
1326
+ - type: recall_at_10
1327
+ value: 93.107
1328
+ - type: recall_at_100
1329
+ value: 96.024
1330
+ - type: recall_at_1000
1331
+ value: 97.603
1332
+ - type: recall_at_3
1333
+ value: 89.08500000000001
1334
+ - type: recall_at_5
1335
+ value: 91.15299999999999
1336
+ - task:
1337
+ type: Retrieval
1338
+ dataset:
1339
+ type: fiqa
1340
+ name: MTEB FiQA2018
1341
+ config: default
1342
+ split: test
1343
+ revision: None
1344
+ metrics:
1345
+ - type: map_at_1
1346
+ value: 19.121
1347
+ - type: map_at_10
1348
+ value: 31.78
1349
+ - type: map_at_100
1350
+ value: 33.497
1351
+ - type: map_at_1000
1352
+ value: 33.696
1353
+ - type: map_at_3
1354
+ value: 27.893
1355
+ - type: map_at_5
1356
+ value: 30.087000000000003
1357
+ - type: mrr_at_1
1358
+ value: 38.272
1359
+ - type: mrr_at_10
1360
+ value: 47.176
1361
+ - type: mrr_at_100
1362
+ value: 48.002
1363
+ - type: mrr_at_1000
1364
+ value: 48.044
1365
+ - type: mrr_at_3
1366
+ value: 45.086999999999996
1367
+ - type: mrr_at_5
1368
+ value: 46.337
1369
+ - type: ndcg_at_1
1370
+ value: 38.272
1371
+ - type: ndcg_at_10
1372
+ value: 39.145
1373
+ - type: ndcg_at_100
1374
+ value: 45.696999999999996
1375
+ - type: ndcg_at_1000
1376
+ value: 49.0
1377
+ - type: ndcg_at_3
1378
+ value: 36.148
1379
+ - type: ndcg_at_5
1380
+ value: 37.023
1381
+ - type: precision_at_1
1382
+ value: 38.272
1383
+ - type: precision_at_10
1384
+ value: 11.065
1385
+ - type: precision_at_100
1386
+ value: 1.7840000000000003
1387
+ - type: precision_at_1000
1388
+ value: 0.23600000000000002
1389
+ - type: precision_at_3
1390
+ value: 24.587999999999997
1391
+ - type: precision_at_5
1392
+ value: 18.056
1393
+ - type: recall_at_1
1394
+ value: 19.121
1395
+ - type: recall_at_10
1396
+ value: 44.857
1397
+ - type: recall_at_100
1398
+ value: 69.774
1399
+ - type: recall_at_1000
1400
+ value: 89.645
1401
+ - type: recall_at_3
1402
+ value: 32.588
1403
+ - type: recall_at_5
1404
+ value: 37.939
1405
+ - task:
1406
+ type: Retrieval
1407
+ dataset:
1408
+ type: hotpotqa
1409
+ name: MTEB HotpotQA
1410
+ config: default
1411
+ split: test
1412
+ revision: None
1413
+ metrics:
1414
+ - type: map_at_1
1415
+ value: 36.428
1416
+ - type: map_at_10
1417
+ value: 56.891999999999996
1418
+ - type: map_at_100
1419
+ value: 57.82899999999999
1420
+ - type: map_at_1000
1421
+ value: 57.896
1422
+ - type: map_at_3
1423
+ value: 53.762
1424
+ - type: map_at_5
1425
+ value: 55.718
1426
+ - type: mrr_at_1
1427
+ value: 72.856
1428
+ - type: mrr_at_10
1429
+ value: 79.245
1430
+ - type: mrr_at_100
1431
+ value: 79.515
1432
+ - type: mrr_at_1000
1433
+ value: 79.525
1434
+ - type: mrr_at_3
1435
+ value: 78.143
1436
+ - type: mrr_at_5
1437
+ value: 78.822
1438
+ - type: ndcg_at_1
1439
+ value: 72.856
1440
+ - type: ndcg_at_10
1441
+ value: 65.204
1442
+ - type: ndcg_at_100
1443
+ value: 68.552
1444
+ - type: ndcg_at_1000
1445
+ value: 69.902
1446
+ - type: ndcg_at_3
1447
+ value: 60.632
1448
+ - type: ndcg_at_5
1449
+ value: 63.161
1450
+ - type: precision_at_1
1451
+ value: 72.856
1452
+ - type: precision_at_10
1453
+ value: 13.65
1454
+ - type: precision_at_100
1455
+ value: 1.6260000000000001
1456
+ - type: precision_at_1000
1457
+ value: 0.181
1458
+ - type: precision_at_3
1459
+ value: 38.753
1460
+ - type: precision_at_5
1461
+ value: 25.251
1462
+ - type: recall_at_1
1463
+ value: 36.428
1464
+ - type: recall_at_10
1465
+ value: 68.25099999999999
1466
+ - type: recall_at_100
1467
+ value: 81.317
1468
+ - type: recall_at_1000
1469
+ value: 90.27
1470
+ - type: recall_at_3
1471
+ value: 58.13
1472
+ - type: recall_at_5
1473
+ value: 63.126000000000005
1474
+ - task:
1475
+ type: Classification
1476
+ dataset:
1477
+ type: mteb/imdb
1478
+ name: MTEB ImdbClassification
1479
+ config: default
1480
+ split: test
1481
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1482
+ metrics:
1483
+ - type: accuracy
1484
+ value: 89.4868
1485
+ - type: ap
1486
+ value: 84.88319192880247
1487
+ - type: f1
1488
+ value: 89.46144458052846
1489
+ - task:
1490
+ type: Retrieval
1491
+ dataset:
1492
+ type: msmarco
1493
+ name: MTEB MSMARCO
1494
+ config: default
1495
+ split: dev
1496
+ revision: None
1497
+ metrics:
1498
+ - type: map_at_1
1499
+ value: 21.282999999999998
1500
+ - type: map_at_10
1501
+ value: 33.045
1502
+ - type: map_at_100
1503
+ value: 34.238
1504
+ - type: map_at_1000
1505
+ value: 34.29
1506
+ - type: map_at_3
1507
+ value: 29.305999999999997
1508
+ - type: map_at_5
1509
+ value: 31.391000000000002
1510
+ - type: mrr_at_1
1511
+ value: 21.92
1512
+ - type: mrr_at_10
1513
+ value: 33.649
1514
+ - type: mrr_at_100
1515
+ value: 34.791
1516
+ - type: mrr_at_1000
1517
+ value: 34.837
1518
+ - type: mrr_at_3
1519
+ value: 30.0
1520
+ - type: mrr_at_5
1521
+ value: 32.039
1522
+ - type: ndcg_at_1
1523
+ value: 21.92
1524
+ - type: ndcg_at_10
1525
+ value: 39.729
1526
+ - type: ndcg_at_100
1527
+ value: 45.484
1528
+ - type: ndcg_at_1000
1529
+ value: 46.817
1530
+ - type: ndcg_at_3
1531
+ value: 32.084
1532
+ - type: ndcg_at_5
1533
+ value: 35.789
1534
+ - type: precision_at_1
1535
+ value: 21.92
1536
+ - type: precision_at_10
1537
+ value: 6.297
1538
+ - type: precision_at_100
1539
+ value: 0.918
1540
+ - type: precision_at_1000
1541
+ value: 0.10300000000000001
1542
+ - type: precision_at_3
1543
+ value: 13.639000000000001
1544
+ - type: precision_at_5
1545
+ value: 10.054
1546
+ - type: recall_at_1
1547
+ value: 21.282999999999998
1548
+ - type: recall_at_10
1549
+ value: 60.343999999999994
1550
+ - type: recall_at_100
1551
+ value: 86.981
1552
+ - type: recall_at_1000
1553
+ value: 97.205
1554
+ - type: recall_at_3
1555
+ value: 39.452999999999996
1556
+ - type: recall_at_5
1557
+ value: 48.333
1558
+ - task:
1559
+ type: Classification
1560
+ dataset:
1561
+ type: mteb/mtop_domain
1562
+ name: MTEB MTOPDomainClassification (en)
1563
+ config: en
1564
+ split: test
1565
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1566
+ metrics:
1567
+ - type: accuracy
1568
+ value: 95.47879616963064
1569
+ - type: f1
1570
+ value: 95.21800589958251
1571
+ - task:
1572
+ type: Classification
1573
+ dataset:
1574
+ type: mteb/mtop_intent
1575
+ name: MTEB MTOPIntentClassification (en)
1576
+ config: en
1577
+ split: test
1578
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1579
+ metrics:
1580
+ - type: accuracy
1581
+ value: 79.09256725946192
1582
+ - type: f1
1583
+ value: 60.554043889452515
1584
+ - task:
1585
+ type: Classification
1586
+ dataset:
1587
+ type: mteb/amazon_massive_intent
1588
+ name: MTEB MassiveIntentClassification (en)
1589
+ config: en
1590
+ split: test
1591
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1592
+ metrics:
1593
+ - type: accuracy
1594
+ value: 75.53463349024882
1595
+ - type: f1
1596
+ value: 73.14418495756476
1597
+ - task:
1598
+ type: Classification
1599
+ dataset:
1600
+ type: mteb/amazon_massive_scenario
1601
+ name: MTEB MassiveScenarioClassification (en)
1602
+ config: en
1603
+ split: test
1604
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1605
+ metrics:
1606
+ - type: accuracy
1607
+ value: 79.22663080026899
1608
+ - type: f1
1609
+ value: 79.331456217501
1610
+ - task:
1611
+ type: Clustering
1612
+ dataset:
1613
+ type: mteb/medrxiv-clustering-p2p
1614
+ name: MTEB MedrxivClusteringP2P
1615
+ config: default
1616
+ split: test
1617
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1618
+ metrics:
1619
+ - type: v_measure
1620
+ value: 34.50316010430136
1621
+ - task:
1622
+ type: Clustering
1623
+ dataset:
1624
+ type: mteb/medrxiv-clustering-s2s
1625
+ name: MTEB MedrxivClusteringS2S
1626
+ config: default
1627
+ split: test
1628
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1629
+ metrics:
1630
+ - type: v_measure
1631
+ value: 32.15612040042282
1632
+ - task:
1633
+ type: Reranking
1634
+ dataset:
1635
+ type: mteb/mind_small
1636
+ name: MTEB MindSmallReranking
1637
+ config: default
1638
+ split: test
1639
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1640
+ metrics:
1641
+ - type: map
1642
+ value: 32.36227552557184
1643
+ - type: mrr
1644
+ value: 33.57901344209811
1645
+ - task:
1646
+ type: Retrieval
1647
+ dataset:
1648
+ type: nfcorpus
1649
+ name: MTEB NFCorpus
1650
+ config: default
1651
+ split: test
1652
+ revision: None
1653
+ metrics:
1654
+ - type: map_at_1
1655
+ value: 5.6610000000000005
1656
+ - type: map_at_10
1657
+ value: 12.992
1658
+ - type: map_at_100
1659
+ value: 16.756999999999998
1660
+ - type: map_at_1000
1661
+ value: 18.25
1662
+ - type: map_at_3
1663
+ value: 9.471
1664
+ - type: map_at_5
1665
+ value: 11.116
1666
+ - type: mrr_at_1
1667
+ value: 43.653
1668
+ - type: mrr_at_10
1669
+ value: 53.388999999999996
1670
+ - type: mrr_at_100
1671
+ value: 53.982
1672
+ - type: mrr_at_1000
1673
+ value: 54.033
1674
+ - type: mrr_at_3
1675
+ value: 51.858000000000004
1676
+ - type: mrr_at_5
1677
+ value: 53.019000000000005
1678
+ - type: ndcg_at_1
1679
+ value: 41.641
1680
+ - type: ndcg_at_10
1681
+ value: 34.691
1682
+ - type: ndcg_at_100
1683
+ value: 32.305
1684
+ - type: ndcg_at_1000
1685
+ value: 41.132999999999996
1686
+ - type: ndcg_at_3
1687
+ value: 40.614
1688
+ - type: ndcg_at_5
1689
+ value: 38.456
1690
+ - type: precision_at_1
1691
+ value: 43.344
1692
+ - type: precision_at_10
1693
+ value: 25.881999999999998
1694
+ - type: precision_at_100
1695
+ value: 8.483
1696
+ - type: precision_at_1000
1697
+ value: 2.131
1698
+ - type: precision_at_3
1699
+ value: 38.803
1700
+ - type: precision_at_5
1701
+ value: 33.87
1702
+ - type: recall_at_1
1703
+ value: 5.6610000000000005
1704
+ - type: recall_at_10
1705
+ value: 16.826
1706
+ - type: recall_at_100
1707
+ value: 32.939
1708
+ - type: recall_at_1000
1709
+ value: 65.161
1710
+ - type: recall_at_3
1711
+ value: 10.756
1712
+ - type: recall_at_5
1713
+ value: 13.331000000000001
1714
+ - task:
1715
+ type: Retrieval
1716
+ dataset:
1717
+ type: nq
1718
+ name: MTEB NQ
1719
+ config: default
1720
+ split: test
1721
+ revision: None
1722
+ metrics:
1723
+ - type: map_at_1
1724
+ value: 26.692
1725
+ - type: map_at_10
1726
+ value: 41.065000000000005
1727
+ - type: map_at_100
1728
+ value: 42.235
1729
+ - type: map_at_1000
1730
+ value: 42.27
1731
+ - type: map_at_3
1732
+ value: 36.635
1733
+ - type: map_at_5
1734
+ value: 39.219
1735
+ - type: mrr_at_1
1736
+ value: 30.214000000000002
1737
+ - type: mrr_at_10
1738
+ value: 43.443
1739
+ - type: mrr_at_100
1740
+ value: 44.326
1741
+ - type: mrr_at_1000
1742
+ value: 44.352000000000004
1743
+ - type: mrr_at_3
1744
+ value: 39.623999999999995
1745
+ - type: mrr_at_5
1746
+ value: 41.898
1747
+ - type: ndcg_at_1
1748
+ value: 30.214000000000002
1749
+ - type: ndcg_at_10
1750
+ value: 48.692
1751
+ - type: ndcg_at_100
1752
+ value: 53.671
1753
+ - type: ndcg_at_1000
1754
+ value: 54.522000000000006
1755
+ - type: ndcg_at_3
1756
+ value: 40.245
1757
+ - type: ndcg_at_5
1758
+ value: 44.580999999999996
1759
+ - type: precision_at_1
1760
+ value: 30.214000000000002
1761
+ - type: precision_at_10
1762
+ value: 8.3
1763
+ - type: precision_at_100
1764
+ value: 1.1079999999999999
1765
+ - type: precision_at_1000
1766
+ value: 0.11900000000000001
1767
+ - type: precision_at_3
1768
+ value: 18.521
1769
+ - type: precision_at_5
1770
+ value: 13.627
1771
+ - type: recall_at_1
1772
+ value: 26.692
1773
+ - type: recall_at_10
1774
+ value: 69.699
1775
+ - type: recall_at_100
1776
+ value: 91.425
1777
+ - type: recall_at_1000
1778
+ value: 97.78099999999999
1779
+ - type: recall_at_3
1780
+ value: 47.711
1781
+ - type: recall_at_5
1782
+ value: 57.643
1783
+ - task:
1784
+ type: Retrieval
1785
+ dataset:
1786
+ type: quora
1787
+ name: MTEB QuoraRetrieval
1788
+ config: default
1789
+ split: test
1790
+ revision: None
1791
+ metrics:
1792
+ - type: map_at_1
1793
+ value: 70.962
1794
+ - type: map_at_10
1795
+ value: 84.772
1796
+ - type: map_at_100
1797
+ value: 85.402
1798
+ - type: map_at_1000
1799
+ value: 85.418
1800
+ - type: map_at_3
1801
+ value: 81.89
1802
+ - type: map_at_5
1803
+ value: 83.685
1804
+ - type: mrr_at_1
1805
+ value: 81.67
1806
+ - type: mrr_at_10
1807
+ value: 87.681
1808
+ - type: mrr_at_100
1809
+ value: 87.792
1810
+ - type: mrr_at_1000
1811
+ value: 87.79299999999999
1812
+ - type: mrr_at_3
1813
+ value: 86.803
1814
+ - type: mrr_at_5
1815
+ value: 87.392
1816
+ - type: ndcg_at_1
1817
+ value: 81.69
1818
+ - type: ndcg_at_10
1819
+ value: 88.429
1820
+ - type: ndcg_at_100
1821
+ value: 89.66
1822
+ - type: ndcg_at_1000
1823
+ value: 89.762
1824
+ - type: ndcg_at_3
1825
+ value: 85.75
1826
+ - type: ndcg_at_5
1827
+ value: 87.20700000000001
1828
+ - type: precision_at_1
1829
+ value: 81.69
1830
+ - type: precision_at_10
1831
+ value: 13.395000000000001
1832
+ - type: precision_at_100
1833
+ value: 1.528
1834
+ - type: precision_at_1000
1835
+ value: 0.157
1836
+ - type: precision_at_3
1837
+ value: 37.507000000000005
1838
+ - type: precision_at_5
1839
+ value: 24.614
1840
+ - type: recall_at_1
1841
+ value: 70.962
1842
+ - type: recall_at_10
1843
+ value: 95.339
1844
+ - type: recall_at_100
1845
+ value: 99.543
1846
+ - type: recall_at_1000
1847
+ value: 99.984
1848
+ - type: recall_at_3
1849
+ value: 87.54899999999999
1850
+ - type: recall_at_5
1851
+ value: 91.726
1852
+ - task:
1853
+ type: Clustering
1854
+ dataset:
1855
+ type: mteb/reddit-clustering
1856
+ name: MTEB RedditClustering
1857
+ config: default
1858
+ split: test
1859
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1860
+ metrics:
1861
+ - type: v_measure
1862
+ value: 55.506631779239555
1863
+ - task:
1864
+ type: Clustering
1865
+ dataset:
1866
+ type: mteb/reddit-clustering-p2p
1867
+ name: MTEB RedditClusteringP2P
1868
+ config: default
1869
+ split: test
1870
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
1871
+ metrics:
1872
+ - type: v_measure
1873
+ value: 60.63731341848479
1874
+ - task:
1875
+ type: Retrieval
1876
+ dataset:
1877
+ type: scidocs
1878
+ name: MTEB SCIDOCS
1879
+ config: default
1880
+ split: test
1881
+ revision: None
1882
+ metrics:
1883
+ - type: map_at_1
1884
+ value: 4.852
1885
+ - type: map_at_10
1886
+ value: 13.175
1887
+ - type: map_at_100
1888
+ value: 15.623999999999999
1889
+ - type: map_at_1000
1890
+ value: 16.002
1891
+ - type: map_at_3
1892
+ value: 9.103
1893
+ - type: map_at_5
1894
+ value: 11.068999999999999
1895
+ - type: mrr_at_1
1896
+ value: 23.9
1897
+ - type: mrr_at_10
1898
+ value: 35.847
1899
+ - type: mrr_at_100
1900
+ value: 36.968
1901
+ - type: mrr_at_1000
1902
+ value: 37.018
1903
+ - type: mrr_at_3
1904
+ value: 32.300000000000004
1905
+ - type: mrr_at_5
1906
+ value: 34.14
1907
+ - type: ndcg_at_1
1908
+ value: 23.9
1909
+ - type: ndcg_at_10
1910
+ value: 21.889
1911
+ - type: ndcg_at_100
1912
+ value: 30.903000000000002
1913
+ - type: ndcg_at_1000
1914
+ value: 36.992000000000004
1915
+ - type: ndcg_at_3
1916
+ value: 20.274
1917
+ - type: ndcg_at_5
1918
+ value: 17.773
1919
+ - type: precision_at_1
1920
+ value: 23.9
1921
+ - type: precision_at_10
1922
+ value: 11.61
1923
+ - type: precision_at_100
1924
+ value: 2.4539999999999997
1925
+ - type: precision_at_1000
1926
+ value: 0.391
1927
+ - type: precision_at_3
1928
+ value: 19.133
1929
+ - type: precision_at_5
1930
+ value: 15.740000000000002
1931
+ - type: recall_at_1
1932
+ value: 4.852
1933
+ - type: recall_at_10
1934
+ value: 23.507
1935
+ - type: recall_at_100
1936
+ value: 49.775000000000006
1937
+ - type: recall_at_1000
1938
+ value: 79.308
1939
+ - type: recall_at_3
1940
+ value: 11.637
1941
+ - type: recall_at_5
1942
+ value: 15.947
1943
+ - task:
1944
+ type: STS
1945
+ dataset:
1946
+ type: mteb/sickr-sts
1947
+ name: MTEB SICK-R
1948
+ config: default
1949
+ split: test
1950
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1951
+ metrics:
1952
+ - type: cos_sim_pearson
1953
+ value: 86.03345827446948
1954
+ - type: cos_sim_spearman
1955
+ value: 80.53174518259549
1956
+ - type: euclidean_pearson
1957
+ value: 83.44538971660883
1958
+ - type: euclidean_spearman
1959
+ value: 80.57344324098692
1960
+ - type: manhattan_pearson
1961
+ value: 83.36528808195459
1962
+ - type: manhattan_spearman
1963
+ value: 80.48931287157902
1964
+ - task:
1965
+ type: STS
1966
+ dataset:
1967
+ type: mteb/sts12-sts
1968
+ name: MTEB STS12
1969
+ config: default
1970
+ split: test
1971
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
1972
+ metrics:
1973
+ - type: cos_sim_pearson
1974
+ value: 85.21363088257881
1975
+ - type: cos_sim_spearman
1976
+ value: 75.56589127055523
1977
+ - type: euclidean_pearson
1978
+ value: 82.32868324521908
1979
+ - type: euclidean_spearman
1980
+ value: 75.31928550664554
1981
+ - type: manhattan_pearson
1982
+ value: 82.31332875713211
1983
+ - type: manhattan_spearman
1984
+ value: 75.35376322099196
1985
+ - task:
1986
+ type: STS
1987
+ dataset:
1988
+ type: mteb/sts13-sts
1989
+ name: MTEB STS13
1990
+ config: default
1991
+ split: test
1992
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1993
+ metrics:
1994
+ - type: cos_sim_pearson
1995
+ value: 85.09085593258487
1996
+ - type: cos_sim_spearman
1997
+ value: 86.26355088415221
1998
+ - type: euclidean_pearson
1999
+ value: 85.49646115361156
2000
+ - type: euclidean_spearman
2001
+ value: 86.20652472228703
2002
+ - type: manhattan_pearson
2003
+ value: 85.44084081123815
2004
+ - type: manhattan_spearman
2005
+ value: 86.1162623448951
2006
+ - task:
2007
+ type: STS
2008
+ dataset:
2009
+ type: mteb/sts14-sts
2010
+ name: MTEB STS14
2011
+ config: default
2012
+ split: test
2013
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2014
+ metrics:
2015
+ - type: cos_sim_pearson
2016
+ value: 84.68250248349368
2017
+ - type: cos_sim_spearman
2018
+ value: 82.29883673695083
2019
+ - type: euclidean_pearson
2020
+ value: 84.17633035446019
2021
+ - type: euclidean_spearman
2022
+ value: 82.19990511264791
2023
+ - type: manhattan_pearson
2024
+ value: 84.17408410692279
2025
+ - type: manhattan_spearman
2026
+ value: 82.249873895981
2027
+ - task:
2028
+ type: STS
2029
+ dataset:
2030
+ type: mteb/sts15-sts
2031
+ name: MTEB STS15
2032
+ config: default
2033
+ split: test
2034
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2035
+ metrics:
2036
+ - type: cos_sim_pearson
2037
+ value: 87.31878760045024
2038
+ - type: cos_sim_spearman
2039
+ value: 88.7364409031183
2040
+ - type: euclidean_pearson
2041
+ value: 88.230537618603
2042
+ - type: euclidean_spearman
2043
+ value: 88.76484309646318
2044
+ - type: manhattan_pearson
2045
+ value: 88.17689071136469
2046
+ - type: manhattan_spearman
2047
+ value: 88.72809249037928
2048
+ - task:
2049
+ type: STS
2050
+ dataset:
2051
+ type: mteb/sts16-sts
2052
+ name: MTEB STS16
2053
+ config: default
2054
+ split: test
2055
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2056
+ metrics:
2057
+ - type: cos_sim_pearson
2058
+ value: 83.41078559110638
2059
+ - type: cos_sim_spearman
2060
+ value: 85.27439135411049
2061
+ - type: euclidean_pearson
2062
+ value: 84.5333571592088
2063
+ - type: euclidean_spearman
2064
+ value: 85.25645460575957
2065
+ - type: manhattan_pearson
2066
+ value: 84.38428921610226
2067
+ - type: manhattan_spearman
2068
+ value: 85.07796040798796
2069
+ - task:
2070
+ type: STS
2071
+ dataset:
2072
+ type: mteb/sts17-crosslingual-sts
2073
+ name: MTEB STS17 (en-en)
2074
+ config: en-en
2075
+ split: test
2076
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2077
+ metrics:
2078
+ - type: cos_sim_pearson
2079
+ value: 88.82374132382576
2080
+ - type: cos_sim_spearman
2081
+ value: 89.02101343562433
2082
+ - type: euclidean_pearson
2083
+ value: 89.50729765458932
2084
+ - type: euclidean_spearman
2085
+ value: 89.04184772869253
2086
+ - type: manhattan_pearson
2087
+ value: 89.51737904059856
2088
+ - type: manhattan_spearman
2089
+ value: 89.12925950440676
2090
+ - task:
2091
+ type: STS
2092
+ dataset:
2093
+ type: mteb/sts22-crosslingual-sts
2094
+ name: MTEB STS22 (en)
2095
+ config: en
2096
+ split: test
2097
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2098
+ metrics:
2099
+ - type: cos_sim_pearson
2100
+ value: 67.56051823873482
2101
+ - type: cos_sim_spearman
2102
+ value: 68.50988748185463
2103
+ - type: euclidean_pearson
2104
+ value: 69.16524346147456
2105
+ - type: euclidean_spearman
2106
+ value: 68.61859952449579
2107
+ - type: manhattan_pearson
2108
+ value: 69.10618915706995
2109
+ - type: manhattan_spearman
2110
+ value: 68.36401769459522
2111
+ - task:
2112
+ type: STS
2113
+ dataset:
2114
+ type: mteb/stsbenchmark-sts
2115
+ name: MTEB STSBenchmark
2116
+ config: default
2117
+ split: test
2118
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2119
+ metrics:
2120
+ - type: cos_sim_pearson
2121
+ value: 85.4159693872625
2122
+ - type: cos_sim_spearman
2123
+ value: 87.07819121764247
2124
+ - type: euclidean_pearson
2125
+ value: 87.03013260863153
2126
+ - type: euclidean_spearman
2127
+ value: 87.06547293631309
2128
+ - type: manhattan_pearson
2129
+ value: 86.8129744446062
2130
+ - type: manhattan_spearman
2131
+ value: 86.88494096335627
2132
+ - task:
2133
+ type: Reranking
2134
+ dataset:
2135
+ type: mteb/scidocs-reranking
2136
+ name: MTEB SciDocsRR
2137
+ config: default
2138
+ split: test
2139
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2140
+ metrics:
2141
+ - type: map
2142
+ value: 86.47758088996575
2143
+ - type: mrr
2144
+ value: 96.17891458577733
2145
+ - task:
2146
+ type: Retrieval
2147
+ dataset:
2148
+ type: scifact
2149
+ name: MTEB SciFact
2150
+ config: default
2151
+ split: test
2152
+ revision: None
2153
+ metrics:
2154
+ - type: map_at_1
2155
+ value: 57.538999999999994
2156
+ - type: map_at_10
2157
+ value: 66.562
2158
+ - type: map_at_100
2159
+ value: 67.254
2160
+ - type: map_at_1000
2161
+ value: 67.284
2162
+ - type: map_at_3
2163
+ value: 63.722
2164
+ - type: map_at_5
2165
+ value: 65.422
2166
+ - type: mrr_at_1
2167
+ value: 60.0
2168
+ - type: mrr_at_10
2169
+ value: 67.354
2170
+ - type: mrr_at_100
2171
+ value: 67.908
2172
+ - type: mrr_at_1000
2173
+ value: 67.93299999999999
2174
+ - type: mrr_at_3
2175
+ value: 65.056
2176
+ - type: mrr_at_5
2177
+ value: 66.43900000000001
2178
+ - type: ndcg_at_1
2179
+ value: 60.0
2180
+ - type: ndcg_at_10
2181
+ value: 70.858
2182
+ - type: ndcg_at_100
2183
+ value: 73.67099999999999
2184
+ - type: ndcg_at_1000
2185
+ value: 74.26700000000001
2186
+ - type: ndcg_at_3
2187
+ value: 65.911
2188
+ - type: ndcg_at_5
2189
+ value: 68.42200000000001
2190
+ - type: precision_at_1
2191
+ value: 60.0
2192
+ - type: precision_at_10
2193
+ value: 9.4
2194
+ - type: precision_at_100
2195
+ value: 1.083
2196
+ - type: precision_at_1000
2197
+ value: 0.11299999999999999
2198
+ - type: precision_at_3
2199
+ value: 25.444
2200
+ - type: precision_at_5
2201
+ value: 17.0
2202
+ - type: recall_at_1
2203
+ value: 57.538999999999994
2204
+ - type: recall_at_10
2205
+ value: 83.233
2206
+ - type: recall_at_100
2207
+ value: 95.667
2208
+ - type: recall_at_1000
2209
+ value: 100.0
2210
+ - type: recall_at_3
2211
+ value: 69.883
2212
+ - type: recall_at_5
2213
+ value: 76.19399999999999
2214
+ - task:
2215
+ type: PairClassification
2216
+ dataset:
2217
+ type: mteb/sprintduplicatequestions-pairclassification
2218
+ name: MTEB SprintDuplicateQuestions
2219
+ config: default
2220
+ split: test
2221
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2222
+ metrics:
2223
+ - type: cos_sim_accuracy
2224
+ value: 99.82574257425742
2225
+ - type: cos_sim_ap
2226
+ value: 95.78722833053911
2227
+ - type: cos_sim_f1
2228
+ value: 90.94650205761316
2229
+ - type: cos_sim_precision
2230
+ value: 93.64406779661016
2231
+ - type: cos_sim_recall
2232
+ value: 88.4
2233
+ - type: dot_accuracy
2234
+ value: 99.83366336633664
2235
+ - type: dot_ap
2236
+ value: 95.89733601612964
2237
+ - type: dot_f1
2238
+ value: 91.41981613891727
2239
+ - type: dot_precision
2240
+ value: 93.42379958246346
2241
+ - type: dot_recall
2242
+ value: 89.5
2243
+ - type: euclidean_accuracy
2244
+ value: 99.82574257425742
2245
+ - type: euclidean_ap
2246
+ value: 95.75227035138846
2247
+ - type: euclidean_f1
2248
+ value: 90.96509240246407
2249
+ - type: euclidean_precision
2250
+ value: 93.45991561181435
2251
+ - type: euclidean_recall
2252
+ value: 88.6
2253
+ - type: manhattan_accuracy
2254
+ value: 99.82574257425742
2255
+ - type: manhattan_ap
2256
+ value: 95.76278266220176
2257
+ - type: manhattan_f1
2258
+ value: 91.08409321175279
2259
+ - type: manhattan_precision
2260
+ value: 92.29979466119097
2261
+ - type: manhattan_recall
2262
+ value: 89.9
2263
+ - type: max_accuracy
2264
+ value: 99.83366336633664
2265
+ - type: max_ap
2266
+ value: 95.89733601612964
2267
+ - type: max_f1
2268
+ value: 91.41981613891727
2269
+ - task:
2270
+ type: Clustering
2271
+ dataset:
2272
+ type: mteb/stackexchange-clustering
2273
+ name: MTEB StackExchangeClustering
2274
+ config: default
2275
+ split: test
2276
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2277
+ metrics:
2278
+ - type: v_measure
2279
+ value: 61.905425988638605
2280
+ - task:
2281
+ type: Clustering
2282
+ dataset:
2283
+ type: mteb/stackexchange-clustering-p2p
2284
+ name: MTEB StackExchangeClusteringP2P
2285
+ config: default
2286
+ split: test
2287
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2288
+ metrics:
2289
+ - type: v_measure
2290
+ value: 36.159589881679736
2291
+ - task:
2292
+ type: Reranking
2293
+ dataset:
2294
+ type: mteb/stackoverflowdupquestions-reranking
2295
+ name: MTEB StackOverflowDupQuestions
2296
+ config: default
2297
+ split: test
2298
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2299
+ metrics:
2300
+ - type: map
2301
+ value: 53.0605499476397
2302
+ - type: mrr
2303
+ value: 53.91594516594517
2304
+ - task:
2305
+ type: Summarization
2306
+ dataset:
2307
+ type: mteb/summeval
2308
+ name: MTEB SummEval
2309
+ config: default
2310
+ split: test
2311
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2312
+ metrics:
2313
+ - type: cos_sim_pearson
2314
+ value: 30.202718009067
2315
+ - type: cos_sim_spearman
2316
+ value: 31.136199912366987
2317
+ - type: dot_pearson
2318
+ value: 30.66329011927951
2319
+ - type: dot_spearman
2320
+ value: 30.107664909625107
2321
+ - task:
2322
+ type: Retrieval
2323
+ dataset:
2324
+ type: trec-covid
2325
+ name: MTEB TRECCOVID
2326
+ config: default
2327
+ split: test
2328
+ revision: None
2329
+ metrics:
2330
+ - type: map_at_1
2331
+ value: 0.209
2332
+ - type: map_at_10
2333
+ value: 1.712
2334
+ - type: map_at_100
2335
+ value: 9.464
2336
+ - type: map_at_1000
2337
+ value: 23.437
2338
+ - type: map_at_3
2339
+ value: 0.609
2340
+ - type: map_at_5
2341
+ value: 0.9440000000000001
2342
+ - type: mrr_at_1
2343
+ value: 78.0
2344
+ - type: mrr_at_10
2345
+ value: 86.833
2346
+ - type: mrr_at_100
2347
+ value: 86.833
2348
+ - type: mrr_at_1000
2349
+ value: 86.833
2350
+ - type: mrr_at_3
2351
+ value: 85.333
2352
+ - type: mrr_at_5
2353
+ value: 86.833
2354
+ - type: ndcg_at_1
2355
+ value: 74.0
2356
+ - type: ndcg_at_10
2357
+ value: 69.14
2358
+ - type: ndcg_at_100
2359
+ value: 53.047999999999995
2360
+ - type: ndcg_at_1000
2361
+ value: 48.577
2362
+ - type: ndcg_at_3
2363
+ value: 75.592
2364
+ - type: ndcg_at_5
2365
+ value: 72.509
2366
+ - type: precision_at_1
2367
+ value: 78.0
2368
+ - type: precision_at_10
2369
+ value: 73.0
2370
+ - type: precision_at_100
2371
+ value: 54.44
2372
+ - type: precision_at_1000
2373
+ value: 21.326
2374
+ - type: precision_at_3
2375
+ value: 80.667
2376
+ - type: precision_at_5
2377
+ value: 77.2
2378
+ - type: recall_at_1
2379
+ value: 0.209
2380
+ - type: recall_at_10
2381
+ value: 1.932
2382
+ - type: recall_at_100
2383
+ value: 13.211999999999998
2384
+ - type: recall_at_1000
2385
+ value: 45.774
2386
+ - type: recall_at_3
2387
+ value: 0.644
2388
+ - type: recall_at_5
2389
+ value: 1.0290000000000001
2390
+ - task:
2391
+ type: Retrieval
2392
+ dataset:
2393
+ type: webis-touche2020
2394
+ name: MTEB Touche2020
2395
+ config: default
2396
+ split: test
2397
+ revision: None
2398
+ metrics:
2399
+ - type: map_at_1
2400
+ value: 2.609
2401
+ - type: map_at_10
2402
+ value: 8.334999999999999
2403
+ - type: map_at_100
2404
+ value: 14.604000000000001
2405
+ - type: map_at_1000
2406
+ value: 16.177
2407
+ - type: map_at_3
2408
+ value: 4.87
2409
+ - type: map_at_5
2410
+ value: 6.3149999999999995
2411
+ - type: mrr_at_1
2412
+ value: 32.653
2413
+ - type: mrr_at_10
2414
+ value: 45.047
2415
+ - type: mrr_at_100
2416
+ value: 45.808
2417
+ - type: mrr_at_1000
2418
+ value: 45.808
2419
+ - type: mrr_at_3
2420
+ value: 41.497
2421
+ - type: mrr_at_5
2422
+ value: 43.231
2423
+ - type: ndcg_at_1
2424
+ value: 30.612000000000002
2425
+ - type: ndcg_at_10
2426
+ value: 21.193
2427
+ - type: ndcg_at_100
2428
+ value: 34.97
2429
+ - type: ndcg_at_1000
2430
+ value: 46.69
2431
+ - type: ndcg_at_3
2432
+ value: 24.823
2433
+ - type: ndcg_at_5
2434
+ value: 22.872999999999998
2435
+ - type: precision_at_1
2436
+ value: 32.653
2437
+ - type: precision_at_10
2438
+ value: 17.959
2439
+ - type: precision_at_100
2440
+ value: 7.4079999999999995
2441
+ - type: precision_at_1000
2442
+ value: 1.537
2443
+ - type: precision_at_3
2444
+ value: 25.85
2445
+ - type: precision_at_5
2446
+ value: 22.448999999999998
2447
+ - type: recall_at_1
2448
+ value: 2.609
2449
+ - type: recall_at_10
2450
+ value: 13.63
2451
+ - type: recall_at_100
2452
+ value: 47.014
2453
+ - type: recall_at_1000
2454
+ value: 83.176
2455
+ - type: recall_at_3
2456
+ value: 5.925
2457
+ - type: recall_at_5
2458
+ value: 8.574
2459
+ - task:
2460
+ type: Classification
2461
+ dataset:
2462
+ type: mteb/toxic_conversations_50k
2463
+ name: MTEB ToxicConversationsClassification
2464
+ config: default
2465
+ split: test
2466
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2467
+ metrics:
2468
+ - type: accuracy
2469
+ value: 72.80239999999999
2470
+ - type: ap
2471
+ value: 15.497911013214791
2472
+ - type: f1
2473
+ value: 56.258411577947285
2474
+ - task:
2475
+ type: Classification
2476
+ dataset:
2477
+ type: mteb/tweet_sentiment_extraction
2478
+ name: MTEB TweetSentimentExtractionClassification
2479
+ config: default
2480
+ split: test
2481
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2482
+ metrics:
2483
+ - type: accuracy
2484
+ value: 61.00452744765139
2485
+ - type: f1
2486
+ value: 61.42228624410908
2487
+ - task:
2488
+ type: Clustering
2489
+ dataset:
2490
+ type: mteb/twentynewsgroups-clustering
2491
+ name: MTEB TwentyNewsgroupsClustering
2492
+ config: default
2493
+ split: test
2494
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2495
+ metrics:
2496
+ - type: v_measure
2497
+ value: 50.00516915962345
2498
+ - task:
2499
+ type: PairClassification
2500
+ dataset:
2501
+ type: mteb/twittersemeval2015-pairclassification
2502
+ name: MTEB TwitterSemEval2015
2503
+ config: default
2504
+ split: test
2505
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2506
+ metrics:
2507
+ - type: cos_sim_accuracy
2508
+ value: 85.62317458425225
2509
+ - type: cos_sim_ap
2510
+ value: 72.95115658063823
2511
+ - type: cos_sim_f1
2512
+ value: 66.78976523344764
2513
+ - type: cos_sim_precision
2514
+ value: 66.77215189873418
2515
+ - type: cos_sim_recall
2516
+ value: 66.80738786279683
2517
+ - type: dot_accuracy
2518
+ value: 85.62317458425225
2519
+ - type: dot_ap
2520
+ value: 73.10385271517778
2521
+ - type: dot_f1
2522
+ value: 66.94853829427399
2523
+ - type: dot_precision
2524
+ value: 61.74242424242424
2525
+ - type: dot_recall
2526
+ value: 73.11345646437995
2527
+ - type: euclidean_accuracy
2528
+ value: 85.65893783155511
2529
+ - type: euclidean_ap
2530
+ value: 72.87428208473992
2531
+ - type: euclidean_f1
2532
+ value: 66.70919994896005
2533
+ - type: euclidean_precision
2534
+ value: 64.5910551025451
2535
+ - type: euclidean_recall
2536
+ value: 68.97097625329816
2537
+ - type: manhattan_accuracy
2538
+ value: 85.59933241938367
2539
+ - type: manhattan_ap
2540
+ value: 72.67282695064966
2541
+ - type: manhattan_f1
2542
+ value: 66.67537215983286
2543
+ - type: manhattan_precision
2544
+ value: 66.00310237849017
2545
+ - type: manhattan_recall
2546
+ value: 67.36147757255937
2547
+ - type: max_accuracy
2548
+ value: 85.65893783155511
2549
+ - type: max_ap
2550
+ value: 73.10385271517778
2551
+ - type: max_f1
2552
+ value: 66.94853829427399
2553
+ - task:
2554
+ type: PairClassification
2555
+ dataset:
2556
+ type: mteb/twitterurlcorpus-pairclassification
2557
+ name: MTEB TwitterURLCorpus
2558
+ config: default
2559
+ split: test
2560
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2561
+ metrics:
2562
+ - type: cos_sim_accuracy
2563
+ value: 88.69096130709822
2564
+ - type: cos_sim_ap
2565
+ value: 85.30326978668063
2566
+ - type: cos_sim_f1
2567
+ value: 77.747088683189
2568
+ - type: cos_sim_precision
2569
+ value: 75.4491451753115
2570
+ - type: cos_sim_recall
2571
+ value: 80.189405605174
2572
+ - type: dot_accuracy
2573
+ value: 88.43870066363954
2574
+ - type: dot_ap
2575
+ value: 84.62999949222983
2576
+ - type: dot_f1
2577
+ value: 77.3074661963551
2578
+ - type: dot_precision
2579
+ value: 73.93871239808828
2580
+ - type: dot_recall
2581
+ value: 80.99784416384355
2582
+ - type: euclidean_accuracy
2583
+ value: 88.70066363953894
2584
+ - type: euclidean_ap
2585
+ value: 85.34184508966621
2586
+ - type: euclidean_f1
2587
+ value: 77.76871756856931
2588
+ - type: euclidean_precision
2589
+ value: 74.97855917667239
2590
+ - type: euclidean_recall
2591
+ value: 80.77456113335386
2592
+ - type: manhattan_accuracy
2593
+ value: 88.68319944114566
2594
+ - type: manhattan_ap
2595
+ value: 85.3026464242333
2596
+ - type: manhattan_f1
2597
+ value: 77.66561049296294
2598
+ - type: manhattan_precision
2599
+ value: 74.4665818849795
2600
+ - type: manhattan_recall
2601
+ value: 81.15183246073299
2602
+ - type: max_accuracy
2603
+ value: 88.70066363953894
2604
+ - type: max_ap
2605
+ value: 85.34184508966621
2606
+ - type: max_f1
2607
+ value: 77.76871756856931
2608
+ ---
2609
+ <h1 align="center">GIST small Embedding v0</h1>
2610
+
2611
+ *GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning*
2612
+
2613
+ The model is fine-tuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) using the [MEDI dataset](https://github.com/xlang-ai/instructor-embedding.git) augmented with mined triplets from the [MTEB Classification](https://huggingface.co/mteb) training dataset (excluding data from the Amazon Polarity Classification task).
2614
+
2615
+ The model does not require any instruction for generating embeddings. This means that queries for retrieval tasks can be directly encoded without crafting instructions.
2616
+
2617
+ Technical paper: [GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning](https://arxiv.org/abs/2402.16829)
2618
+
2619
+
2620
+ # Data
2621
+
2622
+ The dataset used is a compilation of the MEDI and MTEB Classification training datasets. Third-party datasets may be subject to additional terms and conditions under their associated licenses. A Hugging Face Dataset version of the compiled dataset is available, along with the specific revision used to train the model:
2623
+
2624
+ - Dataset: [avsolatorio/medi-data-mteb_avs_triplets](https://huggingface.co/datasets/avsolatorio/medi-data-mteb_avs_triplets)
2625
+ - Revision: 238a0499b6e6b690cc64ea56fde8461daa8341bb
2626
+
2627
+ The dataset contains a `task_type` key, which can be used to select only the MTEB classification tasks (those whose `task_type` value is prefixed with `mteb_`).
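The `task_type` selection can be sketched in plain Python. This is a minimal illustration, assuming each record is a dict with a `task_type` string; the sample records and the helper names `is_mteb_task` and `select_mteb` are hypothetical and not part of the published dataset or any library.

```python
def is_mteb_task(record, prefix="mteb_"):
    """Return True when the record's task_type starts with the MTEB prefix."""
    return str(record.get("task_type", "")).startswith(prefix)


def select_mteb(records):
    """Keep only the records that belong to MTEB classification tasks."""
    return [record for record in records if is_mteb_task(record)]


# Hypothetical records, for illustration only (not real dataset rows).
sample_records = [
    {"task_type": "mteb_example_task", "query": "first example"},
    {"task_type": "other_example_task", "query": "second example"},
    {"query": "record without a task_type key"},
]

mteb_only = select_mteb(sample_records)
print(len(mteb_only))
```

The same predicate could presumably be passed to a dataset filtering step after loading the revision listed above; that wiring is left out here because it depends on the loading library in use.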
2628
+
2629
+ The **MEDI Dataset** is published in the following paper: [One Embedder, Any Task: Instruction-Finetuned Text Embeddings](https://arxiv.org/abs/2212.09741).
2630
+
2631
+ The MTEB Benchmark results of the GIST embedding model, compared with the base model, suggest that the fine-tuning dataset has perturbed the model considerably, yielding significant improvements on some tasks while degrading performance on others.
2632
+
2633
+ The retrieval performance for the TRECCOVID task is of note. The fine-tuning dataset does not contain significant knowledge about COVID-19, which could have caused the observed performance degradation. We found some evidence, detailed in the paper, that thematic coverage of the fine-tuning data can affect downstream performance.
2634
+
2635
+ # Usage
2636
+
2637
+ The model can be easily loaded using the Sentence Transformers library.
2638
+
2639
+ ```Python
2640
+ import torch.nn.functional as F
2641
+ from sentence_transformers import SentenceTransformer
2642
+
2643
+ revision = None # Replace with the specific revision to ensure reproducibility if the model is updated.
2644
+
2645
+ model = SentenceTransformer("avsolatorio/GIST-small-Embedding-v0", revision=revision)
2646
+
2647
+ texts = [
2648
+ "Illustration of the REaLTabFormer model. The left block shows the non-relational tabular data model using GPT-2 with a causal LM head. In contrast, the right block shows how a relational dataset's child table is modeled using a sequence-to-sequence (Seq2Seq) model. The Seq2Seq model uses the observations in the parent table to condition the generation of the observations in the child table. The trained GPT-2 model on the parent table, with weights frozen, is also used as the encoder in the Seq2Seq model.",
2649
+ "Predicting human mobility holds significant practical value, with applications ranging from enhancing disaster risk planning to simulating epidemic spread. In this paper, we present the GeoFormer, a decoder-only transformer model adapted from the GPT architecture to forecast human mobility.",
2650
+ "As the economies of Southeast Asia continue adopting digital technologies, policy makers increasingly ask how to prepare the workforce for emerging labor demands. However, little is known about the skills that workers need to adapt to these changes"
2651
+ ]
2652
+
2653
+ # Compute embeddings
2654
+ embeddings = model.encode(texts, convert_to_tensor=True)
2655
+
2656
+ # Compute cosine-similarity for each pair of sentences
2657
+ scores = F.cosine_similarity(embeddings.unsqueeze(1), embeddings.unsqueeze(0), dim=-1)
2658
+
2659
+ print(scores.cpu().numpy())
2660
+ ```
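For readers who want to see what the pairwise score matrix in the usage snippet contains, here is a dependency-free sketch of the same computation on toy vectors. The two-dimensional `toy_embeddings` and the helper names are assumptions for illustration; real embeddings come from `model.encode` and have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_cosine(vectors):
    """Square matrix whose entry [i][j] is the similarity of vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for sentence embeddings (illustrative values only).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = pairwise_cosine(toy_embeddings)

for row in scores:
    print([round(value, 3) for value in row])
```

The matrix is symmetric with ones on the diagonal, so only the entries above the diagonal carry distinct information about how similar two texts are.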
2661
+
2662
+ # Training Parameters
2663
+
2664
+ Below are the training parameters used to fine-tune the model:
2665
+
2666
+ ```
2667
+ Epochs = 40
2668
+ Warmup ratio = 0.1
2669
+ Learning rate = 5e-6
2670
+ Batch size = 16
2671
+ Checkpoint step = 102000
2672
+ Contrastive loss temperature = 0.01
2673
+ ```
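To give a feel for what a contrastive loss temperature of 0.01 does, the sketch below computes a generic temperature-scaled softmax cross-entropy (InfoNCE-style) loss for one anchor. This is an assumption-laden illustration, not the GISTEmbed training code: it omits the guided in-sample negative selection described in the paper, and the function name `info_nce_loss` and the similarity values are hypothetical.

```python
import math


def info_nce_loss(sim_positive, sim_negatives, temperature=0.01):
    """Cross-entropy of picking the positive among all candidates.

    Similarities are divided by the temperature before the softmax, so a
    small temperature sharpens the distribution over candidates.
    """
    logits = [sim_positive / temperature] + [s / temperature for s in sim_negatives]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]


# Illustrative cosine similarities (not measured values).
loss_sharp = info_nce_loss(0.9, [0.1, 0.2], temperature=0.01)
loss_soft = info_nce_loss(0.9, [0.1, 0.2], temperature=1.0)
loss_wrong = info_nce_loss(0.1, [0.9], temperature=0.01)

print(loss_sharp, loss_soft, loss_wrong)
```

With a low temperature, an already well-separated positive contributes almost no loss, while a positive ranked below a negative is penalized heavily, roughly in proportion to the similarity gap divided by the temperature.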
2674
+
2675
+
2676
+ # Evaluation
2677
+
2678
+ The model was evaluated using the [MTEB Evaluation](https://huggingface.co/mteb) suite.
2679
+
2680
+
2681
+ # Citation
2682
+
2683
+ Please cite our work if you use GISTEmbed or the datasets we published in your projects or research. 🤗
2684
+
2685
+ ```
2686
+ @article{solatorio2024gistembed,
2687
+ title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
2688
+ author={Aivin V. Solatorio},
2689
+ journal={arXiv preprint arXiv:2402.16829},
2690
+ year={2024},
2691
+ URL={https://arxiv.org/abs/2402.16829},
2692
+ eprint={2402.16829},
2693
+ archivePrefix={arXiv},
2694
+ primaryClass={cs.LG}
2695
+ }
2696
+ ```
2697
+
2698
+ # Acknowledgements
2699
+
2700
+ This work is supported by the "KCP IV - Exploring Data Use in the Development Economics Literature using Large Language Models (AI and LLMs)" project funded by the [Knowledge for Change Program (KCP)](https://www.worldbank.org/en/programs/knowledge-for-change) of the World Bank - RA-P503405-RESE-TF0C3444.
2701
+
2702
+ The findings, interpretations, and conclusions expressed in this material are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
config.json ADDED
@@ -0,0 +1,7 @@
1
+ {
2
+ "bos_token": "<s>",
3
+ "eos_token": "</s>",
4
+ "layer_norm_epsilon": 1e-12,
5
+ "multi_query_attention": false,
6
+ "unk_token": "[UNK]"
7
+ }
model.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fbd5ef825e9c99e4a552f31967baf4dc6088c4011c15b3b004eb5e029cdb99e0
3
+ size 34433909
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "never_split": null,
51
+ "pad_token": "[PAD]",
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "unk_token": "[UNK]"
57
+ }
vocabulary.json ADDED
The diff for this file is too large to render. See raw diff