Qi Liu committed on
Commit
ae13eea
1 Parent(s): 4dd8502

Update README.md

Files changed (1)
  1. README.md +1107 -0
README.md CHANGED
@@ -1,3 +1,1110 @@
1
  ---
2
  license: apache-2.0
3
  ---
1
  ---
2
+ tags:
3
+ - finetuner
4
+ - mteb
5
+ - sentence-transformers
6
+ - feature-extraction
7
+ - sentence-similarity
8
+ - alibi
9
  license: apache-2.0
10
+ language:
11
+ - en
12
+ - zh
13
+ model-index:
14
+ - name: jina-embeddings-v2-base-zh
15
+ results:
16
+ - task:
17
+ type: STS
18
+ dataset:
19
+ type: C-MTEB/AFQMC
20
+ name: MTEB AFQMC
21
+ config: default
22
+ split: validation
23
+ revision: None
24
+ metrics:
25
+ - type: cos_sim_pearson
26
+ value: 48.51403119231363
27
+ - type: cos_sim_spearman
28
+ value: 50.5928547846445
29
+ - type: euclidean_pearson
30
+ value: 48.750436310559074
31
+ - type: euclidean_spearman
32
+ value: 50.50950238691385
33
+ - type: manhattan_pearson
34
+ value: 48.7866189440328
35
+ - type: manhattan_spearman
36
+ value: 50.58692402017165
37
+ - task:
38
+ type: STS
39
+ dataset:
40
+ type: C-MTEB/ATEC
41
+ name: MTEB ATEC
42
+ config: default
43
+ split: test
44
+ revision: None
45
+ metrics:
46
+ - type: cos_sim_pearson
47
+ value: 50.25985700105725
48
+ - type: cos_sim_spearman
49
+ value: 51.28815934593989
50
+ - type: euclidean_pearson
51
+ value: 52.70329248799904
52
+ - type: euclidean_spearman
53
+ value: 50.94101139559258
54
+ - type: manhattan_pearson
55
+ value: 52.6647237400892
56
+ - type: manhattan_spearman
57
+ value: 50.922441325406176
58
+ - task:
59
+ type: Classification
60
+ dataset:
61
+ type: mteb/amazon_reviews_multi
62
+ name: MTEB AmazonReviewsClassification (zh)
63
+ config: zh
64
+ split: test
65
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
66
+ metrics:
67
+ - type: accuracy
68
+ value: 34.944
69
+ - type: f1
70
+ value: 34.06478860660109
71
+ - task:
72
+ type: STS
73
+ dataset:
74
+ type: C-MTEB/BQ
75
+ name: MTEB BQ
76
+ config: default
77
+ split: test
78
+ revision: None
79
+ metrics:
80
+ - type: cos_sim_pearson
81
+ value: 65.15667035488342
82
+ - type: cos_sim_spearman
83
+ value: 66.07110142081
84
+ - type: euclidean_pearson
85
+ value: 60.447598102249714
86
+ - type: euclidean_spearman
87
+ value: 61.826575796578766
88
+ - type: manhattan_pearson
89
+ value: 60.39364279354984
90
+ - type: manhattan_spearman
91
+ value: 61.78743491223281
92
+ - task:
93
+ type: Clustering
94
+ dataset:
95
+ type: C-MTEB/CLSClusteringP2P
96
+ name: MTEB CLSClusteringP2P
97
+ config: default
98
+ split: test
99
+ revision: None
100
+ metrics:
101
+ - type: v_measure
102
+ value: 39.96714175391701
103
+ - task:
104
+ type: Clustering
105
+ dataset:
106
+ type: C-MTEB/CLSClusteringS2S
107
+ name: MTEB CLSClusteringS2S
108
+ config: default
109
+ split: test
110
+ revision: None
111
+ metrics:
112
+ - type: v_measure
113
+ value: 38.39863566717934
114
+ - task:
115
+ type: Reranking
116
+ dataset:
117
+ type: C-MTEB/CMedQAv1-reranking
118
+ name: MTEB CMedQAv1
119
+ config: default
120
+ split: test
121
+ revision: None
122
+ metrics:
123
+ - type: map
124
+ value: 83.63680381780644
125
+ - type: mrr
126
+ value: 86.16476190476192
127
+ - task:
128
+ type: Reranking
129
+ dataset:
130
+ type: C-MTEB/CMedQAv2-reranking
131
+ name: MTEB CMedQAv2
132
+ config: default
133
+ split: test
134
+ revision: None
135
+ metrics:
136
+ - type: map
137
+ value: 83.74350667859487
138
+ - type: mrr
139
+ value: 86.10388888888889
140
+ - task:
141
+ type: Retrieval
142
+ dataset:
143
+ type: C-MTEB/CmedqaRetrieval
144
+ name: MTEB CmedqaRetrieval
145
+ config: default
146
+ split: dev
147
+ revision: None
148
+ metrics:
149
+ - type: map_at_1
150
+ value: 22.072
151
+ - type: map_at_10
152
+ value: 32.942
153
+ - type: map_at_100
154
+ value: 34.768
155
+ - type: map_at_1000
156
+ value: 34.902
157
+ - type: map_at_3
158
+ value: 29.357
159
+ - type: map_at_5
160
+ value: 31.236000000000004
161
+ - type: mrr_at_1
162
+ value: 34.259
163
+ - type: mrr_at_10
164
+ value: 41.957
165
+ - type: mrr_at_100
166
+ value: 42.982
167
+ - type: mrr_at_1000
168
+ value: 43.042
169
+ - type: mrr_at_3
170
+ value: 39.722
171
+ - type: mrr_at_5
172
+ value: 40.898
173
+ - type: ndcg_at_1
174
+ value: 34.259
175
+ - type: ndcg_at_10
176
+ value: 39.153
177
+ - type: ndcg_at_100
178
+ value: 46.493
179
+ - type: ndcg_at_1000
180
+ value: 49.01
181
+ - type: ndcg_at_3
182
+ value: 34.636
183
+ - type: ndcg_at_5
184
+ value: 36.278
185
+ - type: precision_at_1
186
+ value: 34.259
187
+ - type: precision_at_10
188
+ value: 8.815000000000001
189
+ - type: precision_at_100
190
+ value: 1.474
191
+ - type: precision_at_1000
192
+ value: 0.179
193
+ - type: precision_at_3
194
+ value: 19.73
195
+ - type: precision_at_5
196
+ value: 14.174000000000001
197
+ - type: recall_at_1
198
+ value: 22.072
199
+ - type: recall_at_10
200
+ value: 48.484
201
+ - type: recall_at_100
202
+ value: 79.035
203
+ - type: recall_at_1000
204
+ value: 96.15
205
+ - type: recall_at_3
206
+ value: 34.607
207
+ - type: recall_at_5
208
+ value: 40.064
209
+ - task:
210
+ type: PairClassification
211
+ dataset:
212
+ type: C-MTEB/CMNLI
213
+ name: MTEB Cmnli
214
+ config: default
215
+ split: validation
216
+ revision: None
217
+ metrics:
218
+ - type: cos_sim_accuracy
219
+ value: 76.7047504509922
220
+ - type: cos_sim_ap
221
+ value: 85.26649874800871
222
+ - type: cos_sim_f1
223
+ value: 78.13528724646915
224
+ - type: cos_sim_precision
225
+ value: 71.57587548638132
226
+ - type: cos_sim_recall
227
+ value: 86.01823708206688
228
+ - type: dot_accuracy
229
+ value: 70.13830426939266
230
+ - type: dot_ap
231
+ value: 77.01510412382171
232
+ - type: dot_f1
233
+ value: 73.56710042713817
234
+ - type: dot_precision
235
+ value: 63.955094991364426
236
+ - type: dot_recall
237
+ value: 86.57937806873977
238
+ - type: euclidean_accuracy
239
+ value: 75.53818400481059
240
+ - type: euclidean_ap
241
+ value: 84.34668448241264
242
+ - type: euclidean_f1
243
+ value: 77.51741608613047
244
+ - type: euclidean_precision
245
+ value: 70.65614777756399
246
+ - type: euclidean_recall
247
+ value: 85.85457096095394
248
+ - type: manhattan_accuracy
249
+ value: 75.49007817197835
250
+ - type: manhattan_ap
251
+ value: 84.40297506704299
252
+ - type: manhattan_f1
253
+ value: 77.63185324160932
254
+ - type: manhattan_precision
255
+ value: 70.03949595636637
256
+ - type: manhattan_recall
257
+ value: 87.07037643207856
258
+ - type: max_accuracy
259
+ value: 76.7047504509922
260
+ - type: max_ap
261
+ value: 85.26649874800871
262
+ - type: max_f1
263
+ value: 78.13528724646915
264
+ - task:
265
+ type: Retrieval
266
+ dataset:
267
+ type: C-MTEB/CovidRetrieval
268
+ name: MTEB CovidRetrieval
269
+ config: default
270
+ split: dev
271
+ revision: None
272
+ metrics:
273
+ - type: map_at_1
274
+ value: 69.178
275
+ - type: map_at_10
276
+ value: 77.523
277
+ - type: map_at_100
278
+ value: 77.793
279
+ - type: map_at_1000
280
+ value: 77.79899999999999
281
+ - type: map_at_3
282
+ value: 75.878
283
+ - type: map_at_5
284
+ value: 76.849
285
+ - type: mrr_at_1
286
+ value: 69.44200000000001
287
+ - type: mrr_at_10
288
+ value: 77.55
289
+ - type: mrr_at_100
290
+ value: 77.819
291
+ - type: mrr_at_1000
292
+ value: 77.826
293
+ - type: mrr_at_3
294
+ value: 75.957
295
+ - type: mrr_at_5
296
+ value: 76.916
297
+ - type: ndcg_at_1
298
+ value: 69.44200000000001
299
+ - type: ndcg_at_10
300
+ value: 81.217
301
+ - type: ndcg_at_100
302
+ value: 82.45
303
+ - type: ndcg_at_1000
304
+ value: 82.636
305
+ - type: ndcg_at_3
306
+ value: 77.931
307
+ - type: ndcg_at_5
308
+ value: 79.655
309
+ - type: precision_at_1
310
+ value: 69.44200000000001
311
+ - type: precision_at_10
312
+ value: 9.357
313
+ - type: precision_at_100
314
+ value: 0.993
315
+ - type: precision_at_1000
316
+ value: 0.101
317
+ - type: precision_at_3
318
+ value: 28.1
319
+ - type: precision_at_5
320
+ value: 17.724
321
+ - type: recall_at_1
322
+ value: 69.178
323
+ - type: recall_at_10
324
+ value: 92.624
325
+ - type: recall_at_100
326
+ value: 98.209
327
+ - type: recall_at_1000
328
+ value: 99.684
329
+ - type: recall_at_3
330
+ value: 83.772
331
+ - type: recall_at_5
332
+ value: 87.882
333
+ - task:
334
+ type: Retrieval
335
+ dataset:
336
+ type: C-MTEB/DuRetrieval
337
+ name: MTEB DuRetrieval
338
+ config: default
339
+ split: dev
340
+ revision: None
341
+ metrics:
342
+ - type: map_at_1
343
+ value: 25.163999999999998
344
+ - type: map_at_10
345
+ value: 76.386
346
+ - type: map_at_100
347
+ value: 79.339
348
+ - type: map_at_1000
349
+ value: 79.39500000000001
350
+ - type: map_at_3
351
+ value: 52.959
352
+ - type: map_at_5
353
+ value: 66.59
354
+ - type: mrr_at_1
355
+ value: 87.9
356
+ - type: mrr_at_10
357
+ value: 91.682
358
+ - type: mrr_at_100
359
+ value: 91.747
360
+ - type: mrr_at_1000
361
+ value: 91.751
362
+ - type: mrr_at_3
363
+ value: 91.267
364
+ - type: mrr_at_5
365
+ value: 91.527
366
+ - type: ndcg_at_1
367
+ value: 87.9
368
+ - type: ndcg_at_10
369
+ value: 84.569
370
+ - type: ndcg_at_100
371
+ value: 87.83800000000001
372
+ - type: ndcg_at_1000
373
+ value: 88.322
374
+ - type: ndcg_at_3
375
+ value: 83.473
376
+ - type: ndcg_at_5
377
+ value: 82.178
378
+ - type: precision_at_1
379
+ value: 87.9
380
+ - type: precision_at_10
381
+ value: 40.605000000000004
382
+ - type: precision_at_100
383
+ value: 4.752
384
+ - type: precision_at_1000
385
+ value: 0.488
386
+ - type: precision_at_3
387
+ value: 74.9
388
+ - type: precision_at_5
389
+ value: 62.96000000000001
390
+ - type: recall_at_1
391
+ value: 25.163999999999998
392
+ - type: recall_at_10
393
+ value: 85.97399999999999
394
+ - type: recall_at_100
395
+ value: 96.63000000000001
396
+ - type: recall_at_1000
397
+ value: 99.016
398
+ - type: recall_at_3
399
+ value: 55.611999999999995
400
+ - type: recall_at_5
401
+ value: 71.936
402
+ - task:
403
+ type: Retrieval
404
+ dataset:
405
+ type: C-MTEB/EcomRetrieval
406
+ name: MTEB EcomRetrieval
407
+ config: default
408
+ split: dev
409
+ revision: None
410
+ metrics:
411
+ - type: map_at_1
412
+ value: 48.6
413
+ - type: map_at_10
414
+ value: 58.831
415
+ - type: map_at_100
416
+ value: 59.427
417
+ - type: map_at_1000
418
+ value: 59.44199999999999
419
+ - type: map_at_3
420
+ value: 56.383
421
+ - type: map_at_5
422
+ value: 57.753
423
+ - type: mrr_at_1
424
+ value: 48.6
425
+ - type: mrr_at_10
426
+ value: 58.831
427
+ - type: mrr_at_100
428
+ value: 59.427
429
+ - type: mrr_at_1000
430
+ value: 59.44199999999999
431
+ - type: mrr_at_3
432
+ value: 56.383
433
+ - type: mrr_at_5
434
+ value: 57.753
435
+ - type: ndcg_at_1
436
+ value: 48.6
437
+ - type: ndcg_at_10
438
+ value: 63.951
439
+ - type: ndcg_at_100
440
+ value: 66.72200000000001
441
+ - type: ndcg_at_1000
442
+ value: 67.13900000000001
443
+ - type: ndcg_at_3
444
+ value: 58.882
445
+ - type: ndcg_at_5
446
+ value: 61.373
447
+ - type: precision_at_1
448
+ value: 48.6
449
+ - type: precision_at_10
450
+ value: 8.01
451
+ - type: precision_at_100
452
+ value: 0.928
453
+ - type: precision_at_1000
454
+ value: 0.096
455
+ - type: precision_at_3
456
+ value: 22.033
457
+ - type: precision_at_5
458
+ value: 14.44
459
+ - type: recall_at_1
460
+ value: 48.6
461
+ - type: recall_at_10
462
+ value: 80.10000000000001
463
+ - type: recall_at_100
464
+ value: 92.80000000000001
465
+ - type: recall_at_1000
466
+ value: 96.1
467
+ - type: recall_at_3
468
+ value: 66.10000000000001
469
+ - type: recall_at_5
470
+ value: 72.2
471
+ - task:
472
+ type: Classification
473
+ dataset:
474
+ type: C-MTEB/IFlyTek-classification
475
+ name: MTEB IFlyTek
476
+ config: default
477
+ split: validation
478
+ revision: None
479
+ metrics:
480
+ - type: accuracy
481
+ value: 47.36437091188918
482
+ - type: f1
483
+ value: 36.60946954228577
484
+ - task:
485
+ type: Classification
486
+ dataset:
487
+ type: C-MTEB/JDReview-classification
488
+ name: MTEB JDReview
489
+ config: default
490
+ split: test
491
+ revision: None
492
+ metrics:
493
+ - type: accuracy
494
+ value: 79.5684803001876
495
+ - type: ap
496
+ value: 42.671935929201524
497
+ - type: f1
498
+ value: 73.31912729103752
499
+ - task:
500
+ type: STS
501
+ dataset:
502
+ type: C-MTEB/LCQMC
503
+ name: MTEB LCQMC
504
+ config: default
505
+ split: test
506
+ revision: None
507
+ metrics:
508
+ - type: cos_sim_pearson
509
+ value: 68.62670112113864
510
+ - type: cos_sim_spearman
511
+ value: 75.74009123170768
512
+ - type: euclidean_pearson
513
+ value: 73.93002595958237
514
+ - type: euclidean_spearman
515
+ value: 75.35222935003587
516
+ - type: manhattan_pearson
517
+ value: 73.89870445158144
518
+ - type: manhattan_spearman
519
+ value: 75.31714936339398
520
+ - task:
521
+ type: Reranking
522
+ dataset:
523
+ type: C-MTEB/Mmarco-reranking
524
+ name: MTEB MMarcoReranking
525
+ config: default
526
+ split: dev
527
+ revision: None
528
+ metrics:
529
+ - type: map
530
+ value: 31.5372713650176
531
+ - type: mrr
532
+ value: 30.163095238095238
533
+ - task:
534
+ type: Retrieval
535
+ dataset:
536
+ type: C-MTEB/MMarcoRetrieval
537
+ name: MTEB MMarcoRetrieval
538
+ config: default
539
+ split: dev
540
+ revision: None
541
+ metrics:
542
+ - type: map_at_1
543
+ value: 65.054
544
+ - type: map_at_10
545
+ value: 74.156
546
+ - type: map_at_100
547
+ value: 74.523
548
+ - type: map_at_1000
549
+ value: 74.535
550
+ - type: map_at_3
551
+ value: 72.269
552
+ - type: map_at_5
553
+ value: 73.41
554
+ - type: mrr_at_1
555
+ value: 67.24900000000001
556
+ - type: mrr_at_10
557
+ value: 74.78399999999999
558
+ - type: mrr_at_100
559
+ value: 75.107
560
+ - type: mrr_at_1000
561
+ value: 75.117
562
+ - type: mrr_at_3
563
+ value: 73.13499999999999
564
+ - type: mrr_at_5
565
+ value: 74.13499999999999
566
+ - type: ndcg_at_1
567
+ value: 67.24900000000001
568
+ - type: ndcg_at_10
569
+ value: 77.96300000000001
570
+ - type: ndcg_at_100
571
+ value: 79.584
572
+ - type: ndcg_at_1000
573
+ value: 79.884
574
+ - type: ndcg_at_3
575
+ value: 74.342
576
+ - type: ndcg_at_5
577
+ value: 76.278
578
+ - type: precision_at_1
579
+ value: 67.24900000000001
580
+ - type: precision_at_10
581
+ value: 9.466
582
+ - type: precision_at_100
583
+ value: 1.027
584
+ - type: precision_at_1000
585
+ value: 0.105
586
+ - type: precision_at_3
587
+ value: 27.955999999999996
588
+ - type: precision_at_5
589
+ value: 17.817
590
+ - type: recall_at_1
591
+ value: 65.054
592
+ - type: recall_at_10
593
+ value: 89.113
594
+ - type: recall_at_100
595
+ value: 96.369
596
+ - type: recall_at_1000
597
+ value: 98.714
598
+ - type: recall_at_3
599
+ value: 79.45400000000001
600
+ - type: recall_at_5
601
+ value: 84.06
602
+ - task:
603
+ type: Classification
604
+ dataset:
605
+ type: mteb/amazon_massive_intent
606
+ name: MTEB MassiveIntentClassification (zh-CN)
607
+ config: zh-CN
608
+ split: test
609
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
610
+ metrics:
611
+ - type: accuracy
612
+ value: 68.1977135171486
613
+ - type: f1
614
+ value: 67.23114308718404
615
+ - task:
616
+ type: Classification
617
+ dataset:
618
+ type: mteb/amazon_massive_scenario
619
+ name: MTEB MassiveScenarioClassification (zh-CN)
620
+ config: zh-CN
621
+ split: test
622
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
623
+ metrics:
624
+ - type: accuracy
625
+ value: 71.92669804976462
626
+ - type: f1
627
+ value: 72.90628475628779
628
+ - task:
629
+ type: Retrieval
630
+ dataset:
631
+ type: C-MTEB/MedicalRetrieval
632
+ name: MTEB MedicalRetrieval
633
+ config: default
634
+ split: dev
635
+ revision: None
636
+ metrics:
637
+ - type: map_at_1
638
+ value: 49.2
639
+ - type: map_at_10
640
+ value: 54.539
641
+ - type: map_at_100
642
+ value: 55.135
643
+ - type: map_at_1000
644
+ value: 55.19199999999999
645
+ - type: map_at_3
646
+ value: 53.383
647
+ - type: map_at_5
648
+ value: 54.142999999999994
649
+ - type: mrr_at_1
650
+ value: 49.2
651
+ - type: mrr_at_10
652
+ value: 54.539
653
+ - type: mrr_at_100
654
+ value: 55.135999999999996
655
+ - type: mrr_at_1000
656
+ value: 55.19199999999999
657
+ - type: mrr_at_3
658
+ value: 53.383
659
+ - type: mrr_at_5
660
+ value: 54.142999999999994
661
+ - type: ndcg_at_1
662
+ value: 49.2
663
+ - type: ndcg_at_10
664
+ value: 57.123000000000005
665
+ - type: ndcg_at_100
666
+ value: 60.21300000000001
667
+ - type: ndcg_at_1000
668
+ value: 61.915
669
+ - type: ndcg_at_3
670
+ value: 54.772
671
+ - type: ndcg_at_5
672
+ value: 56.157999999999994
673
+ - type: precision_at_1
674
+ value: 49.2
675
+ - type: precision_at_10
676
+ value: 6.52
677
+ - type: precision_at_100
678
+ value: 0.8009999999999999
679
+ - type: precision_at_1000
680
+ value: 0.094
681
+ - type: precision_at_3
682
+ value: 19.6
683
+ - type: precision_at_5
684
+ value: 12.44
685
+ - type: recall_at_1
686
+ value: 49.2
687
+ - type: recall_at_10
688
+ value: 65.2
689
+ - type: recall_at_100
690
+ value: 80.10000000000001
691
+ - type: recall_at_1000
692
+ value: 93.89999999999999
693
+ - type: recall_at_3
694
+ value: 58.8
695
+ - type: recall_at_5
696
+ value: 62.2
697
+ - task:
698
+ type: Classification
699
+ dataset:
700
+ type: C-MTEB/MultilingualSentiment-classification
701
+ name: MTEB MultilingualSentiment
702
+ config: default
703
+ split: validation
704
+ revision: None
705
+ metrics:
706
+ - type: accuracy
707
+ value: 63.29333333333334
708
+ - type: f1
709
+ value: 63.03293854259612
710
+ - task:
711
+ type: PairClassification
712
+ dataset:
713
+ type: C-MTEB/OCNLI
714
+ name: MTEB Ocnli
715
+ config: default
716
+ split: validation
717
+ revision: None
718
+ metrics:
719
+ - type: cos_sim_accuracy
720
+ value: 75.69030860855442
721
+ - type: cos_sim_ap
722
+ value: 80.6157833772759
723
+ - type: cos_sim_f1
724
+ value: 77.87524366471735
725
+ - type: cos_sim_precision
726
+ value: 72.3076923076923
727
+ - type: cos_sim_recall
728
+ value: 84.37170010559663
729
+ - type: dot_accuracy
730
+ value: 67.78559826746074
731
+ - type: dot_ap
732
+ value: 72.00871467527499
733
+ - type: dot_f1
734
+ value: 72.58722247394654
735
+ - type: dot_precision
736
+ value: 63.57142857142857
737
+ - type: dot_recall
738
+ value: 84.58289334741288
739
+ - type: euclidean_accuracy
740
+ value: 75.20303194369248
741
+ - type: euclidean_ap
742
+ value: 80.98587256415605
743
+ - type: euclidean_f1
744
+ value: 77.26396917148362
745
+ - type: euclidean_precision
746
+ value: 71.03631532329496
747
+ - type: euclidean_recall
748
+ value: 84.68848996832101
749
+ - type: manhattan_accuracy
750
+ value: 75.20303194369248
751
+ - type: manhattan_ap
752
+ value: 80.93460699513219
753
+ - type: manhattan_f1
754
+ value: 77.124773960217
755
+ - type: manhattan_precision
756
+ value: 67.43083003952569
757
+ - type: manhattan_recall
758
+ value: 90.07391763463569
759
+ - type: max_accuracy
760
+ value: 75.69030860855442
761
+ - type: max_ap
762
+ value: 80.98587256415605
763
+ - type: max_f1
764
+ value: 77.87524366471735
765
+ - task:
766
+ type: Classification
767
+ dataset:
768
+ type: C-MTEB/OnlineShopping-classification
769
+ name: MTEB OnlineShopping
770
+ config: default
771
+ split: test
772
+ revision: None
773
+ metrics:
774
+ - type: accuracy
775
+ value: 87.00000000000001
776
+ - type: ap
777
+ value: 83.24372135949511
778
+ - type: f1
779
+ value: 86.95554191530607
780
+ - task:
781
+ type: STS
782
+ dataset:
783
+ type: C-MTEB/PAWSX
784
+ name: MTEB PAWSX
785
+ config: default
786
+ split: test
787
+ revision: None
788
+ metrics:
789
+ - type: cos_sim_pearson
790
+ value: 37.57616811591219
791
+ - type: cos_sim_spearman
792
+ value: 41.490259084930045
793
+ - type: euclidean_pearson
794
+ value: 38.9155043692188
795
+ - type: euclidean_spearman
796
+ value: 39.16056534305623
797
+ - type: manhattan_pearson
798
+ value: 38.76569892264335
799
+ - type: manhattan_spearman
800
+ value: 38.99891685590743
801
+ - task:
802
+ type: STS
803
+ dataset:
804
+ type: C-MTEB/QBQTC
805
+ name: MTEB QBQTC
806
+ config: default
807
+ split: test
808
+ revision: None
809
+ metrics:
810
+ - type: cos_sim_pearson
811
+ value: 35.44858610359665
812
+ - type: cos_sim_spearman
813
+ value: 38.11128146262466
814
+ - type: euclidean_pearson
815
+ value: 31.928644189822457
816
+ - type: euclidean_spearman
817
+ value: 34.384936631696554
818
+ - type: manhattan_pearson
819
+ value: 31.90586687414376
820
+ - type: manhattan_spearman
821
+ value: 34.35770153777186
822
+ - task:
823
+ type: STS
824
+ dataset:
825
+ type: mteb/sts22-crosslingual-sts
826
+ name: MTEB STS22 (zh)
827
+ config: zh
828
+ split: test
829
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
830
+ metrics:
831
+ - type: cos_sim_pearson
832
+ value: 66.54931957553592
833
+ - type: cos_sim_spearman
834
+ value: 69.25068863016632
835
+ - type: euclidean_pearson
836
+ value: 50.26525596106869
837
+ - type: euclidean_spearman
838
+ value: 63.83352741910006
839
+ - type: manhattan_pearson
840
+ value: 49.98798282198196
841
+ - type: manhattan_spearman
842
+ value: 63.87649521907841
843
+ - task:
844
+ type: STS
845
+ dataset:
846
+ type: C-MTEB/STSB
847
+ name: MTEB STSB
848
+ config: default
849
+ split: test
850
+ revision: None
851
+ metrics:
852
+ - type: cos_sim_pearson
853
+ value: 82.52782476625825
854
+ - type: cos_sim_spearman
855
+ value: 82.55618986168398
856
+ - type: euclidean_pearson
857
+ value: 78.48190631687673
858
+ - type: euclidean_spearman
859
+ value: 78.39479731354655
860
+ - type: manhattan_pearson
861
+ value: 78.51176592165885
862
+ - type: manhattan_spearman
863
+ value: 78.42363787303265
864
+ - task:
865
+ type: Reranking
866
+ dataset:
867
+ type: C-MTEB/T2Reranking
868
+ name: MTEB T2Reranking
869
+ config: default
870
+ split: dev
871
+ revision: None
872
+ metrics:
873
+ - type: map
874
+ value: 67.36693873615643
875
+ - type: mrr
876
+ value: 77.83847701797939
877
+ - task:
878
+ type: Retrieval
879
+ dataset:
880
+ type: C-MTEB/T2Retrieval
881
+ name: MTEB T2Retrieval
882
+ config: default
883
+ split: dev
884
+ revision: None
885
+ metrics:
886
+ - type: map_at_1
887
+ value: 25.795
888
+ - type: map_at_10
889
+ value: 72.258
890
+ - type: map_at_100
891
+ value: 76.049
892
+ - type: map_at_1000
893
+ value: 76.134
894
+ - type: map_at_3
895
+ value: 50.697
896
+ - type: map_at_5
897
+ value: 62.324999999999996
898
+ - type: mrr_at_1
899
+ value: 86.634
900
+ - type: mrr_at_10
901
+ value: 89.792
902
+ - type: mrr_at_100
903
+ value: 89.91900000000001
904
+ - type: mrr_at_1000
905
+ value: 89.923
906
+ - type: mrr_at_3
907
+ value: 89.224
908
+ - type: mrr_at_5
909
+ value: 89.608
910
+ - type: ndcg_at_1
911
+ value: 86.634
912
+ - type: ndcg_at_10
913
+ value: 80.589
914
+ - type: ndcg_at_100
915
+ value: 84.812
916
+ - type: ndcg_at_1000
917
+ value: 85.662
918
+ - type: ndcg_at_3
919
+ value: 82.169
920
+ - type: ndcg_at_5
921
+ value: 80.619
922
+ - type: precision_at_1
923
+ value: 86.634
924
+ - type: precision_at_10
925
+ value: 40.389
926
+ - type: precision_at_100
927
+ value: 4.93
928
+ - type: precision_at_1000
929
+ value: 0.513
930
+ - type: precision_at_3
931
+ value: 72.104
932
+ - type: precision_at_5
933
+ value: 60.425
934
+ - type: recall_at_1
935
+ value: 25.795
936
+ - type: recall_at_10
937
+ value: 79.565
938
+ - type: recall_at_100
939
+ value: 93.24799999999999
940
+ - type: recall_at_1000
941
+ value: 97.595
942
+ - type: recall_at_3
943
+ value: 52.583999999999996
944
+ - type: recall_at_5
945
+ value: 66.175
946
+ - task:
947
+ type: Classification
948
+ dataset:
949
+ type: C-MTEB/TNews-classification
950
+ name: MTEB TNews
951
+ config: default
952
+ split: validation
953
+ revision: None
954
+ metrics:
955
+ - type: accuracy
956
+ value: 47.648999999999994
957
+ - type: f1
958
+ value: 46.28925837008413
959
+ - task:
960
+ type: Clustering
961
+ dataset:
962
+ type: C-MTEB/ThuNewsClusteringP2P
963
+ name: MTEB ThuNewsClusteringP2P
964
+ config: default
965
+ split: test
966
+ revision: None
967
+ metrics:
968
+ - type: v_measure
969
+ value: 54.07641891287953
970
+ - task:
971
+ type: Clustering
972
+ dataset:
973
+ type: C-MTEB/ThuNewsClusteringS2S
974
+ name: MTEB ThuNewsClusteringS2S
975
+ config: default
976
+ split: test
977
+ revision: None
978
+ metrics:
979
+ - type: v_measure
980
+ value: 53.423702062353954
981
+ - task:
982
+ type: Retrieval
983
+ dataset:
984
+ type: C-MTEB/VideoRetrieval
985
+ name: MTEB VideoRetrieval
986
+ config: default
987
+ split: dev
988
+ revision: None
989
+ metrics:
990
+ - type: map_at_1
991
+ value: 55.7
992
+ - type: map_at_10
993
+ value: 65.923
994
+ - type: map_at_100
995
+ value: 66.42
996
+ - type: map_at_1000
997
+ value: 66.431
998
+ - type: map_at_3
999
+ value: 63.9
1000
+ - type: map_at_5
1001
+ value: 65.225
1002
+ - type: mrr_at_1
1003
+ value: 55.60000000000001
1004
+ - type: mrr_at_10
1005
+ value: 65.873
1006
+ - type: mrr_at_100
1007
+ value: 66.36999999999999
1008
+ - type: mrr_at_1000
1009
+ value: 66.381
1010
+ - type: mrr_at_3
1011
+ value: 63.849999999999994
1012
+ - type: mrr_at_5
1013
+ value: 65.17500000000001
1014
+ - type: ndcg_at_1
1015
+ value: 55.7
1016
+ - type: ndcg_at_10
1017
+ value: 70.621
1018
+ - type: ndcg_at_100
1019
+ value: 72.944
1020
+ - type: ndcg_at_1000
1021
+ value: 73.25399999999999
1022
+ - type: ndcg_at_3
1023
+ value: 66.547
1024
+ - type: ndcg_at_5
1025
+ value: 68.93599999999999
1026
+ - type: precision_at_1
1027
+ value: 55.7
1028
+ - type: precision_at_10
1029
+ value: 8.52
1030
+ - type: precision_at_100
1031
+ value: 0.958
1032
+ - type: precision_at_1000
1033
+ value: 0.098
1034
+ - type: precision_at_3
1035
+ value: 24.733
1036
+ - type: precision_at_5
1037
+ value: 16
1038
+ - type: recall_at_1
1039
+ value: 55.7
1040
+ - type: recall_at_10
1041
+ value: 85.2
1042
+ - type: recall_at_100
1043
+ value: 95.8
1044
+ - type: recall_at_1000
1045
+ value: 98.3
1046
+ - type: recall_at_3
1047
+ value: 74.2
1048
+ - type: recall_at_5
1049
+ value: 80
1050
+ - task:
1051
+ type: Classification
1052
+ dataset:
1053
+ type: C-MTEB/waimai-classification
1054
+ name: MTEB Waimai
1055
+ config: default
1056
+ split: test
1057
+ revision: None
1058
+ metrics:
1059
+ - type: accuracy
1060
+ value: 84.54
1061
+ - type: ap
1062
+ value: 66.13603199670062
1063
+ - type: f1
1064
+ value: 82.61420654584116
1065
  ---
1066
+
1067
+ <!-- TODO: add evaluation results here -->
1068
+ <br><br>
1069
+
1070
+ <p align="center">
1071
+ <img src="https://github.com/jina-ai/finetuner/blob/main/docs/_static/finetuner-logo-ani.svg?raw=true" alt="Finetuner logo: Finetuner helps you to create experiments in order to improve embeddings on search tasks. It accompanies you to deliver the last mile of performance-tuning for neural search applications." width="150px">
1072
+ </p>
1073
+
1074
+
1075
+ <p align="center">
1076
+ <b>The text embedding set trained by <a href="https://jina.ai/"><b>Jina AI</b></a>, <a href="https://github.com/jina-ai/finetuner"><b>Finetuner</b></a> team.</b>
1077
+ </p>
1078
+
1079
+
1080
+ ## Intended Usage & Model Info
1081
+
1082
+ `jina-embeddings-v2-base-zh` is a Chinese/English bilingual text **embedding model** supporting a **sequence length of up to 8192 tokens**. The model has 161 million parameters.
1083
+ We have designed it for high performance in cross-language applications and trained it specifically to support mixed Chinese-English input without bias.
1084
+
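Embeddings from a model like this are usually compared with cosine similarity, which is also the first metric family reported in the evaluation metadata (`cos_sim_*`). The sketch below is illustrative only: the function name `cosine_similarity` is hypothetical, and the vectors are toy values, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings (NOT real model output).
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate unrelated directions.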
1085
+
1086
+ You can use the embedding model via Jina AI's [Embedding platform](https://jina.ai/embeddings/), AWS SageMaker, or in your private deployments.
1087
+
1088
+ ## Usage: Jina Embedding API
1089
+
1090
+ The following snippet calls the Jina Embedding API:
1091
+ ```
1092
+ curl https://api.jina.ai/v1/embeddings \
1093
+ -H "Content-Type: application/json" \
1094
+ -H "Authorization: Bearer jina_xxxxxxx" \
1095
+ -d '{
1096
+ "input": ["你的输入可以是纯中文", "or purely in English", "or like mixture of 中文 and 英文"],
1097
+ "model": "jina-embeddings-v2-base-zh"
1098
+ }'
1099
+
1100
+ ```
1101
+
1102
+ Get your free API key at https://jina.ai/embeddings/
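The same request can be issued from Python. This is a minimal sketch using only the standard library; the endpoint, headers, and model name are taken from the curl example, while the helper names `build_request` and `embed` are hypothetical. The response shape read in `embed` (a `data` list whose items carry an `embedding` field) is an assumption and should be checked against the API's actual output.

```python
import json
import urllib.request

API_URL = "https://api.jina.ai/v1/embeddings"  # endpoint from the curl example
MODEL = "jina-embeddings-v2-base-zh"


def build_request(texts, api_key):
    """Build the same POST request as the curl example; nothing is sent yet."""
    payload = json.dumps({"input": list(texts), "model": MODEL}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")


def embed(texts, api_key):
    """Send the request and return one vector per input text.

    ASSUMPTION: the response JSON looks like
    {"data": [{"embedding": [...]}, ...]}; adjust if the API differs.
    """
    with urllib.request.urlopen(build_request(texts, api_key)) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```

Calling `embed(["你的输入可以是纯中文", "or purely in English"], "jina_xxxxxxx")` with a valid key in place of the placeholder would send the request. `build_request` is kept separate so the payload can be inspected without network access.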
1103
+
1104
+ ## Open Source
1105
+
1106
+ We will open-source the full model in a few days!
1107
+
1108
+ ## Contact
1109
+
1110
+ Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.