metadata
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
model-index:
- name: cai-stellaris-text-embeddings
results:
- task:
type: Classification
dataset:
type: mteb/amazon_counterfactual
name: MTEB AmazonCounterfactualClassification (en)
config: en
split: test
revision: e8379541af4e31359cca9fbcf4b00f2671dba205
metrics:
- type: accuracy
value: 64.86567164179104
- type: ap
value: 28.30760041689409
- type: f1
value: 59.08589995918376
- task:
type: Classification
dataset:
type: mteb/amazon_polarity
name: MTEB AmazonPolarityClassification
config: default
split: test
revision: e2d317d38cd51312af73b3d32a06d1a08b442046
metrics:
- type: accuracy
value: 65.168625
- type: ap
value: 60.131922961382166
- type: f1
value: 65.02463910192814
- task:
type: Classification
dataset:
type: mteb/amazon_reviews_multi
name: MTEB AmazonReviewsClassification (en)
config: en
split: test
revision: 1399c76144fd37290681b995c656ef9b2e06e26d
metrics:
- type: accuracy
value: 31.016
- type: f1
value: 30.501226228002924
- task:
type: Retrieval
dataset:
type: arguana
name: MTEB ArguAna
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 24.609
- type: map_at_10
value: 38.793
- type: map_at_100
value: 40.074
- type: map_at_1000
value: 40.083
- type: map_at_3
value: 33.736
- type: map_at_5
value: 36.642
- type: mrr_at_1
value: 25.533
- type: mrr_at_10
value: 39.129999999999995
- type: mrr_at_100
value: 40.411
- type: mrr_at_1000
value: 40.42
- type: mrr_at_3
value: 34.033
- type: mrr_at_5
value: 36.956
- type: ndcg_at_1
value: 24.609
- type: ndcg_at_10
value: 47.288000000000004
- type: ndcg_at_100
value: 52.654999999999994
- type: ndcg_at_1000
value: 52.88699999999999
- type: ndcg_at_3
value: 36.86
- type: ndcg_at_5
value: 42.085
- type: precision_at_1
value: 24.609
- type: precision_at_10
value: 7.468
- type: precision_at_100
value: 0.979
- type: precision_at_1000
value: 0.1
- type: precision_at_3
value: 15.315000000000001
- type: precision_at_5
value: 11.721
- type: recall_at_1
value: 24.609
- type: recall_at_10
value: 74.68
- type: recall_at_100
value: 97.866
- type: recall_at_1000
value: 99.644
- type: recall_at_3
value: 45.946
- type: recall_at_5
value: 58.606
- task:
type: Clustering
dataset:
type: mteb/arxiv-clustering-p2p
name: MTEB ArxivClusteringP2P
config: default
split: test
revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
metrics:
- type: v_measure
value: 42.014046191286525
- task:
type: Clustering
dataset:
type: mteb/arxiv-clustering-s2s
name: MTEB ArxivClusteringS2S
config: default
split: test
revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
metrics:
- type: v_measure
value: 31.406159641263052
- task:
type: Reranking
dataset:
type: mteb/askubuntudupquestions-reranking
name: MTEB AskUbuntuDupQuestions
config: default
split: test
revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
metrics:
- type: map
value: 60.35266033223575
- type: mrr
value: 72.66796376907179
- task:
type: Classification
dataset:
type: mteb/banking77
name: MTEB Banking77Classification
config: default
split: test
revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
metrics:
- type: accuracy
value: 74.12337662337661
- type: f1
value: 73.12122145084057
- task:
type: Clustering
dataset:
type: mteb/biorxiv-clustering-p2p
name: MTEB BiorxivClusteringP2P
config: default
split: test
revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
metrics:
- type: v_measure
value: 34.72513663347855
- task:
type: Clustering
dataset:
type: mteb/biorxiv-clustering-s2s
name: MTEB BiorxivClusteringS2S
config: default
split: test
revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
metrics:
- type: v_measure
value: 29.280150859689826
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackAndroidRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 21.787
- type: map_at_10
value: 30.409000000000002
- type: map_at_100
value: 31.947
- type: map_at_1000
value: 32.09
- type: map_at_3
value: 27.214
- type: map_at_5
value: 28.810999999999996
- type: mrr_at_1
value: 27.039
- type: mrr_at_10
value: 35.581
- type: mrr_at_100
value: 36.584
- type: mrr_at_1000
value: 36.645
- type: mrr_at_3
value: 32.713
- type: mrr_at_5
value: 34.272999999999996
- type: ndcg_at_1
value: 27.039
- type: ndcg_at_10
value: 36.157000000000004
- type: ndcg_at_100
value: 42.598
- type: ndcg_at_1000
value: 45.207
- type: ndcg_at_3
value: 30.907
- type: ndcg_at_5
value: 33.068
- type: precision_at_1
value: 27.039
- type: precision_at_10
value: 7.295999999999999
- type: precision_at_100
value: 1.303
- type: precision_at_1000
value: 0.186
- type: precision_at_3
value: 14.926
- type: precision_at_5
value: 11.044
- type: recall_at_1
value: 21.787
- type: recall_at_10
value: 47.693999999999996
- type: recall_at_100
value: 75.848
- type: recall_at_1000
value: 92.713
- type: recall_at_3
value: 32.92
- type: recall_at_5
value: 38.794000000000004
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackEnglishRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 24.560000000000002
- type: map_at_10
value: 34.756
- type: map_at_100
value: 36.169000000000004
- type: map_at_1000
value: 36.298
- type: map_at_3
value: 31.592
- type: map_at_5
value: 33.426
- type: mrr_at_1
value: 31.274
- type: mrr_at_10
value: 40.328
- type: mrr_at_100
value: 41.125
- type: mrr_at_1000
value: 41.171
- type: mrr_at_3
value: 37.866
- type: mrr_at_5
value: 39.299
- type: ndcg_at_1
value: 31.338
- type: ndcg_at_10
value: 40.696
- type: ndcg_at_100
value: 45.922000000000004
- type: ndcg_at_1000
value: 47.982
- type: ndcg_at_3
value: 36.116
- type: ndcg_at_5
value: 38.324000000000005
- type: precision_at_1
value: 31.338
- type: precision_at_10
value: 8.083
- type: precision_at_100
value: 1.4040000000000001
- type: precision_at_1000
value: 0.189
- type: precision_at_3
value: 18.089
- type: precision_at_5
value: 13.159
- type: recall_at_1
value: 24.560000000000002
- type: recall_at_10
value: 51.832
- type: recall_at_100
value: 74.26899999999999
- type: recall_at_1000
value: 87.331
- type: recall_at_3
value: 38.086999999999996
- type: recall_at_5
value: 44.294
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackGamingRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 27.256999999999998
- type: map_at_10
value: 38.805
- type: map_at_100
value: 40.04
- type: map_at_1000
value: 40.117000000000004
- type: map_at_3
value: 35.425000000000004
- type: map_at_5
value: 37.317
- type: mrr_at_1
value: 31.912000000000003
- type: mrr_at_10
value: 42.045
- type: mrr_at_100
value: 42.956
- type: mrr_at_1000
value: 43.004
- type: mrr_at_3
value: 39.195
- type: mrr_at_5
value: 40.866
- type: ndcg_at_1
value: 31.912000000000003
- type: ndcg_at_10
value: 44.826
- type: ndcg_at_100
value: 49.85
- type: ndcg_at_1000
value: 51.562
- type: ndcg_at_3
value: 38.845
- type: ndcg_at_5
value: 41.719
- type: precision_at_1
value: 31.912000000000003
- type: precision_at_10
value: 7.768
- type: precision_at_100
value: 1.115
- type: precision_at_1000
value: 0.131
- type: precision_at_3
value: 18.015
- type: precision_at_5
value: 12.814999999999998
- type: recall_at_1
value: 27.256999999999998
- type: recall_at_10
value: 59.611999999999995
- type: recall_at_100
value: 81.324
- type: recall_at_1000
value: 93.801
- type: recall_at_3
value: 43.589
- type: recall_at_5
value: 50.589
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackGisRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 15.588
- type: map_at_10
value: 22.936999999999998
- type: map_at_100
value: 24.015
- type: map_at_1000
value: 24.127000000000002
- type: map_at_3
value: 20.47
- type: map_at_5
value: 21.799
- type: mrr_at_1
value: 16.723
- type: mrr_at_10
value: 24.448
- type: mrr_at_100
value: 25.482
- type: mrr_at_1000
value: 25.568999999999996
- type: mrr_at_3
value: 21.94
- type: mrr_at_5
value: 23.386000000000003
- type: ndcg_at_1
value: 16.723
- type: ndcg_at_10
value: 27.451999999999998
- type: ndcg_at_100
value: 33.182
- type: ndcg_at_1000
value: 36.193999999999996
- type: ndcg_at_3
value: 22.545
- type: ndcg_at_5
value: 24.837
- type: precision_at_1
value: 16.723
- type: precision_at_10
value: 4.5760000000000005
- type: precision_at_100
value: 0.7929999999999999
- type: precision_at_1000
value: 0.11
- type: precision_at_3
value: 9.944
- type: precision_at_5
value: 7.321999999999999
- type: recall_at_1
value: 15.588
- type: recall_at_10
value: 40.039
- type: recall_at_100
value: 67.17699999999999
- type: recall_at_1000
value: 90.181
- type: recall_at_3
value: 26.663999999999998
- type: recall_at_5
value: 32.144
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackMathematicaRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 12.142999999999999
- type: map_at_10
value: 18.355
- type: map_at_100
value: 19.611
- type: map_at_1000
value: 19.750999999999998
- type: map_at_3
value: 16.073999999999998
- type: map_at_5
value: 17.187
- type: mrr_at_1
value: 15.547
- type: mrr_at_10
value: 22.615
- type: mrr_at_100
value: 23.671
- type: mrr_at_1000
value: 23.759
- type: mrr_at_3
value: 20.149
- type: mrr_at_5
value: 21.437
- type: ndcg_at_1
value: 15.547
- type: ndcg_at_10
value: 22.985
- type: ndcg_at_100
value: 29.192
- type: ndcg_at_1000
value: 32.448
- type: ndcg_at_3
value: 18.503
- type: ndcg_at_5
value: 20.322000000000003
- type: precision_at_1
value: 15.547
- type: precision_at_10
value: 4.49
- type: precision_at_100
value: 0.8840000000000001
- type: precision_at_1000
value: 0.129
- type: precision_at_3
value: 8.872
- type: precision_at_5
value: 6.741
- type: recall_at_1
value: 12.142999999999999
- type: recall_at_10
value: 33.271
- type: recall_at_100
value: 60.95399999999999
- type: recall_at_1000
value: 83.963
- type: recall_at_3
value: 20.645
- type: recall_at_5
value: 25.34
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackPhysicsRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 22.09
- type: map_at_10
value: 30.220000000000002
- type: map_at_100
value: 31.741999999999997
- type: map_at_1000
value: 31.878
- type: map_at_3
value: 27.455000000000002
- type: map_at_5
value: 28.808
- type: mrr_at_1
value: 27.718999999999998
- type: mrr_at_10
value: 35.476
- type: mrr_at_100
value: 36.53
- type: mrr_at_1000
value: 36.602000000000004
- type: mrr_at_3
value: 33.157
- type: mrr_at_5
value: 34.36
- type: ndcg_at_1
value: 27.718999999999998
- type: ndcg_at_10
value: 35.547000000000004
- type: ndcg_at_100
value: 42.079
- type: ndcg_at_1000
value: 44.861000000000004
- type: ndcg_at_3
value: 30.932
- type: ndcg_at_5
value: 32.748
- type: precision_at_1
value: 27.718999999999998
- type: precision_at_10
value: 6.795
- type: precision_at_100
value: 1.194
- type: precision_at_1000
value: 0.163
- type: precision_at_3
value: 14.758
- type: precision_at_5
value: 10.549
- type: recall_at_1
value: 22.09
- type: recall_at_10
value: 46.357
- type: recall_at_100
value: 74.002
- type: recall_at_1000
value: 92.99199999999999
- type: recall_at_3
value: 33.138
- type: recall_at_5
value: 38.034
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackProgrammersRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 16.904
- type: map_at_10
value: 25.075999999999997
- type: map_at_100
value: 26.400000000000002
- type: map_at_1000
value: 26.525
- type: map_at_3
value: 22.191
- type: map_at_5
value: 23.947
- type: mrr_at_1
value: 21.461
- type: mrr_at_10
value: 29.614
- type: mrr_at_100
value: 30.602
- type: mrr_at_1000
value: 30.677
- type: mrr_at_3
value: 27.017000000000003
- type: mrr_at_5
value: 28.626
- type: ndcg_at_1
value: 21.461
- type: ndcg_at_10
value: 30.304
- type: ndcg_at_100
value: 36.521
- type: ndcg_at_1000
value: 39.366
- type: ndcg_at_3
value: 25.267
- type: ndcg_at_5
value: 27.918
- type: precision_at_1
value: 21.461
- type: precision_at_10
value: 5.868
- type: precision_at_100
value: 1.072
- type: precision_at_1000
value: 0.151
- type: precision_at_3
value: 12.291
- type: precision_at_5
value: 9.429
- type: recall_at_1
value: 16.904
- type: recall_at_10
value: 41.521
- type: recall_at_100
value: 68.919
- type: recall_at_1000
value: 88.852
- type: recall_at_3
value: 27.733999999999998
- type: recall_at_5
value: 34.439
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 18.327916666666667
- type: map_at_10
value: 26.068
- type: map_at_100
value: 27.358833333333333
- type: map_at_1000
value: 27.491583333333335
- type: map_at_3
value: 23.45508333333333
- type: map_at_5
value: 24.857916666666664
- type: mrr_at_1
value: 22.05066666666667
- type: mrr_at_10
value: 29.805083333333332
- type: mrr_at_100
value: 30.80283333333333
- type: mrr_at_1000
value: 30.876166666666666
- type: mrr_at_3
value: 27.381083333333333
- type: mrr_at_5
value: 28.72441666666667
- type: ndcg_at_1
value: 22.056000000000004
- type: ndcg_at_10
value: 31.029416666666666
- type: ndcg_at_100
value: 36.90174999999999
- type: ndcg_at_1000
value: 39.716249999999995
- type: ndcg_at_3
value: 26.35533333333333
- type: ndcg_at_5
value: 28.471500000000006
- type: precision_at_1
value: 22.056000000000004
- type: precision_at_10
value: 5.7645833333333325
- type: precision_at_100
value: 1.0406666666666666
- type: precision_at_1000
value: 0.14850000000000002
- type: precision_at_3
value: 12.391416666666666
- type: precision_at_5
value: 9.112499999999999
- type: recall_at_1
value: 18.327916666666667
- type: recall_at_10
value: 42.15083333333333
- type: recall_at_100
value: 68.38666666666666
- type: recall_at_1000
value: 88.24183333333333
- type: recall_at_3
value: 29.094416666666667
- type: recall_at_5
value: 34.48716666666666
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackStatsRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 15.009
- type: map_at_10
value: 21.251
- type: map_at_100
value: 22.337
- type: map_at_1000
value: 22.455
- type: map_at_3
value: 19.241
- type: map_at_5
value: 20.381
- type: mrr_at_1
value: 17.638
- type: mrr_at_10
value: 24.184
- type: mrr_at_100
value: 25.156
- type: mrr_at_1000
value: 25.239
- type: mrr_at_3
value: 22.29
- type: mrr_at_5
value: 23.363999999999997
- type: ndcg_at_1
value: 17.638
- type: ndcg_at_10
value: 25.269000000000002
- type: ndcg_at_100
value: 30.781999999999996
- type: ndcg_at_1000
value: 33.757
- type: ndcg_at_3
value: 21.457
- type: ndcg_at_5
value: 23.293
- type: precision_at_1
value: 17.638
- type: precision_at_10
value: 4.294
- type: precision_at_100
value: 0.771
- type: precision_at_1000
value: 0.11100000000000002
- type: precision_at_3
value: 9.815999999999999
- type: precision_at_5
value: 7.086
- type: recall_at_1
value: 15.009
- type: recall_at_10
value: 35.014
- type: recall_at_100
value: 60.45399999999999
- type: recall_at_1000
value: 82.416
- type: recall_at_3
value: 24.131
- type: recall_at_5
value: 28.846
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackTexRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 12.518
- type: map_at_10
value: 18.226
- type: map_at_100
value: 19.355
- type: map_at_1000
value: 19.496
- type: map_at_3
value: 16.243
- type: map_at_5
value: 17.288999999999998
- type: mrr_at_1
value: 15.382000000000001
- type: mrr_at_10
value: 21.559
- type: mrr_at_100
value: 22.587
- type: mrr_at_1000
value: 22.677
- type: mrr_at_3
value: 19.597
- type: mrr_at_5
value: 20.585
- type: ndcg_at_1
value: 15.382000000000001
- type: ndcg_at_10
value: 22.198
- type: ndcg_at_100
value: 27.860000000000003
- type: ndcg_at_1000
value: 31.302999999999997
- type: ndcg_at_3
value: 18.541
- type: ndcg_at_5
value: 20.089000000000002
- type: precision_at_1
value: 15.382000000000001
- type: precision_at_10
value: 4.178
- type: precision_at_100
value: 0.8380000000000001
- type: precision_at_1000
value: 0.132
- type: precision_at_3
value: 8.866999999999999
- type: precision_at_5
value: 6.476
- type: recall_at_1
value: 12.518
- type: recall_at_10
value: 31.036
- type: recall_at_100
value: 56.727000000000004
- type: recall_at_1000
value: 81.66799999999999
- type: recall_at_3
value: 20.610999999999997
- type: recall_at_5
value: 24.744
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackUnixRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 18.357
- type: map_at_10
value: 25.384
- type: map_at_100
value: 26.640000000000004
- type: map_at_1000
value: 26.762999999999998
- type: map_at_3
value: 22.863
- type: map_at_5
value: 24.197
- type: mrr_at_1
value: 21.735
- type: mrr_at_10
value: 29.069
- type: mrr_at_100
value: 30.119
- type: mrr_at_1000
value: 30.194
- type: mrr_at_3
value: 26.663999999999998
- type: mrr_at_5
value: 27.904
- type: ndcg_at_1
value: 21.735
- type: ndcg_at_10
value: 30.153999999999996
- type: ndcg_at_100
value: 36.262
- type: ndcg_at_1000
value: 39.206
- type: ndcg_at_3
value: 25.365
- type: ndcg_at_5
value: 27.403
- type: precision_at_1
value: 21.735
- type: precision_at_10
value: 5.354
- type: precision_at_100
value: 0.958
- type: precision_at_1000
value: 0.134
- type: precision_at_3
value: 11.567
- type: precision_at_5
value: 8.469999999999999
- type: recall_at_1
value: 18.357
- type: recall_at_10
value: 41.205000000000005
- type: recall_at_100
value: 68.30000000000001
- type: recall_at_1000
value: 89.294
- type: recall_at_3
value: 27.969
- type: recall_at_5
value: 32.989000000000004
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackWebmastersRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 18.226
- type: map_at_10
value: 25.766
- type: map_at_100
value: 27.345000000000002
- type: map_at_1000
value: 27.575
- type: map_at_3
value: 22.945999999999998
- type: map_at_5
value: 24.383
- type: mrr_at_1
value: 21.542
- type: mrr_at_10
value: 29.448
- type: mrr_at_100
value: 30.509999999999998
- type: mrr_at_1000
value: 30.575000000000003
- type: mrr_at_3
value: 26.482
- type: mrr_at_5
value: 28.072999999999997
- type: ndcg_at_1
value: 21.542
- type: ndcg_at_10
value: 31.392999999999997
- type: ndcg_at_100
value: 37.589
- type: ndcg_at_1000
value: 40.717
- type: ndcg_at_3
value: 26.179000000000002
- type: ndcg_at_5
value: 28.557
- type: precision_at_1
value: 21.542
- type: precision_at_10
value: 6.462
- type: precision_at_100
value: 1.415
- type: precision_at_1000
value: 0.234
- type: precision_at_3
value: 12.187000000000001
- type: precision_at_5
value: 9.605
- type: recall_at_1
value: 18.226
- type: recall_at_10
value: 42.853
- type: recall_at_100
value: 70.97200000000001
- type: recall_at_1000
value: 91.662
- type: recall_at_3
value: 28.555999999999997
- type: recall_at_5
value: 34.203
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackWordpressRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 15.495999999999999
- type: map_at_10
value: 21.631
- type: map_at_100
value: 22.705000000000002
- type: map_at_1000
value: 22.823999999999998
- type: map_at_3
value: 19.747
- type: map_at_5
value: 20.75
- type: mrr_at_1
value: 16.636
- type: mrr_at_10
value: 23.294
- type: mrr_at_100
value: 24.312
- type: mrr_at_1000
value: 24.401999999999997
- type: mrr_at_3
value: 21.503
- type: mrr_at_5
value: 22.52
- type: ndcg_at_1
value: 16.636
- type: ndcg_at_10
value: 25.372
- type: ndcg_at_100
value: 30.984
- type: ndcg_at_1000
value: 33.992
- type: ndcg_at_3
value: 21.607000000000003
- type: ndcg_at_5
value: 23.380000000000003
- type: precision_at_1
value: 16.636
- type: precision_at_10
value: 4.011
- type: precision_at_100
value: 0.741
- type: precision_at_1000
value: 0.11199999999999999
- type: precision_at_3
value: 9.365
- type: precision_at_5
value: 6.654
- type: recall_at_1
value: 15.495999999999999
- type: recall_at_10
value: 35.376000000000005
- type: recall_at_100
value: 61.694
- type: recall_at_1000
value: 84.029
- type: recall_at_3
value: 25.089
- type: recall_at_5
value: 29.43
- task:
type: Retrieval
dataset:
type: climate-fever
name: MTEB ClimateFEVER
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 4.662
- type: map_at_10
value: 8.638
- type: map_at_100
value: 9.86
- type: map_at_1000
value: 10.032
- type: map_at_3
value: 6.793
- type: map_at_5
value: 7.761
- type: mrr_at_1
value: 10.684000000000001
- type: mrr_at_10
value: 17.982
- type: mrr_at_100
value: 19.152
- type: mrr_at_1000
value: 19.231
- type: mrr_at_3
value: 15.113999999999999
- type: mrr_at_5
value: 16.658
- type: ndcg_at_1
value: 10.684000000000001
- type: ndcg_at_10
value: 13.483
- type: ndcg_at_100
value: 19.48
- type: ndcg_at_1000
value: 23.232
- type: ndcg_at_3
value: 9.75
- type: ndcg_at_5
value: 11.208
- type: precision_at_1
value: 10.684000000000001
- type: precision_at_10
value: 4.573
- type: precision_at_100
value: 1.085
- type: precision_at_1000
value: 0.17600000000000002
- type: precision_at_3
value: 7.514
- type: precision_at_5
value: 6.241
- type: recall_at_1
value: 4.662
- type: recall_at_10
value: 18.125
- type: recall_at_100
value: 39.675
- type: recall_at_1000
value: 61.332
- type: recall_at_3
value: 9.239
- type: recall_at_5
value: 12.863
- task:
type: Retrieval
dataset:
type: dbpedia-entity
name: MTEB DBPedia
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 3.869
- type: map_at_10
value: 8.701
- type: map_at_100
value: 11.806999999999999
- type: map_at_1000
value: 12.676000000000002
- type: map_at_3
value: 6.3100000000000005
- type: map_at_5
value: 7.471
- type: mrr_at_1
value: 38.5
- type: mrr_at_10
value: 48.754
- type: mrr_at_100
value: 49.544
- type: mrr_at_1000
value: 49.568
- type: mrr_at_3
value: 46.167
- type: mrr_at_5
value: 47.679
- type: ndcg_at_1
value: 30.5
- type: ndcg_at_10
value: 22.454
- type: ndcg_at_100
value: 25.380999999999997
- type: ndcg_at_1000
value: 31.582
- type: ndcg_at_3
value: 25.617
- type: ndcg_at_5
value: 24.254
- type: precision_at_1
value: 38.5
- type: precision_at_10
value: 18.4
- type: precision_at_100
value: 6.02
- type: precision_at_1000
value: 1.34
- type: precision_at_3
value: 29.083
- type: precision_at_5
value: 24.85
- type: recall_at_1
value: 3.869
- type: recall_at_10
value: 12.902
- type: recall_at_100
value: 30.496000000000002
- type: recall_at_1000
value: 51.066
- type: recall_at_3
value: 7.396
- type: recall_at_5
value: 9.852
- task:
type: Classification
dataset:
type: mteb/emotion
name: MTEB EmotionClassification
config: default
split: test
revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
metrics:
- type: accuracy
value: 36.705000000000005
- type: f1
value: 32.72625967901387
- task:
type: Classification
dataset:
type: mteb/imdb
name: MTEB ImdbClassification
config: default
split: test
revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
metrics:
- type: accuracy
value: 66.89840000000001
- type: ap
value: 61.43175045563333
- type: f1
value: 66.67945656405962
- task:
type: Classification
dataset:
type: mteb/mtop_domain
name: MTEB MTOPDomainClassification (en)
config: en
split: test
revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
metrics:
- type: accuracy
value: 89.12676698586411
- type: f1
value: 88.48426641357668
- task:
type: Classification
dataset:
type: mteb/mtop_intent
name: MTEB MTOPIntentClassification (en)
config: en
split: test
revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
metrics:
- type: accuracy
value: 62.61513907888736
- type: f1
value: 40.96251281624023
- task:
type: Classification
dataset:
type: mteb/amazon_massive_intent
name: MTEB MassiveIntentClassification (en)
config: en
split: test
revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
metrics:
- type: accuracy
value: 61.95359784801614
- type: f1
value: 58.85654625260125
- task:
type: Classification
dataset:
type: mteb/amazon_massive_scenario
name: MTEB MassiveScenarioClassification (en)
config: en
split: test
revision: 7d571f92784cd94a019292a1f45445077d0ef634
metrics:
- type: accuracy
value: 70.1983860121049
- type: f1
value: 68.73455379435487
- task:
type: Clustering
dataset:
type: mteb/medrxiv-clustering-p2p
name: MTEB MedrxivClusteringP2P
config: default
split: test
revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
metrics:
- type: v_measure
value: 31.772017072895846
- task:
type: Clustering
dataset:
type: mteb/medrxiv-clustering-s2s
name: MTEB MedrxivClusteringS2S
config: default
split: test
revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
metrics:
- type: v_measure
value: 30.944581802089044
- task:
type: Reranking
dataset:
type: mteb/mind_small
name: MTEB MindSmallReranking
config: default
split: test
revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
metrics:
- type: map
value: 30.977328237697133
- type: mrr
value: 32.02612207306447
- task:
type: Clustering
dataset:
type: mteb/reddit-clustering
name: MTEB RedditClustering
config: default
split: test
revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
metrics:
- type: v_measure
value: 43.08588418858767
- task:
type: Clustering
dataset:
type: mteb/reddit-clustering-p2p
name: MTEB RedditClusteringP2P
config: default
split: test
revision: 282350215ef01743dc01b456c7f5241fa8937f16
metrics:
- type: v_measure
value: 56.53785276450797
- task:
type: Reranking
dataset:
type: mteb/scidocs-reranking
name: MTEB SciDocsRR
config: default
split: test
revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
metrics:
- type: map
value: 81.44882719207659
- type: mrr
value: 94.71082022552609
- task:
type: PairClassification
dataset:
type: mteb/sprintduplicatequestions-pairclassification
name: MTEB SprintDuplicateQuestions
config: default
split: test
revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
metrics:
- type: cos_sim_accuracy
value: 99.77821782178218
- type: cos_sim_ap
value: 93.22909989796688
- type: cos_sim_f1
value: 88.41778697001035
- type: cos_sim_precision
value: 91.54175588865097
- type: cos_sim_recall
value: 85.5
- type: dot_accuracy
value: 99.77821782178218
- type: dot_ap
value: 93.2290998979669
- type: dot_f1
value: 88.41778697001035
- type: dot_precision
value: 91.54175588865097
- type: dot_recall
value: 85.5
- type: euclidean_accuracy
value: 99.77821782178218
- type: euclidean_ap
value: 93.2290998979669
- type: euclidean_f1
value: 88.41778697001035
- type: euclidean_precision
value: 91.54175588865097
- type: euclidean_recall
value: 85.5
- type: manhattan_accuracy
value: 99.77524752475247
- type: manhattan_ap
value: 93.18492132451668
- type: manhattan_f1
value: 88.19552782111285
- type: manhattan_precision
value: 91.87432286023835
- type: manhattan_recall
value: 84.8
- type: max_accuracy
value: 99.77821782178218
- type: max_ap
value: 93.2290998979669
- type: max_f1
value: 88.41778697001035
- task:
type: Clustering
dataset:
type: mteb/stackexchange-clustering
name: MTEB StackExchangeClustering
config: default
split: test
revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
metrics:
- type: v_measure
value: 48.225188905490285
- task:
type: Clustering
dataset:
type: mteb/stackexchange-clustering-p2p
name: MTEB StackExchangeClusteringP2P
config: default
split: test
revision: 815ca46b2622cec33ccafc3735d572c266efdb44
metrics:
- type: v_measure
value: 34.76195959924048
- task:
type: Reranking
dataset:
type: mteb/stackoverflowdupquestions-reranking
name: MTEB StackOverflowDupQuestions
config: default
split: test
revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
metrics:
- type: map
value: 48.16986372261003
- type: mrr
value: 48.7718837535014
- task:
type: Classification
dataset:
type: mteb/toxic_conversations_50k
name: MTEB ToxicConversationsClassification
config: default
split: test
revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
metrics:
- type: accuracy
value: 63.567200000000014
- type: ap
value: 11.412292644030266
- type: f1
value: 49.102043399207716
- task:
type: Classification
dataset:
type: mteb/tweet_sentiment_extraction
name: MTEB TweetSentimentExtractionClassification
config: default
split: test
revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
metrics:
- type: accuracy
value: 51.04414261460101
- type: f1
value: 51.22880449155832
- task:
type: Clustering
dataset:
type: mteb/twentynewsgroups-clustering
name: MTEB TwentyNewsgroupsClustering
config: default
split: test
revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
metrics:
- type: v_measure
value: 34.35595440606073
- task:
type: PairClassification
dataset:
type: mteb/twittersemeval2015-pairclassification
name: MTEB TwitterSemEval2015
config: default
split: test
revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
metrics:
- type: cos_sim_accuracy
value: 84.6754485307266
- type: cos_sim_ap
value: 69.6007143804539
- type: cos_sim_f1
value: 65.99822312476202
- type: cos_sim_precision
value: 63.58522866226461
- type: cos_sim_recall
value: 68.60158311345647
- type: dot_accuracy
value: 84.6754485307266
- type: dot_ap
value: 69.60070881520775
- type: dot_f1
value: 65.99822312476202
- type: dot_precision
value: 63.58522866226461
- type: dot_recall
value: 68.60158311345647
- type: euclidean_accuracy
value: 84.6754485307266
- type: euclidean_ap
value: 69.60071394457518
- type: euclidean_f1
value: 65.99822312476202
- type: euclidean_precision
value: 63.58522866226461
- type: euclidean_recall
value: 68.60158311345647
- type: manhattan_accuracy
value: 84.6754485307266
- type: manhattan_ap
value: 69.57324451019119
- type: manhattan_f1
value: 65.7235045917101
- type: manhattan_precision
value: 62.04311152764761
- type: manhattan_recall
value: 69.86807387862797
- type: max_accuracy
value: 84.6754485307266
- type: max_ap
value: 69.6007143804539
- type: max_f1
value: 65.99822312476202
- task:
type: PairClassification
dataset:
type: mteb/twitterurlcorpus-pairclassification
name: MTEB TwitterURLCorpus
config: default
split: test
revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
metrics:
- type: cos_sim_accuracy
value: 87.63922847052432
- type: cos_sim_ap
value: 83.48934190421085
- type: cos_sim_f1
value: 75.42265503384861
- type: cos_sim_precision
value: 71.17868124359413
- type: cos_sim_recall
value: 80.20480443486295
- type: dot_accuracy
value: 87.63922847052432
- type: dot_ap
value: 83.4893468701264
- type: dot_f1
value: 75.42265503384861
- type: dot_precision
value: 71.17868124359413
- type: dot_recall
value: 80.20480443486295
- type: euclidean_accuracy
value: 87.63922847052432
- type: euclidean_ap
value: 83.48934073168017
- type: euclidean_f1
value: 75.42265503384861
- type: euclidean_precision
value: 71.17868124359413
- type: euclidean_recall
value: 80.20480443486295
- type: manhattan_accuracy
value: 87.66251406838204
- type: manhattan_ap
value: 83.46319621504654
- type: manhattan_f1
value: 75.41883304448297
- type: manhattan_precision
value: 71.0089747076421
- type: manhattan_recall
value: 80.41268863566368
- type: max_accuracy
value: 87.66251406838204
- type: max_ap
value: 83.4893468701264
- type: max_f1
value: 75.42265503384861
cai-stellaris-text-embeddings
This is a sentence-transformers model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as clustering or semantic search.
Usage (Sentence-Transformers)
Using this model is straightforward once sentence-transformers is installed:
pip install -U sentence-transformers
Then you can use the model like this:
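from sentence_transformers import SentenceTransformer

# Two example inputs; each one is encoded to a single 768-dimensional vector.
sentences = ["This is an example sentence", "Each sentence is converted"]

# '{MODEL_NAME}' is a template placeholder; substitute this model's Hub ID or a local path.
model = SentenceTransformer('{MODEL_NAME}')
embeddings = model.encode(sentences)  # array of shape (2, 768)
print(embeddings)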
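Because the last module listed under Full Model Architecture is Normalize(), the embeddings should have unit length, so cosine similarity reduces to a dot product. The following is a minimal sketch of ranking documents against a query. `cosine` and `rank_by_cosine` are hypothetical helpers, not part of sentence-transformers, and the short toy vectors stand in for real `model.encode(...)` output.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; equals the dot product for unit-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_cosine(query: Sequence[float],
                   docs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query, d)) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional stand-ins for 768-dimensional embeddings (assumed values).
query_vec = [1.0, 0.0, 0.0]
doc_vecs = [[0.0, 1.0, 0.0], [0.6, 0.8, 0.0], [1.0, 0.0, 0.0]]
ranking = rank_by_cosine(query_vec, doc_vecs)
print(ranking)
```

With real output, rows of the array returned by `model.encode(...)` can be passed in directly as `query` and `docs`.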
Evaluation Results
MTEB results for this model are listed in the model-index metadata at the top of this card. For an automated evaluation, see the Sentence Embeddings Benchmark: https://seb.sbert.net
Training
The model was trained with the parameters:
DataLoader:
torch.utils.data.dataloader.DataLoader of length 15607 with parameters:
{'batch_size': 48, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
Loss:
sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss with parameters:
{'scale': 20.0, 'similarity_fct': 'cos_sim'}
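As a rough illustration of what this loss computes: for a batch of (anchor, positive) pairs, every other positive in the batch acts as a negative. The cosine similarities are multiplied by `scale` (20.0 here) and passed to a cross-entropy whose target is the matching pair. With the batch size of 48 listed above, each anchor would be scored against 48 candidates, assuming plain (anchor, positive) pairs without extra hard negatives. The pure-Python sketch below follows that description; `mnr_loss` is a hypothetical name, and the library's implementation works on batched tensors instead.

```python
import math
from typing import Sequence


def cos_sim(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def mnr_loss(anchors, positives, scale: float = 20.0) -> float:
    """Mean cross-entropy over in-batch candidates; target for row i is column i."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [scale * cos_sim(a, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates for this anchor.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]  # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]  # each anchor matches the other's positive

low = mnr_loss(anchors, aligned)
high = mnr_loss(anchors, swapped)
print(low, high)
```

The loss is close to zero when each anchor is most similar to its own positive and close to `scale` when the pairs are fully mismatched, which is why a larger `scale` sharpens the penalty for ranking errors.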
Parameters of the fit()-Method:
{
    "epochs": 10,
    "evaluation_steps": 0,
    "evaluator": "NoneType",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 1000,
    "weight_decay": 0.01
}
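To make the schedule concrete: with `steps_per_epoch` left as null, the run presumably takes the DataLoader length (15607) as the number of steps per epoch, giving 15607 × 10 = 156070 optimizer steps. Under a warmup-then-linear-decay schedule, the learning rate climbs from 0 to 2e-05 over the first 1000 steps and then falls linearly to 0. The sketch below reproduces that curve in plain Python; `warmup_linear_lr` is a hypothetical helper, and the total step count is an assumption derived from the values above.

```python
def warmup_linear_lr(step: int,
                     base_lr: float = 2e-05,
                     warmup_steps: int = 1000,
                     total_steps: int = 15607 * 10) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Start, mid-warmup, peak, midpoint of the decay, and final step.
for s in (0, 500, 1000, 78535, 156070):
    print(s, warmup_linear_lr(s))
```

Warmup covers well under 1% of the assumed 156070 steps, so almost all of training happens on the decaying part of the curve.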
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Normalize()
)
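Read in order, the three modules amount to: run the MPNet encoder to get one 768-dimensional vector per token, average the vectors of the non-padding tokens (`pooling_mode_mean_tokens: True`), then scale the result to unit length. A minimal pure-Python sketch of the last two stages follows; `mean_pool` and `l2_normalize` are hypothetical names, and the tiny 2-dimensional token vectors are assumed values standing in for real encoder output.

```python
import math
from typing import List, Sequence


def mean_pool(token_vecs: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average token vectors, skipping positions whose mask value is 0 (padding)."""
    kept = [vec for vec, m in zip(token_vecs, attention_mask) if m]
    count = max(1, len(kept))
    dim = len(token_vecs[0])
    return [sum(vec[d] for vec in kept) / count for d in range(dim)]


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


tokens = [[2.0, 0.0], [4.0, 8.0], [100.0, 100.0]]  # last row is padding
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)      # [3.0, 4.0]
sentence_vec = l2_normalize(pooled)   # [0.6, 0.8]
print(pooled, sentence_vec)
```

The padding row is ignored entirely, and the final vector has length 1, which is what lets dot product and cosine similarity agree for this model's embeddings.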