model-index:
  - name: FRIDA
    results:
      - dataset:
          config: default
          name: MTEB CEDRClassification (default)
          revision: c0ba03d058e3e1b2f3fd20518875a4563dd12db4
          split: test
          type: ai-forever/cedr-classification
        metrics:
          - type: accuracy
            value: 64.60148777895856
          - type: f1
            value: 70.36630348039266
          - type: lrap
            value: 92.47290116896953
          - type: main_score
            value: 64.60148777895856
        task:
          type: MultilabelClassification
      - dataset:
          config: default
          name: MTEB GeoreviewClassification (default)
          revision: 3765c0d1de6b7d264bc459433c45e5a75513839c
          split: test
          type: ai-forever/georeview-classification
        metrics:
          - type: accuracy
            value: 57.70996093750001
          - type: f1
            value: 53.18542982057098
          - type: f1_weighted
            value: 53.17663229582108
          - type: main_score
            value: 57.70996093750001
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB GeoreviewClusteringP2P (default)
          revision: 97a313c8fc85b47f13f33e7e9a95c1ad888c7fec
          split: test
          type: ai-forever/georeview-clustering-p2p
        metrics:
          - type: main_score
            value: 78.25468393043356
          - type: v_measure
            value: 78.25468393043356
          - type: v_measure_std
            value: 0.5094366871364238
        task:
          type: Clustering
      - dataset:
          config: default
          name: MTEB HeadlineClassification (default)
          revision: 2fe05ee6b5832cda29f2ef7aaad7b7fe6a3609eb
          split: test
          type: ai-forever/headline-classification
        metrics:
          - type: accuracy
            value: 89.0185546875
          - type: f1
            value: 88.993933120612
          - type: f1_weighted
            value: 88.99276764225768
          - type: main_score
            value: 89.0185546875
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB InappropriatenessClassification (default)
          revision: 601651fdc45ef243751676e62dd7a19f491c0285
          split: test
          type: ai-forever/inappropriateness-classification
        metrics:
          - type: accuracy
            value: 78.330078125
          - type: ap
            value: 73.17856750532495
          - type: ap_weighted
            value: 73.17856750532495
          - type: f1
            value: 78.20169867599041
          - type: f1_weighted
            value: 78.20169867599041
          - type: main_score
            value: 78.330078125
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB KinopoiskClassification (default)
          revision: 5911f26666ac11af46cb9c6849d0dc80a378af24
          split: test
          type: ai-forever/kinopoisk-sentiment-classification
        metrics:
          - type: accuracy
            value: 70.46666666666665
          - type: f1
            value: 65.83951766538878
          - type: f1_weighted
            value: 65.83951766538878
          - type: main_score
            value: 70.46666666666665
        task:
          type: Classification
      - dataset:
          config: ru
          name: MTEB MIRACLReranking (ru)
          revision: 6d1962c527217f8927fca80f890f14f36b2802af
          split: dev
          type: miracl/mmteb-miracl-reranking
        metrics:
          - type: MAP@1(MIRACL)
            value: 39.023
          - type: MAP@10(MIRACL)
            value: 60.208
          - type: MAP@100(MIRACL)
            value: 61.672000000000004
          - type: MAP@1000(MIRACL)
            value: 61.672000000000004
          - type: MAP@20(MIRACL)
            value: 61.30799999999999
          - type: MAP@3(MIRACL)
            value: 53.33
          - type: MAP@5(MIRACL)
            value: 57.289
          - type: NDCG@1(MIRACL)
            value: 63.352
          - type: NDCG@10(MIRACL)
            value: 66.042
          - type: NDCG@100(MIRACL)
            value: 68.702
          - type: NDCG@1000(MIRACL)
            value: 68.702
          - type: NDCG@20(MIRACL)
            value: 67.768
          - type: NDCG@3(MIRACL)
            value: 61.925
          - type: NDCG@5(MIRACL)
            value: 63.327
          - type: P@1(MIRACL)
            value: 63.352
          - type: P@10(MIRACL)
            value: 16.512
          - type: P@100(MIRACL)
            value: 1.9529999999999998
          - type: P@1000(MIRACL)
            value: 0.19499999999999998
          - type: P@20(MIRACL)
            value: 9.13
          - type: P@3(MIRACL)
            value: 37.878
          - type: P@5(MIRACL)
            value: 27.586
          - type: Recall@1(MIRACL)
            value: 39.023
          - type: Recall@10(MIRACL)
            value: 72.35000000000001
          - type: Recall@100(MIRACL)
            value: 79.952
          - type: Recall@1000(MIRACL)
            value: 79.952
          - type: Recall@20(MIRACL)
            value: 76.828
          - type: Recall@3(MIRACL)
            value: 57.769999999999996
          - type: Recall@5(MIRACL)
            value: 64.91900000000001
          - type: main_score
            value: 66.042
          - type: nAUC_MAP@1000_diff1(MIRACL)
            value: 27.150388833033052
          - type: nAUC_MAP@1000_max(MIRACL)
            value: 55.15672274267081
          - type: nAUC_MAP@1000_std(MIRACL)
            value: 30.088939934575553
          - type: nAUC_MAP@100_diff1(MIRACL)
            value: 27.150388833033052
          - type: nAUC_MAP@100_max(MIRACL)
            value: 55.15672274267081
          - type: nAUC_MAP@100_std(MIRACL)
            value: 30.088939934575553
          - type: nAUC_MAP@10_diff1(MIRACL)
            value: 27.853691773641742
          - type: nAUC_MAP@10_max(MIRACL)
            value: 52.89390350055654
          - type: nAUC_MAP@10_std(MIRACL)
            value: 28.08732516551691
          - type: nAUC_MAP@1_diff1(MIRACL)
            value: 43.23179150244192
          - type: nAUC_MAP@1_max(MIRACL)
            value: 29.923943954188864
          - type: nAUC_MAP@1_std(MIRACL)
            value: 7.447084370195121
          - type: nAUC_MAP@20_diff1(MIRACL)
            value: 27.328384072311675
          - type: nAUC_MAP@20_max(MIRACL)
            value: 54.60286379835721
          - type: nAUC_MAP@20_std(MIRACL)
            value: 29.8084128980043
          - type: nAUC_MAP@3_diff1(MIRACL)
            value: 31.244971536944554
          - type: nAUC_MAP@3_max(MIRACL)
            value: 43.63984692803854
          - type: nAUC_MAP@3_std(MIRACL)
            value: 18.609234683765887
          - type: nAUC_MAP@5_diff1(MIRACL)
            value: 29.088760492638286
          - type: nAUC_MAP@5_max(MIRACL)
            value: 48.30474364461509
          - type: nAUC_MAP@5_std(MIRACL)
            value: 23.817514353844224
          - type: nAUC_NDCG@1000_diff1(MIRACL)
            value: 23.12754356408408
          - type: nAUC_NDCG@1000_max(MIRACL)
            value: 64.24894553363303
          - type: nAUC_NDCG@1000_std(MIRACL)
            value: 38.19318050598967
          - type: nAUC_NDCG@100_diff1(MIRACL)
            value: 23.12754356408408
          - type: nAUC_NDCG@100_max(MIRACL)
            value: 64.24894553363303
          - type: nAUC_NDCG@100_std(MIRACL)
            value: 38.19318050598967
          - type: nAUC_NDCG@10_diff1(MIRACL)
            value: 24.779856373697275
          - type: nAUC_NDCG@10_max(MIRACL)
            value: 60.4054459738118
          - type: nAUC_NDCG@10_std(MIRACL)
            value: 35.148950441182784
          - type: nAUC_NDCG@1_diff1(MIRACL)
            value: 35.605865569438556
          - type: nAUC_NDCG@1_max(MIRACL)
            value: 65.77787399715454
          - type: nAUC_NDCG@1_std(MIRACL)
            value: 34.34726892885082
          - type: nAUC_NDCG@20_diff1(MIRACL)
            value: 23.71231783125691
          - type: nAUC_NDCG@20_max(MIRACL)
            value: 62.89676599488004
          - type: nAUC_NDCG@20_std(MIRACL)
            value: 37.697052941884316
          - type: nAUC_NDCG@3_diff1(MIRACL)
            value: 26.109027741640865
          - type: nAUC_NDCG@3_max(MIRACL)
            value: 56.22356793638693
          - type: nAUC_NDCG@3_std(MIRACL)
            value: 29.9437568508688
          - type: nAUC_NDCG@5_diff1(MIRACL)
            value: 25.98644715327336
          - type: nAUC_NDCG@5_max(MIRACL)
            value: 56.25032008404774
          - type: nAUC_NDCG@5_std(MIRACL)
            value: 31.581899860862578
          - type: nAUC_P@1000_diff1(MIRACL)
            value: -18.29912787064644
          - type: nAUC_P@1000_max(MIRACL)
            value: 31.811344878776087
          - type: nAUC_P@1000_std(MIRACL)
            value: 30.163820183304914
          - type: nAUC_P@100_diff1(MIRACL)
            value: -18.299127870646405
          - type: nAUC_P@100_max(MIRACL)
            value: 31.811344878776133
          - type: nAUC_P@100_std(MIRACL)
            value: 30.163820183304956
          - type: nAUC_P@10_diff1(MIRACL)
            value: -15.96416268531149
          - type: nAUC_P@10_max(MIRACL)
            value: 36.989578896466526
          - type: nAUC_P@10_std(MIRACL)
            value: 34.54507111688143
          - type: nAUC_P@1_diff1(MIRACL)
            value: 35.605865569438556
          - type: nAUC_P@1_max(MIRACL)
            value: 65.77787399715454
          - type: nAUC_P@1_std(MIRACL)
            value: 34.34726892885082
          - type: nAUC_P@20_diff1(MIRACL)
            value: -17.443963421383287
          - type: nAUC_P@20_max(MIRACL)
            value: 34.309618168778385
          - type: nAUC_P@20_std(MIRACL)
            value: 33.38820956485373
          - type: nAUC_P@3_diff1(MIRACL)
            value: -8.533621861815652
          - type: nAUC_P@3_max(MIRACL)
            value: 45.90408386776497
          - type: nAUC_P@3_std(MIRACL)
            value: 34.50459351305535
          - type: nAUC_P@5_diff1(MIRACL)
            value: -13.207968899314865
          - type: nAUC_P@5_max(MIRACL)
            value: 40.37718282248973
          - type: nAUC_P@5_std(MIRACL)
            value: 35.601417332196206
          - type: nAUC_Recall@1000_diff1(MIRACL)
            value: 7.907304198177226
          - type: nAUC_Recall@1000_max(MIRACL)
            value: 77.82197832361145
          - type: nAUC_Recall@1000_std(MIRACL)
            value: 52.66957487246724
          - type: nAUC_Recall@100_diff1(MIRACL)
            value: 7.907304198177226
          - type: nAUC_Recall@100_max(MIRACL)
            value: 77.82197832361145
          - type: nAUC_Recall@100_std(MIRACL)
            value: 52.66957487246724
          - type: nAUC_Recall@10_diff1(MIRACL)
            value: 15.498121023488693
          - type: nAUC_Recall@10_max(MIRACL)
            value: 62.24320529338724
          - type: nAUC_Recall@10_std(MIRACL)
            value: 40.60221460946224
          - type: nAUC_Recall@1_diff1(MIRACL)
            value: 43.23179150244192
          - type: nAUC_Recall@1_max(MIRACL)
            value: 29.923943954188864
          - type: nAUC_Recall@1_std(MIRACL)
            value: 7.447084370195121
          - type: nAUC_Recall@20_diff1(MIRACL)
            value: 11.457044176116248
          - type: nAUC_Recall@20_max(MIRACL)
            value: 70.3493054342368
          - type: nAUC_Recall@20_std(MIRACL)
            value: 49.27124296325928
          - type: nAUC_Recall@3_diff1(MIRACL)
            value: 25.12077828977941
          - type: nAUC_Recall@3_max(MIRACL)
            value: 42.903379317937166
          - type: nAUC_Recall@3_std(MIRACL)
            value: 20.324501722161497
          - type: nAUC_Recall@5_diff1(MIRACL)
            value: 20.925701235197977
          - type: nAUC_Recall@5_max(MIRACL)
            value: 49.85323960390812
          - type: nAUC_Recall@5_std(MIRACL)
            value: 29.04484539530469
        task:
          type: Reranking
      - dataset:
          config: ru
          name: MTEB MIRACLRetrieval (ru)
          revision: main
          split: dev
          type: miracl/mmteb-miracl
        metrics:
          - type: main_score
            value: 71.882
          - type: map_at_1
            value: 37.913000000000004
          - type: map_at_10
            value: 62.604000000000006
          - type: map_at_100
            value: 64.925
          - type: map_at_1000
            value: 64.992
          - type: map_at_20
            value: 64.081
          - type: map_at_3
            value: 55.212
          - type: map_at_5
            value: 59.445
          - type: mrr_at_1
            value: 73.24281150159744
          - type: mrr_at_10
            value: 81.65043866321825
          - type: mrr_at_100
            value: 81.85391378818977
          - type: mrr_at_1000
            value: 81.85753390802569
          - type: mrr_at_20
            value: 81.81045606130179
          - type: mrr_at_3
            value: 80.56443024494146
          - type: mrr_at_5
            value: 81.30724174653893
          - type: nauc_map_at_1000_diff1
            value: 26.962150235593356
          - type: nauc_map_at_1000_max
            value: 29.234958037854568
          - type: nauc_map_at_1000_std
            value: -2.4294465103633884
          - type: nauc_map_at_100_diff1
            value: 26.92990252114163
          - type: nauc_map_at_100_max
            value: 29.206328533120118
          - type: nauc_map_at_100_std
            value: -2.437371090941197
          - type: nauc_map_at_10_diff1
            value: 25.758265691179226
          - type: nauc_map_at_10_max
            value: 26.949978490795317
          - type: nauc_map_at_10_std
            value: -5.484961002106038
          - type: nauc_map_at_1_diff1
            value: 34.70849461278043
          - type: nauc_map_at_1_max
            value: 12.778570893623042
          - type: nauc_map_at_1_std
            value: -13.018292652743938
          - type: nauc_map_at_20_diff1
            value: 26.659923008218268
          - type: nauc_map_at_20_max
            value: 28.341440871568185
          - type: nauc_map_at_20_std
            value: -3.614549844913084
          - type: nauc_map_at_3_diff1
            value: 27.197629021438203
          - type: nauc_map_at_3_max
            value: 20.701094874050856
          - type: nauc_map_at_3_std
            value: -12.062992301112041
          - type: nauc_map_at_5_diff1
            value: 25.51793537203295
          - type: nauc_map_at_5_max
            value: 23.80396771243794
          - type: nauc_map_at_5_std
            value: -8.920465695323575
          - type: nauc_mrr_at_1000_diff1
            value: 45.14819989592967
          - type: nauc_mrr_at_1000_max
            value: 53.29202156141053
          - type: nauc_mrr_at_1000_std
            value: 18.037336462510524
          - type: nauc_mrr_at_100_diff1
            value: 45.15287600228451
          - type: nauc_mrr_at_100_max
            value: 53.29979751928615
          - type: nauc_mrr_at_100_std
            value: 18.04996604778386
          - type: nauc_mrr_at_10_diff1
            value: 44.96865105944474
          - type: nauc_mrr_at_10_max
            value: 53.53323465323092
          - type: nauc_mrr_at_10_std
            value: 18.25001344917689
          - type: nauc_mrr_at_1_diff1
            value: 46.16604946873163
          - type: nauc_mrr_at_1_max
            value: 48.573651103547874
          - type: nauc_mrr_at_1_std
            value: 13.764871626330915
          - type: nauc_mrr_at_20_diff1
            value: 45.11925458479102
          - type: nauc_mrr_at_20_max
            value: 53.35685123898342
          - type: nauc_mrr_at_20_std
            value: 18.127344968819905
          - type: nauc_mrr_at_3_diff1
            value: 45.377195452730234
          - type: nauc_mrr_at_3_max
            value: 53.35146309217089
          - type: nauc_mrr_at_3_std
            value: 17.47105877186237
          - type: nauc_mrr_at_5_diff1
            value: 45.00525578771549
          - type: nauc_mrr_at_5_max
            value: 53.76227254707128
          - type: nauc_mrr_at_5_std
            value: 18.437290060746957
          - type: nauc_ndcg_at_1000_diff1
            value: 31.19215594457491
          - type: nauc_ndcg_at_1000_max
            value: 38.09555406458668
          - type: nauc_ndcg_at_1000_std
            value: 7.225628621238009
          - type: nauc_ndcg_at_100_diff1
            value: 30.726331247999934
          - type: nauc_ndcg_at_100_max
            value: 37.81369589418277
          - type: nauc_ndcg_at_100_std
            value: 7.242855238555071
          - type: nauc_ndcg_at_10_diff1
            value: 27.514048333744835
          - type: nauc_ndcg_at_10_max
            value: 33.10990399385253
          - type: nauc_ndcg_at_10_std
            value: 0.3051899572112002
          - type: nauc_ndcg_at_1_diff1
            value: 47.06089085235751
          - type: nauc_ndcg_at_1_max
            value: 47.7300872370495
          - type: nauc_ndcg_at_1_std
            value: 12.468605493613916
          - type: nauc_ndcg_at_20_diff1
            value: 29.404215438764496
          - type: nauc_ndcg_at_20_max
            value: 35.26967886796471
          - type: nauc_ndcg_at_20_std
            value: 3.7214697890813353
          - type: nauc_ndcg_at_3_diff1
            value: 29.448848639643067
          - type: nauc_ndcg_at_3_max
            value: 33.85912412370657
          - type: nauc_ndcg_at_3_std
            value: 0.895453646819452
          - type: nauc_ndcg_at_5_diff1
            value: 26.916649012613526
          - type: nauc_ndcg_at_5_max
            value: 30.899005979291644
          - type: nauc_ndcg_at_5_std
            value: -1.0001575639156615
          - type: nauc_precision_at_1000_diff1
            value: -8.492004667432635
          - type: nauc_precision_at_1000_max
            value: 14.970190384017679
          - type: nauc_precision_at_1000_std
            value: 32.871386621137816
          - type: nauc_precision_at_100_diff1
            value: -8.287314133999967
          - type: nauc_precision_at_100_max
            value: 17.794821961284736
          - type: nauc_precision_at_100_std
            value: 35.092483550562
          - type: nauc_precision_at_10_diff1
            value: -7.594128993028063
          - type: nauc_precision_at_10_max
            value: 24.691446370325732
          - type: nauc_precision_at_10_std
            value: 30.126552282608493
          - type: nauc_precision_at_1_diff1
            value: 47.06089085235751
          - type: nauc_precision_at_1_max
            value: 47.7300872370495
          - type: nauc_precision_at_1_std
            value: 12.468605493613916
          - type: nauc_precision_at_20_diff1
            value: -6.503872195775146
          - type: nauc_precision_at_20_max
            value: 21.789730053141312
          - type: nauc_precision_at_20_std
            value: 32.61349377558794
          - type: nauc_precision_at_3_diff1
            value: 0.67417079971061
          - type: nauc_precision_at_3_max
            value: 30.793871354370662
          - type: nauc_precision_at_3_std
            value: 18.35266479252011
          - type: nauc_precision_at_5_diff1
            value: -7.088881730215777
          - type: nauc_precision_at_5_max
            value: 26.539771712769006
          - type: nauc_precision_at_5_std
            value: 24.116262291865834
          - type: nauc_recall_at_1000_diff1
            value: 34.53263588412461
          - type: nauc_recall_at_1000_max
            value: 63.54157869100173
          - type: nauc_recall_at_1000_std
            value: 64.19854844792808
          - type: nauc_recall_at_100_diff1
            value: 22.86564728642275
          - type: nauc_recall_at_100_max
            value: 40.350507162549825
          - type: nauc_recall_at_100_std
            value: 29.24492545863015
          - type: nauc_recall_at_10_diff1
            value: 15.384818367225009
          - type: nauc_recall_at_10_max
            value: 24.41108571453699
          - type: nauc_recall_at_10_std
            value: -3.9216160585776323
          - type: nauc_recall_at_1_diff1
            value: 34.70849461278043
          - type: nauc_recall_at_1_max
            value: 12.778570893623042
          - type: nauc_recall_at_1_std
            value: -13.018292652743938
          - type: nauc_recall_at_20_diff1
            value: 18.122499000084208
          - type: nauc_recall_at_20_max
            value: 26.63104220179424
          - type: nauc_recall_at_20_std
            value: 3.969217732521512
          - type: nauc_recall_at_3_diff1
            value: 21.413050725250116
          - type: nauc_recall_at_3_max
            value: 16.18894988386887
          - type: nauc_recall_at_3_std
            value: -15.24884339282375
          - type: nauc_recall_at_5_diff1
            value: 16.35673072212927
          - type: nauc_recall_at_5_max
            value: 18.607003829267846
          - type: nauc_recall_at_5_std
            value: -10.463525876945454
          - type: ndcg_at_1
            value: 72.923
          - type: ndcg_at_10
            value: 71.882
          - type: ndcg_at_100
            value: 77.09899999999999
          - type: ndcg_at_1000
            value: 77.835
          - type: ndcg_at_20
            value: 74.497
          - type: ndcg_at_3
            value: 68.504
          - type: ndcg_at_5
            value: 69.068
          - type: precision_at_1
            value: 72.923
          - type: precision_at_10
            value: 19.936
          - type: precision_at_100
            value: 2.6310000000000002
          - type: precision_at_1000
            value: 0.27799999999999997
          - type: precision_at_20
            value: 11.33
          - type: precision_at_3
            value: 45.927
          - type: precision_at_5
            value: 33.131
          - type: recall_at_1
            value: 37.913000000000004
          - type: recall_at_10
            value: 78.365
          - type: recall_at_100
            value: 94.348
          - type: recall_at_1000
            value: 98.187
          - type: recall_at_20
            value: 85.229
          - type: recall_at_3
            value: 61.42999999999999
          - type: recall_at_5
            value: 69.56700000000001
        task:
          type: Retrieval
      - dataset:
          config: ru
          name: MTEB MassiveIntentClassification (ru)
          revision: 4672e20407010da34463acc759c162ca9734bca6
          split: test
          type: mteb/amazon_massive_intent
        metrics:
          - type: accuracy
            value: 79.11903160726294
          - type: f1
            value: 76.22609082694545
          - type: f1_weighted
            value: 77.81461248063566
          - type: main_score
            value: 79.11903160726294
        task:
          type: Classification
      - dataset:
          config: ru
          name: MTEB MassiveScenarioClassification (ru)
          revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
          split: test
          type: mteb/amazon_massive_scenario
        metrics:
          - type: accuracy
            value: 88.80632145258912
          - type: f1
            value: 87.53157475314829
          - type: f1_weighted
            value: 88.22733432521495
          - type: main_score
            value: 88.80632145258912
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB RUParaPhraserSTS (default)
          revision: 43265056790b8f7c59e0139acb4be0a8dad2c8f4
          split: test
          type: merionum/ru_paraphraser
        metrics:
          - type: cosine_pearson
            value: 72.70307124858925
          - type: cosine_spearman
            value: 78.09439086920204
          - type: euclidean_pearson
            value: 76.2033672014715
          - type: euclidean_spearman
            value: 78.09439086920204
          - type: main_score
            value: 78.09439086920204
          - type: manhattan_pearson
            value: 76.11750470223116
          - type: manhattan_spearman
            value: 78.01081063503413
          - type: pearson
            value: 72.70307124858925
          - type: spearman
            value: 78.09439086920204
        task:
          type: STS
      - dataset:
          config: default
          name: MTEB RiaNewsRetrieval (default)
          revision: 82374b0bbacda6114f39ff9c5b925fa1512ca5d7
          split: test
          type: ai-forever/ria-news-retrieval
        metrics:
          - type: main_score
            value: 86.819
          - type: map_at_1
            value: 78.79
          - type: map_at_10
            value: 84.516
          - type: map_at_100
            value: 84.68
          - type: map_at_1000
            value: 84.685
          - type: map_at_20
            value: 84.624
          - type: map_at_3
            value: 83.722
          - type: map_at_5
            value: 84.246
          - type: mrr_at_1
            value: 78.78
          - type: mrr_at_10
            value: 84.51815476190441
          - type: mrr_at_100
            value: 84.68390840473289
          - type: mrr_at_1000
            value: 84.68947095200002
          - type: mrr_at_20
            value: 84.62958130822527
          - type: mrr_at_3
            value: 83.74499999999964
          - type: mrr_at_5
            value: 84.23849999999955
          - type: nauc_map_at_1000_diff1
            value: 82.09914867708899
          - type: nauc_map_at_1000_max
            value: 43.02024854784386
          - type: nauc_map_at_1000_std
            value: -22.919695880762777
          - type: nauc_map_at_100_diff1
            value: 82.09705922783733
          - type: nauc_map_at_100_max
            value: 43.02697379581718
          - type: nauc_map_at_100_std
            value: -22.90719212899522
          - type: nauc_map_at_10_diff1
            value: 82.04404594672894
          - type: nauc_map_at_10_max
            value: 43.06752103182731
          - type: nauc_map_at_10_std
            value: -23.007870153273576
          - type: nauc_map_at_1_diff1
            value: 83.89134152210333
          - type: nauc_map_at_1_max
            value: 38.083626428503415
          - type: nauc_map_at_1_std
            value: -25.817960401194252
          - type: nauc_map_at_20_diff1
            value: 82.08534662247806
          - type: nauc_map_at_20_max
            value: 43.074305042312346
          - type: nauc_map_at_20_std
            value: -22.91785703613217
          - type: nauc_map_at_3_diff1
            value: 81.7967508697558
          - type: nauc_map_at_3_max
            value: 42.90927479098251
          - type: nauc_map_at_3_std
            value: -24.01312203859392
          - type: nauc_map_at_5_diff1
            value: 81.90704517505098
          - type: nauc_map_at_5_max
            value: 43.05204677044616
          - type: nauc_map_at_5_std
            value: -23.267331507554896
          - type: nauc_mrr_at_1000_diff1
            value: 82.11902348082472
          - type: nauc_mrr_at_1000_max
            value: 43.04118936353063
          - type: nauc_mrr_at_1000_std
            value: -22.858804296830773
          - type: nauc_mrr_at_100_diff1
            value: 82.11685562002263
          - type: nauc_mrr_at_100_max
            value: 43.0482537895494
          - type: nauc_mrr_at_100_std
            value: -22.84431127787993
          - type: nauc_mrr_at_10_diff1
            value: 82.06909958688058
          - type: nauc_mrr_at_10_max
            value: 43.07921689466605
          - type: nauc_mrr_at_10_std
            value: -22.957623576663234
          - type: nauc_mrr_at_1_diff1
            value: 83.91147637794326
          - type: nauc_mrr_at_1_max
            value: 37.91917159543152
          - type: nauc_mrr_at_1_std
            value: -26.141868289283266
          - type: nauc_mrr_at_20_diff1
            value: 82.10314004731809
          - type: nauc_mrr_at_20_max
            value: 43.09295406509764
          - type: nauc_mrr_at_20_std
            value: -22.862091782178787
          - type: nauc_mrr_at_3_diff1
            value: 81.82117067269036
          - type: nauc_mrr_at_3_max
            value: 42.94628953323521
          - type: nauc_mrr_at_3_std
            value: -23.852510312400714
          - type: nauc_mrr_at_5_diff1
            value: 81.92857441701598
          - type: nauc_mrr_at_5_max
            value: 43.129719354492934
          - type: nauc_mrr_at_5_std
            value: -23.145342272624085
          - type: nauc_ndcg_at_1000_diff1
            value: 81.75015729717991
          - type: nauc_ndcg_at_1000_max
            value: 44.7266586308995
          - type: nauc_ndcg_at_1000_std
            value: -20.60663899715267
          - type: nauc_ndcg_at_100_diff1
            value: 81.6897808298767
          - type: nauc_ndcg_at_100_max
            value: 44.99492791287099
          - type: nauc_ndcg_at_100_std
            value: -20.09637266506936
          - type: nauc_ndcg_at_10_diff1
            value: 81.46290312197337
          - type: nauc_ndcg_at_10_max
            value: 45.30218378452244
          - type: nauc_ndcg_at_10_std
            value: -20.70393523891777
          - type: nauc_ndcg_at_1_diff1
            value: 83.89134152210333
          - type: nauc_ndcg_at_1_max
            value: 38.083626428503415
          - type: nauc_ndcg_at_1_std
            value: -25.817960401194252
          - type: nauc_ndcg_at_20_diff1
            value: 81.61080772657213
          - type: nauc_ndcg_at_20_max
            value: 45.36571800492172
          - type: nauc_ndcg_at_20_std
            value: -20.278763852504042
          - type: nauc_ndcg_at_3_diff1
            value: 80.95965359410461
          - type: nauc_ndcg_at_3_max
            value: 44.756971949205834
          - type: nauc_ndcg_at_3_std
            value: -23.07797617717319
          - type: nauc_ndcg_at_5_diff1
            value: 81.12417712163976
          - type: nauc_ndcg_at_5_max
            value: 45.15727381406512
          - type: nauc_ndcg_at_5_std
            value: -21.52861766165519
          - type: nauc_precision_at_1000_diff1
            value: 76.80566850396093
          - type: nauc_precision_at_1000_max
            value: 82.45685370922442
          - type: nauc_precision_at_1000_std
            value: 46.93570976777808
          - type: nauc_precision_at_100_diff1
            value: 77.21645520953484
          - type: nauc_precision_at_100_max
            value: 73.43604108309935
          - type: nauc_precision_at_100_std
            value: 31.978176891671367
          - type: nauc_precision_at_10_diff1
            value: 77.88251664302092
          - type: nauc_precision_at_10_max
            value: 60.58112638995018
          - type: nauc_precision_at_10_std
            value: -3.674424315180332
          - type: nauc_precision_at_1_diff1
            value: 83.89134152210333
          - type: nauc_precision_at_1_max
            value: 38.083626428503415
          - type: nauc_precision_at_1_std
            value: -25.817960401194252
          - type: nauc_precision_at_20_diff1
            value: 78.16426786697438
          - type: nauc_precision_at_20_max
            value: 66.0723612699222
          - type: nauc_precision_at_20_std
            value: 6.121527084555938
          - type: nauc_precision_at_3_diff1
            value: 77.43122492166451
          - type: nauc_precision_at_3_max
            value: 52.50727288548085
          - type: nauc_precision_at_3_std
            value: -19.036076920799427
          - type: nauc_precision_at_5_diff1
            value: 77.1127254320532
          - type: nauc_precision_at_5_max
            value: 56.100901899221135
          - type: nauc_precision_at_5_std
            value: -12.009191140844198
          - type: nauc_recall_at_1000_diff1
            value: 76.80566850396035
          - type: nauc_recall_at_1000_max
            value: 82.45685370922577
          - type: nauc_recall_at_1000_std
            value: 46.93570976777776
          - type: nauc_recall_at_100_diff1
            value: 77.21645520953459
          - type: nauc_recall_at_100_max
            value: 73.43604108310011
          - type: nauc_recall_at_100_std
            value: 31.978176891671993
          - type: nauc_recall_at_10_diff1
            value: 77.88251664302089
          - type: nauc_recall_at_10_max
            value: 60.58112638994999
          - type: nauc_recall_at_10_std
            value: -3.6744243151805427
          - type: nauc_recall_at_1_diff1
            value: 83.89134152210333
          - type: nauc_recall_at_1_max
            value: 38.083626428503415
          - type: nauc_recall_at_1_std
            value: -25.817960401194252
          - type: nauc_recall_at_20_diff1
            value: 78.16426786697409
          - type: nauc_recall_at_20_max
            value: 66.07236126992217
          - type: nauc_recall_at_20_std
            value: 6.121527084555941
          - type: nauc_recall_at_3_diff1
            value: 77.43122492166454
          - type: nauc_recall_at_3_max
            value: 52.507272885480816
          - type: nauc_recall_at_3_std
            value: -19.036076920799776
          - type: nauc_recall_at_5_diff1
            value: 77.11272543205318
          - type: nauc_recall_at_5_max
            value: 56.10090189922128
          - type: nauc_recall_at_5_std
            value: -12.009191140843809
          - type: ndcg_at_1
            value: 78.79
          - type: ndcg_at_10
            value: 86.819
          - type: ndcg_at_100
            value: 87.599
          - type: ndcg_at_1000
            value: 87.761
          - type: ndcg_at_20
            value: 87.208
          - type: ndcg_at_3
            value: 85.222
          - type: ndcg_at_5
            value: 86.164
          - type: precision_at_1
            value: 78.79
          - type: precision_at_10
            value: 9.384
          - type: precision_at_100
            value: 0.975
          - type: precision_at_1000
            value: 0.099
          - type: precision_at_20
            value: 4.769
          - type: precision_at_3
            value: 29.842999999999996
          - type: precision_at_5
            value: 18.362000000000002
          - type: recall_at_1
            value: 78.79
          - type: recall_at_10
            value: 93.84
          - type: recall_at_100
            value: 97.45
          - type: recall_at_1000
            value: 98.76
          - type: recall_at_20
            value: 95.37
          - type: recall_at_3
            value: 89.53
          - type: recall_at_5
            value: 91.81
        task:
          type: Retrieval
      - dataset:
          config: default
          name: MTEB RuBQReranking (default)
          revision: 2e96b8f098fa4b0950fc58eacadeb31c0d0c7fa2
          split: test
          type: ai-forever/rubq-reranking
        metrics:
          - type: main_score
            value: 77.07394404835635
          - type: map
            value: 77.07394404835635
          - type: mrr
            value: 82.53144412718882
          - type: nAUC_map_diff1
            value: 45.29805217456628
          - type: nAUC_map_max
            value: 34.39894042439188
          - type: nAUC_map_std
            value: 21.11309674418275
          - type: nAUC_mrr_diff1
            value: 54.783994737367046
          - type: nAUC_mrr_max
            value: 45.68526733900048
          - type: nAUC_mrr_std
            value: 28.22466385500339
        task:
          type: Reranking
      - dataset:
          config: default
          name: MTEB RuBQRetrieval (default)
          revision: e19b6ffa60b3bc248e0b41f4cc37c26a55c2a67b
          split: test
          type: ai-forever/rubq-retrieval
        metrics:
          - type: main_score
            value: 72.392
          - type: map_at_1
            value: 47.370000000000005
          - type: map_at_10
            value: 65.503
          - type: map_at_100
            value: 66.38
          - type: map_at_1000
            value: 66.42099999999999
          - type: map_at_20
            value: 66.071
          - type: map_at_3
            value: 61.439
          - type: map_at_5
            value: 63.922999999999995
          - type: mrr_at_1
            value: 67.37588652482269
          - type: mrr_at_10
            value: 76.0066747345116
          - type: mrr_at_100
            value: 76.25754138969413
          - type: mrr_at_1000
            value: 76.26968825657428
          - type: mrr_at_20
            value: 76.17548265904622
          - type: mrr_at_3
            value: 74.61583924349881
          - type: mrr_at_5
            value: 75.46690307328608
          - type: nauc_map_at_1000_diff1
            value: 42.52570720187294
          - type: nauc_map_at_1000_max
            value: 37.40318318724238
          - type: nauc_map_at_1000_std
            value: 0.6037788201535506
          - type: nauc_map_at_100_diff1
            value: 42.493410029691226
          - type: nauc_map_at_100_max
            value: 37.39802489244377
          - type: nauc_map_at_100_std
            value: 0.6071359951887154
          - type: nauc_map_at_10_diff1
            value: 42.09833519659916
          - type: nauc_map_at_10_max
            value: 37.1184138958874
          - type: nauc_map_at_10_std
            value: 0.4063543094010351
          - type: nauc_map_at_1_diff1
            value: 49.56605205141156
          - type: nauc_map_at_1_max
            value: 26.251096698710384
          - type: nauc_map_at_1_std
            value: -4.580748485387834
          - type: nauc_map_at_20_diff1
            value: 42.33372393482018
          - type: nauc_map_at_20_max
            value: 37.416955604649985
          - type: nauc_map_at_20_std
            value: 0.6050577802787294
          - type: nauc_map_at_3_diff1
            value: 42.362234475441845
          - type: nauc_map_at_3_max
            value: 34.56001379838821
          - type: nauc_map_at_3_std
            value: -1.507636598929042
          - type: nauc_map_at_5_diff1
            value: 42.0202264882535
          - type: nauc_map_at_5_max
            value: 36.64306050200848
          - type: nauc_map_at_5_std
            value: -0.09509025708798424
          - type: nauc_mrr_at_1000_diff1
            value: 58.99601742026931
          - type: nauc_mrr_at_1000_max
            value: 49.61561872452777
          - type: nauc_mrr_at_1000_std
            value: 2.3956102974352356
          - type: nauc_mrr_at_100_diff1
            value: 58.9865943101085
          - type: nauc_mrr_at_100_max
            value: 49.6248111507265
          - type: nauc_mrr_at_100_std
            value: 2.411155095066369
          - type: nauc_mrr_at_10_diff1
            value: 58.81758131092919
          - type: nauc_mrr_at_10_max
            value: 49.780365572616695
          - type: nauc_mrr_at_10_std
            value: 2.7068696565195944
          - type: nauc_mrr_at_1_diff1
            value: 61.67036882487055
          - type: nauc_mrr_at_1_max
            value: 45.455271042821714
          - type: nauc_mrr_at_1_std
            value: -0.9370526815458349
          - type: nauc_mrr_at_20_diff1
            value: 58.93674818203478
          - type: nauc_mrr_at_20_max
            value: 49.703218108625215
          - type: nauc_mrr_at_20_std
            value: 2.4473106598190415
          - type: nauc_mrr_at_3_diff1
            value: 59.046856598788445
          - type: nauc_mrr_at_3_max
            value: 49.37161726123392
          - type: nauc_mrr_at_3_std
            value: 1.5110936686701506
          - type: nauc_mrr_at_5_diff1
            value: 58.92289378915668
          - type: nauc_mrr_at_5_max
            value: 49.847638994134144
          - type: nauc_mrr_at_5_std
            value: 2.420421880131702
          - type: nauc_ndcg_at_1000_diff1
            value: 45.56062215161734
          - type: nauc_ndcg_at_1000_max
            value: 41.507152286702
          - type: nauc_ndcg_at_1000_std
            value: 2.79388283208751
          - type: nauc_ndcg_at_100_diff1
            value: 44.84064192570408
          - type: nauc_ndcg_at_100_max
            value: 41.50353573562353
          - type: nauc_ndcg_at_100_std
            value: 3.1804999773629357
          - type: nauc_ndcg_at_10_diff1
            value: 43.341482144213614
          - type: nauc_ndcg_at_10_max
            value: 41.159590898395074
          - type: nauc_ndcg_at_10_std
            value: 2.945242338240843
          - type: nauc_ndcg_at_1_diff1
            value: 62.23623985611396
          - type: nauc_ndcg_at_1_max
            value: 45.04945770947091
          - type: nauc_ndcg_at_1_std
            value: -0.8804967656575725
          - type: nauc_ndcg_at_20_diff1
            value: 43.905372612093664
          - type: nauc_ndcg_at_20_max
            value: 41.797709837872446
          - type: nauc_ndcg_at_20_std
            value: 3.1853356915569653
          - type: nauc_ndcg_at_3_diff1
            value: 44.18163998834299
          - type: nauc_ndcg_at_3_max
            value: 38.352891017864636
          - type: nauc_ndcg_at_3_std
            value: -0.8235767021150929
          - type: nauc_ndcg_at_5_diff1
            value: 43.41374688421302
          - type: nauc_ndcg_at_5_max
            value: 40.390365601593956
          - type: nauc_ndcg_at_5_std
            value: 1.6743650108127537
          - type: nauc_precision_at_1000_diff1
            value: -9.711058370691381
          - type: nauc_precision_at_1000_max
            value: 6.97321343449286
          - type: nauc_precision_at_1000_std
            value: 7.933531916622121
          - type: nauc_precision_at_100_diff1
            value: -8.247029644152319
          - type: nauc_precision_at_100_max
            value: 10.86740140944616
          - type: nauc_precision_at_100_std
            value: 9.581885544675918
          - type: nauc_precision_at_10_diff1
            value: -2.409043695429943
          - type: nauc_precision_at_10_max
            value: 21.04733206074314
          - type: nauc_precision_at_10_std
            value: 10.03334651647101
          - type: nauc_precision_at_1_diff1
            value: 62.23623985611396
          - type: nauc_precision_at_1_max
            value: 45.04945770947091
          - type: nauc_precision_at_1_std
            value: -0.8804967656575725
          - type: nauc_precision_at_20_diff1
            value: -5.230303656931621
          - type: nauc_precision_at_20_max
            value: 17.77799716919181
          - type: nauc_precision_at_20_std
            value: 10.739127998618654
          - type: nauc_precision_at_3_diff1
            value: 10.40376424999862
          - type: nauc_precision_at_3_max
            value: 30.933333400254035
          - type: nauc_precision_at_3_std
            value: 6.126209127968004
          - type: nauc_precision_at_5_diff1
            value: 3.147398101830739
          - type: nauc_precision_at_5_max
            value: 27.1746309955971
          - type: nauc_precision_at_5_std
            value: 8.874723615388788
          - type: nauc_recall_at_1000_diff1
            value: 5.055940692380908
          - type: nauc_recall_at_1000_max
            value: 22.42031123370267
          - type: nauc_recall_at_1000_std
            value: 27.75476692527869
          - type: nauc_recall_at_100_diff1
            value: 17.86391178198642
          - type: nauc_recall_at_100_max
            value: 34.776134863678955
          - type: nauc_recall_at_100_std
            value: 18.96377158778504
          - type: nauc_recall_at_10_diff1
            value: 24.863097695413597
          - type: nauc_recall_at_10_max
            value: 37.697411651507444
          - type: nauc_recall_at_10_std
            value: 9.519849994253967
          - type: nauc_recall_at_1_diff1
            value: 49.56605205141156
          - type: nauc_recall_at_1_max
            value: 26.251096698710384
          - type: nauc_recall_at_1_std
            value: -4.580748485387834
          - type: nauc_recall_at_20_diff1
            value: 22.440602811005636
          - type: nauc_recall_at_20_max
            value: 39.538861316515
          - type: nauc_recall_at_20_std
            value: 11.363269553121468
          - type: nauc_recall_at_3_diff1
            value: 32.80302839873736
          - type: nauc_recall_at_3_max
            value: 32.53105685012729
          - type: nauc_recall_at_3_std
            value: -0.7140166410605693
          - type: nauc_recall_at_5_diff1
            value: 29.375386639154865
          - type: nauc_recall_at_5_max
            value: 36.91045781164083
          - type: nauc_recall_at_5_std
            value: 4.725419050262578
          - type: ndcg_at_1
            value: 67.13900000000001
          - type: ndcg_at_10
            value: 72.392
          - type: ndcg_at_100
            value: 75.25800000000001
          - type: ndcg_at_1000
            value: 75.982
          - type: ndcg_at_20
            value: 73.783
          - type: ndcg_at_3
            value: 67.269
          - type: ndcg_at_5
            value: 69.807
          - type: precision_at_1
            value: 67.13900000000001
          - type: precision_at_10
            value: 13.327
          - type: precision_at_100
            value: 1.5559999999999998
          - type: precision_at_1000
            value: 0.164
          - type: precision_at_20
            value: 7.119000000000001
          - type: precision_at_3
            value: 35.599
          - type: precision_at_5
            value: 23.936
          - type: recall_at_1
            value: 47.370000000000005
          - type: recall_at_10
            value: 82.16
          - type: recall_at_100
            value: 93.34
          - type: recall_at_1000
            value: 98.202
          - type: recall_at_20
            value: 86.687
          - type: recall_at_3
            value: 69.319
          - type: recall_at_5
            value: 75.637
        task:
          type: Retrieval
      - dataset:
          config: default
          name: MTEB RuReviewsClassification (default)
          revision: f6d2c31f4dc6b88f468552750bfec05b4b41b05a
          split: test
          type: ai-forever/ru-reviews-classification
        metrics:
          - type: accuracy
            value: 75.0537109375
          - type: f1
            value: 74.00523205209554
          - type: f1_weighted
            value: 74.00436782840376
          - type: main_score
            value: 75.0537109375
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB RuSTSBenchmarkSTS (default)
          revision: 7cf24f325c6da6195df55bef3d86b5e0616f3018
          split: test
          type: ai-forever/ru-stsbenchmark-sts
        metrics:
          - type: cosine_pearson
            value: 81.10255413476487
          - type: cosine_spearman
            value: 81.40020843157141
          - type: euclidean_pearson
            value: 81.25155479902466
          - type: euclidean_spearman
            value: 81.40020831064922
          - type: main_score
            value: 81.40020843157141
          - type: manhattan_pearson
            value: 81.1493715249014
          - type: manhattan_spearman
            value: 81.30973667941649
          - type: pearson
            value: 81.10255413476487
          - type: spearman
            value: 81.40020843157141
        task:
          type: STS
      - dataset:
          config: default
          name: MTEB RuSciBenchGRNTIClassification (default)
          revision: 673a610d6d3dd91a547a0d57ae1b56f37ebbf6a1
          split: test
          type: ai-forever/ru-scibench-grnti-classification
        metrics:
          - type: accuracy
            value: 69.8974609375
          - type: f1
            value: 68.57837564785511
          - type: f1_weighted
            value: 68.59030489460784
          - type: main_score
            value: 69.8974609375
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB RuSciBenchGRNTIClusteringP2P (default)
          revision: 673a610d6d3dd91a547a0d57ae1b56f37ebbf6a1
          split: test
          type: ai-forever/ru-scibench-grnti-classification
        metrics:
          - type: main_score
            value: 67.03880348548029
          - type: v_measure
            value: 67.03880348548029
          - type: v_measure_std
            value: 0.6126278133139618
        task:
          type: Clustering
      - dataset:
          config: default
          name: MTEB RuSciBenchOECDClassification (default)
          revision: 26c88e99dcaba32bb45d0e1bfc21902337f6d471
          split: test
          type: ai-forever/ru-scibench-oecd-classification
        metrics:
          - type: accuracy
            value: 54.63378906250001
          - type: f1
            value: 51.34306420274629
          - type: f1_weighted
            value: 51.33495867493914
          - type: main_score
            value: 54.63378906250001
        task:
          type: Classification
      - dataset:
          config: default
          name: MTEB RuSciBenchOECDClusteringP2P (default)
          revision: 26c88e99dcaba32bb45d0e1bfc21902337f6d471
          split: test
          type: ai-forever/ru-scibench-oecd-classification
        metrics:
          - type: main_score
            value: 56.55947121159027
          - type: v_measure
            value: 56.55947121159027
          - type: v_measure_std
            value: 0.5498882006880662
        task:
          type: Clustering
      - dataset:
          config: ru
          name: MTEB STS22 (ru)
          revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3
          split: test
          type: mteb/sts22-crosslingual-sts
        metrics:
          - type: cosine_pearson
            value: 61.833294921667914
          - type: cosine_spearman
            value: 63.53967536726357
          - type: euclidean_pearson
            value: 60.382865218855805
          - type: euclidean_spearman
            value: 63.53967536726357
          - type: main_score
            value: 63.53967536726357
          - type: manhattan_pearson
            value: 60.24879015304578
          - type: manhattan_spearman
            value: 63.42305760430092
          - type: pearson
            value: 61.833294921667914
          - type: spearman
            value: 63.53967536726357
        task:
          type: STS
      - dataset:
          config: default
          name: MTEB SensitiveTopicsClassification (default)
          revision: 416b34a802308eac30e4192afc0ff99bb8dcc7f2
          split: test
          type: ai-forever/sensitive-topics-classification
        metrics:
          - type: accuracy
            value: 39.8193359375
          - type: f1
            value: 55.46591740935434
          - type: lrap
            value: 66.50980631510454
          - type: main_score
            value: 39.8193359375
        task:
          type: MultilabelClassification
      - dataset:
          config: default
          name: MTEB TERRa (default)
          revision: 7b58f24536063837d644aab9a023c62199b2a612
          split: dev
          type: ai-forever/terra-pairclassification
        metrics:
          - type: cosine_accuracy
            value: 66.77524429967427
          - type: cosine_accuracy_threshold
            value: 55.58975338935852
          - type: cosine_ap
            value: 66.4567219323658
          - type: cosine_f1
            value: 70.64676616915423
          - type: cosine_f1_threshold
            value: 45.55969536304474
          - type: cosine_precision
            value: 57.028112449799195
          - type: cosine_recall
            value: 92.81045751633987
          - type: dot_accuracy
            value: 66.77524429967427
          - type: dot_accuracy_threshold
            value: 55.589759349823
          - type: dot_ap
            value: 66.4567219323658
          - type: dot_f1
            value: 70.64676616915423
          - type: dot_f1_threshold
            value: 45.55969536304474
          - type: dot_precision
            value: 57.028112449799195
          - type: dot_recall
            value: 92.81045751633987
          - type: euclidean_accuracy
            value: 66.77524429967427
          - type: euclidean_accuracy_threshold
            value: 94.24455165863037
          - type: euclidean_ap
            value: 66.4567219323658
          - type: euclidean_f1
            value: 70.64676616915423
          - type: euclidean_f1_threshold
            value: 104.34587001800537
          - type: euclidean_precision
            value: 57.028112449799195
          - type: euclidean_recall
            value: 92.81045751633987
          - type: main_score
            value: 66.4567219323658
          - type: manhattan_accuracy
            value: 66.77524429967427
          - type: manhattan_accuracy_threshold
            value: 2865.5345916748047
          - type: manhattan_ap
            value: 66.26659863769075
          - type: manhattan_f1
            value: 70.8542713567839
          - type: manhattan_f1_threshold
            value: 3212.3912811279297
          - type: manhattan_precision
            value: 57.55102040816327
          - type: manhattan_recall
            value: 92.15686274509804
          - type: max_accuracy
            value: 66.77524429967427
          - type: max_ap
            value: 66.4567219323658
          - type: max_f1
            value: 70.8542713567839
          - type: max_precision
            value: 57.55102040816327
          - type: max_recall
            value: 92.81045751633987
          - type: similarity_accuracy
            value: 66.77524429967427
          - type: similarity_accuracy_threshold
            value: 55.58975338935852
          - type: similarity_ap
            value: 66.4567219323658
          - type: similarity_f1
            value: 70.64676616915423
          - type: similarity_f1_threshold
            value: 45.55969536304474
          - type: similarity_precision
            value: 57.028112449799195
          - type: similarity_recall
            value: 92.81045751633987
        task:
          type: PairClassification
license: mit
language:
  - ru
  - en
tags:
  - mteb
  - transformers
  - sentence-transformers
base_model: ai-forever/FRED-T5-1.7B
pipeline_tag: feature-extraction

Model Card for FRIDA

FRIDA is a fully fine-tuned, general-purpose text embedding model inspired by the T5 denoising architecture. It is based on the encoder of the FRED-T5 model and continues the line of research on text embedding models (ruMTEB, ru-en-RoSBERTa). It was pre-trained on a Russian-English dataset and then fine-tuned to improve performance on the target tasks.

For more details about the model, please refer to our technical report [TODO].

Usage

The model can be used as is, provided that each input text starts with a task prefix. CLS pooling is recommended. The choice of prefix and pooling method depends on the task.

We use the following basic rules to choose a prefix:

  • "search_query: " and "search_document: " prefixes are for retrieving an answer or a relevant paragraph
  • "paraphrase: " prefix is for symmetric paraphrase-related tasks (STS, paraphrase mining, deduplication)
  • "categorize: " prefix is for asymmetric matching of a document title and body (e.g. news, scientific papers, social posts)
  • "categorize_sentiment: " prefix is for tasks that rely on sentiment features (e.g. hate, toxicity, emotion)
  • "categorize_topic: " prefix is for tasks where texts need to be grouped by topic
  • "categorize_entailment: " prefix is for the textual entailment task (NLI)
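
The prefix rules above can be wrapped in a small helper so that a prefix is never mistyped or forgotten. This is only a minimal sketch: the names `TASK_PREFIXES` and `add_prefix` are hypothetical and are not part of FRIDA, Transformers, or SentenceTransformers. The prefix strings themselves are the ones listed above.

```python
# Hypothetical helper: the names below are illustrative, not a library API.
# The prefix strings are copied from the list of rules above.
TASK_PREFIXES = {
    "search_query": "search_query: ",
    "search_document": "search_document: ",
    "paraphrase": "paraphrase: ",
    "categorize": "categorize: ",
    "categorize_sentiment": "categorize_sentiment: ",
    "categorize_topic": "categorize_topic: ",
    "categorize_entailment": "categorize_entailment: ",
}


def add_prefix(text, task):
    """Return `text` with the prefix for `task` prepended."""
    if task not in TASK_PREFIXES:
        known = ", ".join(sorted(TASK_PREFIXES))
        raise ValueError(f"Unknown task {task!r}; expected one of: {known}")
    return TASK_PREFIXES[task] + text


# Build model inputs for a symmetric (paraphrase) pair.
pair = [
    add_prefix("В Ярославской области разрешили работу бань, но без посетителей", "paraphrase"),
    add_prefix("Ярославским баням разрешили работать без посетителей", "paraphrase"),
]
```

The resulting strings can be passed to the tokenizer or to `model.encode` in the same way as the hand-written inputs in the examples.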

To better tailor the model to your needs, you can fine-tune it with relevant high-quality Russian and English datasets.

Below are examples of encoding texts with the Transformers and SentenceTransformers libraries.

Transformers

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, T5EncoderModel


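def pool(hidden_state, mask, pooling_method="cls"):
    # hidden_state: (batch, seq_len, dim); mask: (batch, seq_len) attention mask
    if pooling_method == "mean":
        # average the token embeddings, ignoring padding positions
        s = torch.sum(hidden_state * mask.unsqueeze(-1).float(), dim=1)
        d = mask.sum(dim=1, keepdim=True).float()
        return s / d
    elif pooling_method == "cls":
        # use the embedding of the first token
        return hidden_state[:, 0]
    raise ValueError(f"Unknown pooling method: {pooling_method}")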

inputs = [
    # first text of each pair
    "paraphrase: В Ярославской области разрешили работу бань, но без посетителей",
    "categorize_entailment: Женщину доставили в больницу, за ее жизнь сейчас борются врачи.",
    "search_query: Сколько программистов нужно, чтобы вкрутить лампочку?",
    # second text of each pair
    "paraphrase: Ярославским баням разрешили работать без посетителей",
    "categorize_entailment: Женщину спасают врачи.",
    "search_document: Чтобы вкрутить лампочку, требуется три программиста: один напишет программу извлечения лампочки, другой — вкручивания лампочки, а третий проведет тестирование."
]

tokenizer = AutoTokenizer.from_pretrained("ai-forever/FRIDA")
model = T5EncoderModel.from_pretrained("ai-forever/FRIDA")

tokenized_inputs = tokenizer(inputs, max_length=512, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**tokenized_inputs)
    
embeddings = pool(
    outputs.last_hidden_state, 
    tokenized_inputs["attention_mask"],
    pooling_method="cls" # or try "mean"
)

embeddings = F.normalize(embeddings, p=2, dim=1)
sim_scores = embeddings[:3] @ embeddings[3:].T
print(sim_scores.diag().tolist())
# [0.9360030293464661, 0.8591322302818298, 0.728583037853241]

SentenceTransformers

from sentence_transformers import SentenceTransformer

inputs = [
    # first text of each pair
    "paraphrase: В Ярославской области разрешили работу бань, но без посетителей",
    "categorize_entailment: Женщину доставили в больницу, за ее жизнь сейчас борются врачи.",
    "search_query: Сколько программистов нужно, чтобы вкрутить лампочку?",
    # second text of each pair
    "paraphrase: Ярославским баням разрешили работать без посетителей",
    "categorize_entailment: Женщину спасают врачи.",
    "search_document: Чтобы вкрутить лампочку, требуется три программиста: один напишет программу извлечения лампочки, другой — вкручивания лампочки, а третий проведет тестирование."
]

# loads model with CLS pooling
model = SentenceTransformer("ai-forever/FRIDA")

# embeddings are normalized by default
embeddings = model.encode(inputs, convert_to_tensor=True)

sim_scores = embeddings[:3] @ embeddings[3:].T
print(sim_scores.diag().tolist())
# [0.9360026717185974, 0.8591331243515015, 0.7285830974578857]

Alternatively, use prompts (requires sentence-transformers>=2.4.0):

from sentence_transformers import SentenceTransformer

# loads model with CLS pooling
model = SentenceTransformer("ai-forever/FRIDA")

paraphrase = model.encode(["В Ярославской области разрешили работу бань, но без посетителей", "Ярославским баням разрешили работать без посетителей"], prompt_name="paraphrase")
print(paraphrase[0] @ paraphrase[1].T) # 0.9360032

categorize_entailment = model.encode(["Женщину доставили в больницу, за ее жизнь сейчас борются врачи.", "Женщину спасают врачи."], prompt_name="categorize_entailment")
print(categorize_entailment[0] @ categorize_entailment[1].T) # 0.8591322

query_embedding = model.encode("Сколько программистов нужно, чтобы вкрутить лампочку?", prompt_name="search_query")
document_embedding = model.encode("Чтобы вкрутить лампочку, требуется три программиста: один напишет программу извлечения лампочки, другой — вкручивания лампочки, а третий проведет тестирование.", prompt_name="search_document")
print(query_embedding @ document_embedding.T) # 0.7285831
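
The examples above score pairs with a plain matrix product. This works as a similarity measure because the embeddings are L2-normalized, so the dot product equals cosine similarity. The sketch below illustrates this with made-up 3-dimensional vectors in pure Python; the vectors and the helper names (`l2_normalize`, `dot`, `cosine`) are assumptions for illustration only, not model output or library functions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Made-up vectors standing in for real embeddings.
query = [1.0, 2.0, 2.0]
document = [2.0, 1.0, 2.0]

# After normalization, the dot product and cosine similarity coincide.
score_dot = dot(l2_normalize(query), l2_normalize(document))
score_cos = cosine(query, document)
print(abs(score_dot - score_cos) < 1e-12)
```

If embeddings are produced without normalization (for example, with a custom pooling function), they should be normalized first, as the Transformers example does with `F.normalize`.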

Authors

Citation

@misc{TODO
}

Limitations

The model is designed to process texts in Russian; its quality on English texts is unknown. The maximum input length is 512 tokens.