
bert-large-uncased_stereoset_classifieronly

This model is a fine-tuned version of bert-large-uncased on the stereoset dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6856
  • Accuracy: 0.5479
  • Tp: 0.3579
  • Tn: 0.1900
  • Fp: 0.3242
  • Fn: 0.1279

Tp, Tn, Fp, and Fn are the true-positive, true-negative, false-positive, and false-negative counts expressed as fractions of the evaluation set; the four values sum to 1, and Accuracy = Tp + Tn.

Model description

More information needed. The "_classifieronly" suffix in the checkpoint name suggests that only the classification head was trained while the bert-large-uncased encoder was kept frozen, but this is not documented on the card.

Intended uses & limitations

More information needed
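
Pending fuller documentation, here is a minimal inference sketch that loads the checkpoint as a standard sequence classifier via transformers. The label semantics are not documented on this card, so the sketch reports the raw class id rather than mapping it to a name:

```python
# Minimal inference sketch. The card does not document the label mapping,
# so the predicted class id is printed as-is instead of being interpreted
# as a (stereotype / anti-stereotype) label.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "henryscheible/bert-large-uncased_stereoset_classifieronly"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example sentence to classify.", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze(0)
print(f"predicted class id: {int(probs.argmax())}, probabilities: {probs.tolist()}")
```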

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
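
For reference, a minimal sketch of how these settings map onto transformers.TrainingArguments. The original training script is not included with this card, so anything beyond the listed values (such as the output directory) is an assumption:

```python
# Illustrative mapping of the listed hyperparameters onto the standard
# transformers Trainer configuration; values mirror the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-large-uncased_stereoset_classifieronly",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,      # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,   # and epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=30,
)
```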

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Tp     | Tn     | Fp     | Fn     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:------:|:------:|:------:|
| 0.6938        | 0.43  | 20   | 0.6891          | 0.5573   | 0.2630 | 0.2943 | 0.2198 | 0.2229 |
| 0.703         | 0.85  | 40   | 0.6942          | 0.5455   | 0.3901 | 0.1554 | 0.3587 | 0.0958 |
| 0.6929        | 1.28  | 60   | 0.6882          | 0.5510   | 0.3069 | 0.2441 | 0.2700 | 0.1790 |
| 0.7044        | 1.7   | 80   | 0.6895          | 0.5471   | 0.3603 | 0.1868 | 0.3273 | 0.1256 |
| 0.6947        | 2.13  | 100  | 0.6874          | 0.5463   | 0.3006 | 0.2457 | 0.2684 | 0.1852 |
| 0.7115        | 2.55  | 120  | 0.6910          | 0.5479   | 0.3768 | 0.1711 | 0.3430 | 0.1091 |
| 0.7016        | 2.98  | 140  | 0.6879          | 0.5518   | 0.3430 | 0.2088 | 0.3053 | 0.1429 |
| 0.6952        | 3.4   | 160  | 0.6918          | 0.5361   | 0.3980 | 0.1381 | 0.3760 | 0.0879 |
| 0.6959        | 3.83  | 180  | 0.6910          | 0.5463   | 0.3878 | 0.1586 | 0.3556 | 0.0981 |
| 0.6924        | 4.26  | 200  | 0.6906          | 0.5471   | 0.3830 | 0.1641 | 0.3501 | 0.1028 |
| 0.6943        | 4.68  | 220  | 0.6877          | 0.5487   | 0.3195 | 0.2292 | 0.2849 | 0.1664 |
| 0.7052        | 5.11  | 240  | 0.6879          | 0.5644   | 0.2473 | 0.3171 | 0.1970 | 0.2386 |
| 0.6881        | 5.53  | 260  | 0.6889          | 0.5479   | 0.3791 | 0.1688 | 0.3454 | 0.1068 |
| 0.6971        | 5.96  | 280  | 0.6882          | 0.5463   | 0.3752 | 0.1711 | 0.3430 | 0.1107 |
| 0.6781        | 6.38  | 300  | 0.6930          | 0.5330   | 0.4035 | 0.1295 | 0.3846 | 0.0824 |
| 0.6992        | 6.81  | 320  | 0.6875          | 0.5495   | 0.3579 | 0.1915 | 0.3226 | 0.1279 |
| 0.6954        | 7.23  | 340  | 0.6868          | 0.5557   | 0.3195 | 0.2363 | 0.2779 | 0.1664 |
| 0.6949        | 7.66  | 360  | 0.6877          | 0.5479   | 0.3556 | 0.1923 | 0.3218 | 0.1303 |
| 0.6946        | 8.09  | 380  | 0.6899          | 0.5471   | 0.3878 | 0.1593 | 0.3548 | 0.0981 |
| 0.6877        | 8.51  | 400  | 0.6862          | 0.5542   | 0.3218 | 0.2323 | 0.2818 | 0.1641 |
| 0.6994        | 8.94  | 420  | 0.6890          | 0.5479   | 0.3823 | 0.1656 | 0.3485 | 0.1036 |
| 0.7061        | 9.36  | 440  | 0.6867          | 0.5620   | 0.2347 | 0.3273 | 0.1868 | 0.2512 |
| 0.6945        | 9.79  | 460  | 0.6893          | 0.5479   | 0.3878 | 0.1601 | 0.3540 | 0.0981 |
| 0.7078        | 10.21 | 480  | 0.6908          | 0.5353   | 0.3972 | 0.1381 | 0.3760 | 0.0887 |
| 0.6911        | 10.64 | 500  | 0.6858          | 0.5502   | 0.3108 | 0.2394 | 0.2747 | 0.1750 |
| 0.684         | 11.06 | 520  | 0.6875          | 0.5502   | 0.3768 | 0.1735 | 0.3407 | 0.1091 |
| 0.6925        | 11.49 | 540  | 0.6906          | 0.5369   | 0.3972 | 0.1397 | 0.3744 | 0.0887 |
| 0.7104        | 11.91 | 560  | 0.6856          | 0.5597   | 0.2527 | 0.3069 | 0.2072 | 0.2331 |
| 0.6919        | 12.34 | 580  | 0.6857          | 0.5479   | 0.3391 | 0.2088 | 0.3053 | 0.1468 |
| 0.6873        | 12.77 | 600  | 0.6903          | 0.5338   | 0.3987 | 0.1350 | 0.3791 | 0.0871 |
| 0.6915        | 13.19 | 620  | 0.6862          | 0.5471   | 0.3540 | 0.1931 | 0.3210 | 0.1319 |
| 0.6921        | 13.62 | 640  | 0.6859          | 0.5518   | 0.3485 | 0.2033 | 0.3108 | 0.1374 |
| 0.7092        | 14.04 | 660  | 0.6888          | 0.5479   | 0.3807 | 0.1672 | 0.3469 | 0.1052 |
| 0.6874        | 14.47 | 680  | 0.6851          | 0.5518   | 0.3210 | 0.2308 | 0.2834 | 0.1648 |
| 0.682         | 14.89 | 700  | 0.6877          | 0.5510   | 0.3744 | 0.1766 | 0.3375 | 0.1115 |
| 0.6953        | 15.32 | 720  | 0.6853          | 0.5526   | 0.3273 | 0.2253 | 0.2889 | 0.1586 |
| 0.7056        | 15.74 | 740  | 0.6882          | 0.5487   | 0.3885 | 0.1601 | 0.3540 | 0.0973 |
| 0.6776        | 16.17 | 760  | 0.6875          | 0.5471   | 0.3783 | 0.1688 | 0.3454 | 0.1075 |
| 0.6862        | 16.6  | 780  | 0.6863          | 0.5510   | 0.3642 | 0.1868 | 0.3273 | 0.1217 |
| 0.6827        | 17.02 | 800  | 0.6868          | 0.5510   | 0.3705 | 0.1805 | 0.3336 | 0.1154 |
| 0.7161        | 17.45 | 820  | 0.6878          | 0.5502   | 0.3791 | 0.1711 | 0.3430 | 0.1068 |
| 0.6991        | 17.87 | 840  | 0.6852          | 0.5487   | 0.3359 | 0.2127 | 0.3014 | 0.1499 |
| 0.6836        | 18.3  | 860  | 0.6876          | 0.5487   | 0.3830 | 0.1656 | 0.3485 | 0.1028 |
| 0.7023        | 18.72 | 880  | 0.6862          | 0.5487   | 0.3595 | 0.1892 | 0.3250 | 0.1264 |
| 0.6939        | 19.15 | 900  | 0.6854          | 0.5495   | 0.3485 | 0.2009 | 0.3132 | 0.1374 |
| 0.6883        | 19.57 | 920  | 0.6860          | 0.5479   | 0.3587 | 0.1892 | 0.3250 | 0.1272 |
| 0.6872        | 20.0  | 940  | 0.6866          | 0.5518   | 0.3697 | 0.1821 | 0.3320 | 0.1162 |
| 0.685         | 20.43 | 960  | 0.6861          | 0.5487   | 0.3595 | 0.1892 | 0.3250 | 0.1264 |
| 0.6771        | 20.85 | 980  | 0.6853          | 0.5510   | 0.3477 | 0.2033 | 0.3108 | 0.1381 |
| 0.6904        | 21.28 | 1000 | 0.6859          | 0.5487   | 0.3564 | 0.1923 | 0.3218 | 0.1295 |
| 0.6925        | 21.7  | 1020 | 0.6848          | 0.5518   | 0.3132 | 0.2386 | 0.2755 | 0.1727 |
| 0.6982        | 22.13 | 1040 | 0.6856          | 0.5463   | 0.3532 | 0.1931 | 0.3210 | 0.1327 |
| 0.7015        | 22.55 | 1060 | 0.6859          | 0.5479   | 0.3587 | 0.1892 | 0.3250 | 0.1272 |
| 0.6851        | 22.98 | 1080 | 0.6860          | 0.5518   | 0.3650 | 0.1868 | 0.3273 | 0.1209 |
| 0.6875        | 23.4  | 1100 | 0.6856          | 0.5463   | 0.3532 | 0.1931 | 0.3210 | 0.1327 |
| 0.7035        | 23.83 | 1120 | 0.6851          | 0.5510   | 0.3454 | 0.2057 | 0.3085 | 0.1405 |
| 0.699         | 24.26 | 1140 | 0.6846          | 0.5534   | 0.3281 | 0.2253 | 0.2889 | 0.1578 |
| 0.6954        | 24.68 | 1160 | 0.6851          | 0.5495   | 0.3485 | 0.2009 | 0.3132 | 0.1374 |
| 0.6881        | 25.11 | 1180 | 0.6851          | 0.5510   | 0.3485 | 0.2025 | 0.3116 | 0.1374 |
| 0.6931        | 25.53 | 1200 | 0.6862          | 0.5487   | 0.3666 | 0.1821 | 0.3320 | 0.1193 |
| 0.6967        | 25.96 | 1220 | 0.6868          | 0.5487   | 0.3752 | 0.1735 | 0.3407 | 0.1107 |
| 0.6826        | 26.38 | 1240 | 0.6863          | 0.5502   | 0.3689 | 0.1813 | 0.3328 | 0.1170 |
| 0.6927        | 26.81 | 1260 | 0.6857          | 0.5487   | 0.3587 | 0.1900 | 0.3242 | 0.1272 |
| 0.692         | 27.23 | 1280 | 0.6853          | 0.5471   | 0.3524 | 0.1947 | 0.3195 | 0.1334 |
| 0.6936        | 27.66 | 1300 | 0.6856          | 0.5479   | 0.3579 | 0.1900 | 0.3242 | 0.1279 |
| 0.6871        | 28.09 | 1320 | 0.6856          | 0.5487   | 0.3579 | 0.1907 | 0.3234 | 0.1279 |
| 0.6956        | 28.51 | 1340 | 0.6857          | 0.5487   | 0.3595 | 0.1892 | 0.3250 | 0.1264 |
| 0.6788        | 28.94 | 1360 | 0.6859          | 0.5479   | 0.3611 | 0.1868 | 0.3273 | 0.1248 |
| 0.6933        | 29.36 | 1380 | 0.6856          | 0.5479   | 0.3579 | 0.1900 | 0.3242 | 0.1279 |
| 0.6909        | 29.79 | 1400 | 0.6856          | 0.5479   | 0.3579 | 0.1900 | 0.3242 | 0.1279 |
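
The Tp/Tn/Fp/Fn columns are confusion-matrix cells normalized by the evaluation-set size, so the four values in each row sum to 1. A minimal sketch of a Trainer-style compute_metrics function that reproduces this reporting convention (an illustration, not the original evaluation code; it assumes binary labels with 1 as the positive class):

```python
import numpy as np

def compute_metrics(eval_pred):
    """Report confusion-matrix cells as fractions of the evaluation set,
    matching the Tp/Tn/Fp/Fn columns above (assumes binary labels {0, 1})."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    n = len(labels)
    tp = float(np.sum((preds == 1) & (labels == 1))) / n
    tn = float(np.sum((preds == 0) & (labels == 0))) / n
    fp = float(np.sum((preds == 1) & (labels == 0))) / n
    fn = float(np.sum((preds == 0) & (labels == 1))) / n
    return {"accuracy": tp + tn, "tp": tp, "tn": tn, "fp": fp, "fn": fn}
```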

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.13.1
  • Datasets 2.10.1
  • Tokenizers 0.13.2