ST2_modernbert-base_hazard_V1
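This model is a fine-tuned version of answerdotai/ModernBERT-base on an unspecified dataset (the training run recorded no dataset name). It achieves the following results on the evaluation set, matching the final epoch (200) of the training log: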
- Loss: 1.4784
- F1: 0.8438
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 36
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch_fused (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 200
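The listed values imply a learning-rate schedule that can be sketched in plain Python. The snippet below is a minimal illustration, not the training code: it assumes no warmup (the card lists none), a single device, and no gradient accumulation. The figure of 128 steps per epoch comes from the results table, which logs step 128 at epoch 1.0. The function name `linear_lr` and the constants are hypothetical names introduced for this example.

```python
# Values copied from the hyperparameter list and the results table.
PEAK_LR = 5e-05
NUM_EPOCHS = 200
TRAIN_BATCH_SIZE = 36
STEPS_PER_EPOCH = 128  # the results table logs step 128 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under a linear decay schedule.

    warmup_steps=0 is an assumption; the card does not mention warmup.
    """
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Range of training-set sizes consistent with 128 steps per epoch at batch
# size 36, assuming the last partial batch is kept (an assumption).
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("total optimizer steps:", TOTAL_STEPS)
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step}: lr = {linear_lr(step):.2e}")
    print(f"implied training examples: {min_examples} to {max_examples}")
```

Under these assumptions the learning rate falls to half its peak around epoch 100 and reaches zero at the final step, and the training set holds somewhere between 4,573 and 4,608 examples.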
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
2.5191 | 1.0 | 128 | 1.0456 | 0.7432 |
0.9471 | 2.0 | 256 | 0.7727 | 0.7980 |
0.5678 | 3.0 | 384 | 0.8717 | 0.8031 |
0.2555 | 4.0 | 512 | 0.7572 | 0.8172 |
0.1684 | 5.0 | 640 | 0.8652 | 0.8206 |
0.1207 | 6.0 | 768 | 0.8450 | 0.8357 |
0.1161 | 7.0 | 896 | 0.9799 | 0.8240 |
0.0479 | 8.0 | 1024 | 0.9729 | 0.8212 |
0.0508 | 9.0 | 1152 | 0.9321 | 0.8460 |
0.0255 | 10.0 | 1280 | 0.9720 | 0.8499 |
0.0222 | 11.0 | 1408 | 1.0322 | 0.8224 |
0.0242 | 12.0 | 1536 | 1.0043 | 0.8340 |
0.0138 | 13.0 | 1664 | 1.0457 | 0.8253 |
0.0149 | 14.0 | 1792 | 1.1048 | 0.8228 |
0.0092 | 15.0 | 1920 | 1.0876 | 0.8321 |
0.0023 | 16.0 | 2048 | 1.0608 | 0.8406 |
0.0088 | 17.0 | 2176 | 1.1299 | 0.8305 |
0.0048 | 18.0 | 2304 | 1.1019 | 0.8414 |
0.0064 | 19.0 | 2432 | 1.0774 | 0.8277 |
0.0033 | 20.0 | 2560 | 1.1586 | 0.8345 |
0.0071 | 21.0 | 2688 | 1.0852 | 0.8252 |
0.0104 | 22.0 | 2816 | 1.1648 | 0.8245 |
0.0136 | 23.0 | 2944 | 1.2453 | 0.8153 |
0.0105 | 24.0 | 3072 | 1.0781 | 0.8333 |
0.0341 | 25.0 | 3200 | 1.1619 | 0.8297 |
0.0342 | 26.0 | 3328 | 1.1759 | 0.8313 |
0.0296 | 27.0 | 3456 | 1.2133 | 0.8248 |
0.0196 | 28.0 | 3584 | 1.1874 | 0.8421 |
0.0186 | 29.0 | 3712 | 1.1718 | 0.8292 |
0.0094 | 30.0 | 3840 | 1.2452 | 0.8467 |
0.0076 | 31.0 | 3968 | 1.2893 | 0.8359 |
0.0038 | 32.0 | 4096 | 1.3181 | 0.8402 |
0.0027 | 33.0 | 4224 | 1.3386 | 0.8451 |
0.001 | 34.0 | 4352 | 1.3360 | 0.8445 |
0.0026 | 35.0 | 4480 | 1.3282 | 0.8424 |
0.0024 | 36.0 | 4608 | 1.3332 | 0.8470 |
0.0004 | 37.0 | 4736 | 1.3393 | 0.8496 |
0.0028 | 38.0 | 4864 | 1.3387 | 0.8496 |
0.0023 | 39.0 | 4992 | 1.3492 | 0.8469 |
0.0017 | 40.0 | 5120 | 1.3429 | 0.8496 |
0.0027 | 41.0 | 5248 | 1.3550 | 0.8518 |
0.0021 | 42.0 | 5376 | 1.3583 | 0.8499 |
0.0014 | 43.0 | 5504 | 1.3619 | 0.8466 |
0.0013 | 44.0 | 5632 | 1.3568 | 0.8469 |
0.0012 | 45.0 | 5760 | 1.3727 | 0.8466 |
0.0038 | 46.0 | 5888 | 1.3737 | 0.8448 |
0.0021 | 47.0 | 6016 | 1.3665 | 0.8490 |
0.0024 | 48.0 | 6144 | 1.3730 | 0.8438 |
0.002 | 49.0 | 6272 | 1.3639 | 0.8485 |
0.002 | 50.0 | 6400 | 1.3754 | 0.8455 |
0.0026 | 51.0 | 6528 | 1.3731 | 0.8469 |
0.0016 | 52.0 | 6656 | 1.3841 | 0.8445 |
0.0019 | 53.0 | 6784 | 1.3772 | 0.8435 |
0.0022 | 54.0 | 6912 | 1.3832 | 0.8484 |
0.0021 | 55.0 | 7040 | 1.3866 | 0.8419 |
0.0013 | 56.0 | 7168 | 1.3917 | 0.8405 |
0.0015 | 57.0 | 7296 | 1.3902 | 0.8444 |
0.0017 | 58.0 | 7424 | 1.3941 | 0.8457 |
0.0019 | 59.0 | 7552 | 1.3992 | 0.8380 |
0.0019 | 60.0 | 7680 | 1.3967 | 0.8459 |
0.0023 | 61.0 | 7808 | 1.3910 | 0.8408 |
0.0022 | 62.0 | 7936 | 1.4057 | 0.8417 |
0.0019 | 63.0 | 8064 | 1.4024 | 0.8462 |
0.0012 | 64.0 | 8192 | 1.4142 | 0.8437 |
0.0022 | 65.0 | 8320 | 1.3902 | 0.8417 |
0.0012 | 66.0 | 8448 | 1.4110 | 0.8409 |
0.0016 | 67.0 | 8576 | 1.4014 | 0.8402 |
0.0015 | 68.0 | 8704 | 1.4132 | 0.8395 |
0.0011 | 69.0 | 8832 | 1.4247 | 0.8369 |
0.0029 | 70.0 | 8960 | 1.4302 | 0.8440 |
0.001 | 71.0 | 9088 | 1.3837 | 0.8371 |
0.1169 | 72.0 | 9216 | 1.1830 | 0.8102 |
0.097 | 73.0 | 9344 | 1.1205 | 0.8271 |
0.059 | 74.0 | 9472 | 1.2308 | 0.8477 |
0.0139 | 75.0 | 9600 | 1.2471 | 0.8398 |
0.0106 | 76.0 | 9728 | 1.2684 | 0.8316 |
0.0018 | 77.0 | 9856 | 1.2728 | 0.8325 |
0.0014 | 78.0 | 9984 | 1.2775 | 0.8322 |
0.0017 | 79.0 | 10112 | 1.2850 | 0.8303 |
0.0013 | 80.0 | 10240 | 1.2844 | 0.8303 |
0.0015 | 81.0 | 10368 | 1.2923 | 0.8332 |
0.0022 | 82.0 | 10496 | 1.2924 | 0.8320 |
0.002 | 83.0 | 10624 | 1.2962 | 0.8339 |
0.0009 | 84.0 | 10752 | 1.2992 | 0.8339 |
0.0012 | 85.0 | 10880 | 1.3002 | 0.8339 |
0.0018 | 86.0 | 11008 | 1.3037 | 0.8339 |
0.0019 | 87.0 | 11136 | 1.3079 | 0.8323 |
0.0009 | 88.0 | 11264 | 1.3084 | 0.8323 |
0.002 | 89.0 | 11392 | 1.3105 | 0.8343 |
0.0017 | 90.0 | 11520 | 1.3118 | 0.8380 |
0.0012 | 91.0 | 11648 | 1.3124 | 0.8345 |
0.0022 | 92.0 | 11776 | 1.3147 | 0.8366 |
0.0017 | 93.0 | 11904 | 1.3192 | 0.8343 |
0.0015 | 94.0 | 12032 | 1.3197 | 0.8343 |
0.0019 | 95.0 | 12160 | 1.3164 | 0.8363 |
0.0013 | 96.0 | 12288 | 1.3225 | 0.8348 |
0.0016 | 97.0 | 12416 | 1.3221 | 0.8354 |
0.0014 | 98.0 | 12544 | 1.3242 | 0.8378 |
0.0014 | 99.0 | 12672 | 1.3255 | 0.8378 |
0.0014 | 100.0 | 12800 | 1.3271 | 0.8388 |
0.0017 | 101.0 | 12928 | 1.3282 | 0.8378 |
0.0017 | 102.0 | 13056 | 1.3317 | 0.8382 |
0.0015 | 103.0 | 13184 | 1.3328 | 0.8382 |
0.0015 | 104.0 | 13312 | 1.3317 | 0.8382 |
0.0017 | 105.0 | 13440 | 1.3333 | 0.8401 |
0.0021 | 106.0 | 13568 | 1.3365 | 0.8388 |
0.0011 | 107.0 | 13696 | 1.3397 | 0.8392 |
0.0017 | 108.0 | 13824 | 1.3391 | 0.8398 |
0.0007 | 109.0 | 13952 | 1.3383 | 0.8411 |
0.002 | 110.0 | 14080 | 1.3450 | 0.8408 |
0.0014 | 111.0 | 14208 | 1.3477 | 0.8408 |
0.002 | 112.0 | 14336 | 1.3461 | 0.8411 |
0.0007 | 113.0 | 14464 | 1.3513 | 0.8417 |
0.0017 | 114.0 | 14592 | 1.3512 | 0.8421 |
0.0013 | 115.0 | 14720 | 1.3513 | 0.8408 |
0.001 | 116.0 | 14848 | 1.3515 | 0.8397 |
0.0015 | 117.0 | 14976 | 1.3584 | 0.8394 |
0.0016 | 118.0 | 15104 | 1.3529 | 0.8421 |
0.0008 | 119.0 | 15232 | 1.3539 | 0.8417 |
0.0022 | 120.0 | 15360 | 1.3544 | 0.8444 |
0.0016 | 121.0 | 15488 | 1.3628 | 0.8419 |
0.002 | 122.0 | 15616 | 1.3633 | 0.8417 |
0.0014 | 123.0 | 15744 | 1.3661 | 0.8397 |
0.0016 | 124.0 | 15872 | 1.3688 | 0.8418 |
0.0016 | 125.0 | 16000 | 1.3660 | 0.8417 |
0.0012 | 126.0 | 16128 | 1.3665 | 0.8431 |
0.0016 | 127.0 | 16256 | 1.3702 | 0.8395 |
0.0016 | 128.0 | 16384 | 1.3827 | 0.8416 |
0.002 | 129.0 | 16512 | 1.3598 | 0.8413 |
0.0011 | 130.0 | 16640 | 1.3711 | 0.8437 |
0.0014 | 131.0 | 16768 | 1.3608 | 0.8465 |
0.0023 | 132.0 | 16896 | 1.3945 | 0.8418 |
0.0015 | 133.0 | 17024 | 1.3688 | 0.8465 |
0.0011 | 134.0 | 17152 | 1.3865 | 0.8415 |
0.002 | 135.0 | 17280 | 1.3798 | 0.8435 |
0.0014 | 136.0 | 17408 | 1.3950 | 0.8436 |
0.0016 | 137.0 | 17536 | 1.3800 | 0.8435 |
0.0009 | 138.0 | 17664 | 1.4076 | 0.8415 |
0.0023 | 139.0 | 17792 | 1.3928 | 0.8436 |
0.0012 | 140.0 | 17920 | 1.3917 | 0.8412 |
0.0013 | 141.0 | 18048 | 1.3954 | 0.8436 |
0.0021 | 142.0 | 18176 | 1.3990 | 0.8436 |
0.0014 | 143.0 | 18304 | 1.3970 | 0.8436 |
0.001 | 144.0 | 18432 | 1.3982 | 0.8436 |
0.0017 | 145.0 | 18560 | 1.4059 | 0.8436 |
0.0016 | 146.0 | 18688 | 1.4020 | 0.8436 |
0.0015 | 147.0 | 18816 | 1.4094 | 0.8436 |
0.0013 | 148.0 | 18944 | 1.3975 | 0.8453 |
0.0011 | 149.0 | 19072 | 1.4131 | 0.8436 |
0.0018 | 150.0 | 19200 | 1.4027 | 0.8436 |
0.0013 | 151.0 | 19328 | 1.4186 | 0.8436 |
0.0006 | 152.0 | 19456 | 1.4225 | 0.8436 |
0.0027 | 153.0 | 19584 | 1.4087 | 0.8413 |
0.0013 | 154.0 | 19712 | 1.4294 | 0.8438 |
0.0018 | 155.0 | 19840 | 1.4011 | 0.8438 |
0.0009 | 156.0 | 19968 | 1.4305 | 0.8444 |
0.0016 | 157.0 | 20096 | 1.3805 | 0.8444 |
0.0013 | 158.0 | 20224 | 1.4375 | 0.8436 |
0.001 | 159.0 | 20352 | 1.4288 | 0.8436 |
0.0022 | 160.0 | 20480 | 1.4348 | 0.8438 |
0.001 | 161.0 | 20608 | 1.4338 | 0.8436 |
0.0015 | 162.0 | 20736 | 1.4358 | 0.8436 |
0.0019 | 163.0 | 20864 | 1.4315 | 0.8436 |
0.0009 | 164.0 | 20992 | 1.4362 | 0.8436 |
0.0017 | 165.0 | 21120 | 1.4363 | 0.8436 |
0.0006 | 166.0 | 21248 | 1.4398 | 0.8436 |
0.0018 | 167.0 | 21376 | 1.4364 | 0.8436 |
0.0017 | 168.0 | 21504 | 1.4435 | 0.8438 |
0.0015 | 169.0 | 21632 | 1.4482 | 0.8436 |
0.001 | 170.0 | 21760 | 1.4436 | 0.8436 |
0.0016 | 171.0 | 21888 | 1.4507 | 0.8436 |
0.0012 | 172.0 | 22016 | 1.4470 | 0.8436 |
0.001 | 173.0 | 22144 | 1.4505 | 0.8436 |
0.0017 | 174.0 | 22272 | 1.4478 | 0.8436 |
0.0011 | 175.0 | 22400 | 1.4470 | 0.8436 |
0.0013 | 176.0 | 22528 | 1.4537 | 0.8436 |
0.0012 | 177.0 | 22656 | 1.4564 | 0.8436 |
0.0015 | 178.0 | 22784 | 1.4572 | 0.8436 |
0.0015 | 179.0 | 22912 | 1.4587 | 0.8436 |
0.001 | 180.0 | 23040 | 1.4622 | 0.8436 |
0.0014 | 181.0 | 23168 | 1.4619 | 0.8436 |
0.0016 | 182.0 | 23296 | 1.4650 | 0.8436 |
0.0008 | 183.0 | 23424 | 1.4695 | 0.8438 |
0.0016 | 184.0 | 23552 | 1.4658 | 0.8438 |
0.0008 | 185.0 | 23680 | 1.4687 | 0.8436 |
0.0016 | 186.0 | 23808 | 1.4716 | 0.8436 |
0.0012 | 187.0 | 23936 | 1.4747 | 0.8436 |
0.001 | 188.0 | 24064 | 1.4733 | 0.8436 |
0.0014 | 189.0 | 24192 | 1.4756 | 0.8438 |
0.0012 | 190.0 | 24320 | 1.4786 | 0.8438 |
0.0012 | 191.0 | 24448 | 1.4776 | 0.8436 |
0.0008 | 192.0 | 24576 | 1.4775 | 0.8436 |
0.0016 | 193.0 | 24704 | 1.4768 | 0.8436 |
0.0012 | 194.0 | 24832 | 1.4759 | 0.8438 |
0.0012 | 195.0 | 24960 | 1.4774 | 0.8438 |
0.0014 | 196.0 | 25088 | 1.4777 | 0.8438 |
0.0014 | 197.0 | 25216 | 1.4794 | 0.8436 |
0.001 | 198.0 | 25344 | 1.4799 | 0.8436 |
0.0012 | 199.0 | 25472 | 1.4787 | 0.8438 |
0.0012 | 200.0 | 25600 | 1.4784 | 0.8438 |
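The log above can be scanned for the epoch one might prefer as a checkpoint. The sketch below is illustrative only: it copies a handful of rows from the table (a subset chosen for brevity, including the rows with the highest F1 and the lowest validation loss in the full table) and picks the best row under each criterion. The variable names are hypothetical.

```python
# (epoch, validation_loss, f1) rows copied from the results table.
# This is a subset for brevity, not the full 200-epoch log.
ROWS = [
    (1, 1.0456, 0.7432),
    (4, 0.7572, 0.8172),
    (10, 0.9720, 0.8499),
    (41, 1.3550, 0.8518),
    (72, 1.1830, 0.8102),
    (200, 1.4784, 0.8438),
]

best_f1 = max(ROWS, key=lambda row: row[2])      # highest F1
lowest_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
final = ROWS[-1]                                 # last logged epoch

if __name__ == "__main__":
    print(f"best F1:         {best_f1[2]:.4f} at epoch {best_f1[0]}")
    print(f"lowest val loss: {lowest_loss[1]:.4f} at epoch {lowest_loss[0]}")
    print(f"final epoch:     F1 {final[2]:.4f}, val loss {final[1]:.4f}")
```

Validation loss bottoms out at epoch 4 (0.7572) and F1 peaks at epoch 41 (0.8518), while the headline metrics match the epoch-200 row. The card does not say which checkpoint was published. In the Transformers `Trainer`, the `load_best_model_at_end` and `metric_for_best_model` arguments are one way to keep the best-scoring checkpoint rather than the last one.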
Framework versions
- Transformers 4.48.0.dev0
- Pytorch 2.4.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0
Model tree for BenPhan/ST2_modernbert-base_hazard_V1
Base model
answerdotai/ModernBERT-base