llama3.0-8B_finetune_QA_EDU_36k_samples_r64

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.4130
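
As a rough sanity check, the reported loss can be converted to perplexity. This is a minimal sketch that assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated on this card.

```python
import math

# Evaluation loss reported on the card.
eval_loss = 0.4130

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.3f}")
```

Under that assumption the evaluation perplexity comes out at roughly 1.51.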

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3.6e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
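
The hyperparameters above, combined with the last logged row of the training results (step 20650 at epoch 2.2576), allow a back-of-the-envelope estimate of the run's size and of the learning rate schedule. The sketch below is only an estimate: it assumes a single device, no gradient accumulation, and no warmup, none of which are stated on this card. The helper `linear_lr` is a hypothetical name that mirrors linear decay to zero and is not taken from the training code.

```python
# Values copied from this card.
LEARNING_RATE = 3.6e-05
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 5
LAST_LOGGED_STEP = 20650      # final row of the training results table
LAST_LOGGED_EPOCH = 2.2576    # epoch value logged at that step

# Assumptions (not stated on the card): one device, no gradient accumulation,
# and no warmup, so one optimizer step consumes TRAIN_BATCH_SIZE examples.
steps_per_epoch = round(LAST_LOGGED_STEP / LAST_LOGGED_EPOCH)
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE
planned_total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate under linear decay to zero, without warmup."""
    remaining = max(0, planned_total_steps - step)
    return LEARNING_RATE * remaining / planned_total_steps


print(f"steps per epoch     ~ {steps_per_epoch}")
print(f"training examples   ~ {approx_train_examples}")
print(f"planned total steps ~ {planned_total_steps}")
print(f"lr at step {LAST_LOGGED_STEP}: {linear_lr(LAST_LOGGED_STEP):.3e}")
```

Under these assumptions the estimate of roughly 36.6k training examples is consistent with the "36k_samples" in the model name, and the logged steps cover a little under half of the five configured epochs.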

Training results

Training Loss Epoch Step Validation Loss
1.0648 0.0055 50 0.9501
0.7039 0.0109 100 0.9215
0.7897 0.0164 150 0.9036
0.8434 0.0219 200 0.8875
0.7746 0.0273 250 0.8835
0.6896 0.0328 300 0.8671
0.824 0.0383 350 0.8548
1.0038 0.0437 400 0.8458
0.6472 0.0492 450 0.8378
0.8127 0.0547 500 0.8314
0.4811 0.0601 550 0.8286
0.8293 0.0656 600 0.8193
0.5957 0.0711 650 0.8136
0.7233 0.0765 700 0.8085
0.7291 0.0820 750 0.8020
0.822 0.0875 800 0.7976
0.6432 0.0929 850 0.7935
0.811 0.0984 900 0.7898
0.7663 0.1039 950 0.7878
1.0561 0.1093 1000 0.7835
0.5032 0.1148 1050 0.7826
0.8217 0.1203 1100 0.7813
0.6665 0.1257 1150 0.7722
0.9122 0.1312 1200 0.7701
0.7676 0.1367 1250 0.7683
0.6259 0.1421 1300 0.7648
0.6051 0.1476 1350 0.7614
0.8253 0.1531 1400 0.7540
0.7211 0.1585 1450 0.7529
0.6831 0.1640 1500 0.7473
0.6355 0.1695 1550 0.7476
0.7132 0.1749 1600 0.7456
0.7474 0.1804 1650 0.7418
0.928 0.1859 1700 0.7396
0.9145 0.1913 1750 0.7330
0.794 0.1968 1800 0.7357
0.477 0.2023 1850 0.7315
0.9995 0.2077 1900 0.7287
0.8334 0.2132 1950 0.7267
0.9413 0.2187 2000 0.7264
0.6294 0.2241 2050 0.7289
0.5025 0.2296 2100 0.7229
0.5802 0.2350 2150 0.7183
0.5811 0.2405 2200 0.7112
1.049 0.2460 2250 0.7085
0.6275 0.2514 2300 0.7045
0.5112 0.2569 2350 0.7006
0.8751 0.2624 2400 0.7032
0.6469 0.2678 2450 0.7019
0.7677 0.2733 2500 0.6989
0.8143 0.2788 2550 0.6988
1.2143 0.2842 2600 0.6997
0.8023 0.2897 2650 0.6942
0.5641 0.2952 2700 0.6904
0.8155 0.3006 2750 0.6943
0.5784 0.3061 2800 0.6861
0.7558 0.3116 2850 0.6778
0.6899 0.3170 2900 0.6795
0.5752 0.3225 2950 0.6781
0.8825 0.3280 3000 0.6786
0.6724 0.3334 3050 0.6765
0.6598 0.3389 3100 0.6738
0.6229 0.3444 3150 0.6721
0.5764 0.3498 3200 0.6686
0.5497 0.3553 3250 0.6706
0.3927 0.3608 3300 0.6668
0.4647 0.3662 3350 0.6667
0.3929 0.3717 3400 0.6648
0.8083 0.3772 3450 0.6652
0.5741 0.3826 3500 0.6615
0.6214 0.3881 3550 0.6576
0.7467 0.3936 3600 0.6566
0.9464 0.3990 3650 0.6546
0.841 0.4045 3700 0.6509
0.6993 0.4100 3750 0.6444
0.6812 0.4154 3800 0.6406
0.7938 0.4209 3850 0.6378
0.9625 0.4264 3900 0.6352
0.6543 0.4318 3950 0.6343
0.6272 0.4373 4000 0.6359
1.0398 0.4428 4050 0.6340
0.7372 0.4482 4100 0.6341
0.4583 0.4537 4150 0.6288
0.8163 0.4592 4200 0.6251
0.7215 0.4646 4250 0.6221
0.516 0.4701 4300 0.6261
0.9572 0.4756 4350 0.6216
0.6965 0.4810 4400 0.6191
0.8783 0.4865 4450 0.6188
0.6163 0.4920 4500 0.6172
0.7207 0.4974 4550 0.6177
0.4977 0.5029 4600 0.6158
0.6102 0.5084 4650 0.6125
0.9167 0.5138 4700 0.6116
0.5921 0.5193 4750 0.6107
0.6261 0.5248 4800 0.6061
0.5889 0.5302 4850 0.6064
0.3506 0.5357 4900 0.6012
0.3856 0.5412 4950 0.6037
0.6855 0.5466 5000 0.5985
0.6345 0.5521 5050 0.6021
0.8208 0.5576 5100 0.5984
0.5655 0.5630 5150 0.5958
0.4587 0.5685 5200 0.5952
0.5134 0.5740 5250 0.5946
0.3903 0.5794 5300 0.5955
0.5257 0.5849 5350 0.5925
0.6125 0.5904 5400 0.5913
0.6799 0.5958 5450 0.5903
0.7916 0.6013 5500 0.5884
0.7222 0.6068 5550 0.5863
0.4425 0.6122 5600 0.5821
0.6597 0.6177 5650 0.5839
0.4371 0.6232 5700 0.5815
0.4633 0.6286 5750 0.5804
0.6525 0.6341 5800 0.5808
0.5727 0.6396 5850 0.5803
0.424 0.6450 5900 0.5758
0.6045 0.6505 5950 0.5776
0.4846 0.6560 6000 0.5783
0.5949 0.6614 6050 0.5747
0.5127 0.6669 6100 0.5751
0.4289 0.6724 6150 0.5718
1.1129 0.6778 6200 0.5734
0.6932 0.6833 6250 0.5757
0.7736 0.6888 6300 0.5752
0.4592 0.6942 6350 0.5752
0.2358 0.6997 6400 0.5688
0.764 0.7051 6450 0.5659
0.6635 0.7106 6500 0.5671
0.5054 0.7161 6550 0.5679
0.5181 0.7215 6600 0.5697
0.5062 0.7270 6650 0.5699
0.3872 0.7325 6700 0.5665
0.6949 0.7379 6750 0.5649
0.8365 0.7434 6800 0.5661
0.5633 0.7489 6850 0.5626
0.889 0.7543 6900 0.5606
0.7509 0.7598 6950 0.5574
1.193 0.7653 7000 0.5550
0.6633 0.7707 7050 0.5529
0.3857 0.7762 7100 0.5591
0.3379 0.7817 7150 0.5504
0.7843 0.7871 7200 0.5501
0.4472 0.7926 7250 0.5520
0.3562 0.7981 7300 0.5472
0.3685 0.8035 7350 0.5472
0.5075 0.8090 7400 0.5477
0.5256 0.8145 7450 0.5465
0.5499 0.8199 7500 0.5452
0.7681 0.8254 7550 0.5462
0.7673 0.8309 7600 0.5495
0.4798 0.8363 7650 0.5441
0.5003 0.8418 7700 0.5445
0.5173 0.8473 7750 0.5440
0.3333 0.8527 7800 0.5426
0.4621 0.8582 7850 0.5382
0.4846 0.8637 7900 0.5413
0.4184 0.8691 7950 0.5408
0.4504 0.8746 8000 0.5386
0.5621 0.8801 8050 0.5362
0.4928 0.8855 8100 0.5336
0.4746 0.8910 8150 0.5311
0.4835 0.8965 8200 0.5304
0.3912 0.9019 8250 0.5292
0.621 0.9074 8300 0.5287
0.8945 0.9129 8350 0.5275
0.4848 0.9183 8400 0.5277
0.8911 0.9238 8450 0.5268
0.6915 0.9293 8500 0.5258
0.6046 0.9347 8550 0.5256
0.5119 0.9402 8600 0.5253
0.8352 0.9457 8650 0.5249
0.7015 0.9511 8700 0.5263
0.4502 0.9566 8750 0.5233
0.5712 0.9621 8800 0.5218
0.8441 0.9675 8850 0.5193
0.6835 0.9730 8900 0.5211
0.5472 0.9785 8950 0.5199
0.316 0.9839 9000 0.5175
0.7185 0.9894 9050 0.5169
0.3761 0.9949 9100 0.5178
0.5343 1.0003 9150 0.5164
0.7962 1.0058 9200 0.5161
0.3389 1.0113 9250 0.5138
0.4794 1.0167 9300 0.5131
0.5351 1.0222 9350 0.5169
0.3571 1.0277 9400 0.5178
0.3144 1.0331 9450 0.5136
0.5541 1.0386 9500 0.5144
0.3353 1.0441 9550 0.5119
0.4068 1.0495 9600 0.5134
0.3882 1.0550 9650 0.5100
0.2819 1.0605 9700 0.5082
0.3234 1.0659 9750 0.5094
0.3772 1.0714 9800 0.5063
0.4083 1.0769 9850 0.5090
0.4886 1.0823 9900 0.5069
0.283 1.0878 9950 0.5082
0.7671 1.0933 10000 0.5068
0.3055 1.0987 10050 0.5056
0.4367 1.1042 10100 0.5071
0.5444 1.1097 10150 0.5057
0.4949 1.1151 10200 0.5061
0.3558 1.1206 10250 0.5086
0.4746 1.1261 10300 0.5089
0.4472 1.1315 10350 0.5026
0.4686 1.1370 10400 0.5024
0.4081 1.1425 10450 0.5042
0.2828 1.1479 10500 0.5017
0.749 1.1534 10550 0.5011
0.2499 1.1588 10600 0.5013
0.3395 1.1643 10650 0.5021
0.3409 1.1698 10700 0.4978
0.674 1.1752 10750 0.4987
0.5194 1.1807 10800 0.4948
0.3518 1.1862 10850 0.4944
0.6073 1.1916 10900 0.4919
0.3766 1.1971 10950 0.4946
0.4954 1.2026 11000 0.4956
0.2772 1.2080 11050 0.4988
0.4468 1.2135 11100 0.4929
0.4541 1.2190 11150 0.4932
0.5671 1.2244 11200 0.4947
0.4888 1.2299 11250 0.4910
0.568 1.2354 11300 0.4907
0.3026 1.2408 11350 0.4911
0.3755 1.2463 11400 0.4896
0.498 1.2518 11450 0.4910
0.3694 1.2572 11500 0.4901
0.5963 1.2627 11550 0.4890
0.4029 1.2682 11600 0.4875
0.4503 1.2736 11650 0.4878
0.57 1.2791 11700 0.4850
0.4235 1.2846 11750 0.4843
0.2921 1.2900 11800 0.4840
0.7008 1.2955 11850 0.4853
0.4751 1.3010 11900 0.4865
0.2681 1.3064 11950 0.4854
0.342 1.3119 12000 0.4834
0.4396 1.3174 12050 0.4841
0.4525 1.3228 12100 0.4823
0.3439 1.3283 12150 0.4806
0.4636 1.3338 12200 0.4816
0.5279 1.3392 12250 0.4787
0.4047 1.3447 12300 0.4798
0.3597 1.3502 12350 0.4786
0.5365 1.3556 12400 0.4762
0.6849 1.3611 12450 0.4748
0.3914 1.3666 12500 0.4725
0.5433 1.3720 12550 0.4725
0.3853 1.3775 12600 0.4726
0.2984 1.3830 12650 0.4732
0.3082 1.3884 12700 0.4728
0.3704 1.3939 12750 0.4739
0.4911 1.3994 12800 0.4739
0.3299 1.4048 12850 0.4757
0.3212 1.4103 12900 0.4768
0.5281 1.4158 12950 0.4762
0.3491 1.4212 13000 0.4751
0.2243 1.4267 13050 0.4727
0.3651 1.4322 13100 0.4707
0.2275 1.4376 13150 0.4688
0.3817 1.4431 13200 0.4689
0.3759 1.4486 13250 0.4679
0.4378 1.4540 13300 0.4663
0.3523 1.4595 13350 0.4642
0.5074 1.4650 13400 0.4641
0.426 1.4704 13450 0.4660
0.3082 1.4759 13500 0.4618
0.2244 1.4814 13550 0.4662
0.5025 1.4868 13600 0.4646
0.3179 1.4923 13650 0.4641
0.275 1.4978 13700 0.4626
0.5281 1.5032 13750 0.4599
0.3667 1.5087 13800 0.4592
0.4539 1.5142 13850 0.4591
0.4156 1.5196 13900 0.4610
0.1621 1.5251 13950 0.4585
0.4954 1.5306 14000 0.4591
0.4589 1.5360 14050 0.4607
0.41 1.5415 14100 0.4583
0.4453 1.5470 14150 0.4545
0.244 1.5524 14200 0.4546
0.5205 1.5579 14250 0.4565
0.2065 1.5634 14300 0.4556
0.4503 1.5688 14350 0.4553
0.482 1.5743 14400 0.4524
0.2292 1.5798 14450 0.4525
0.4871 1.5852 14500 0.4517
0.4763 1.5907 14550 0.4539
0.2866 1.5962 14600 0.4552
0.4019 1.6016 14650 0.4557
0.5441 1.6071 14700 0.4551
0.4762 1.6126 14750 0.4533
0.3066 1.6180 14800 0.4526
0.4991 1.6235 14850 0.4526
0.311 1.6289 14900 0.4505
0.2365 1.6344 14950 0.4506
0.3477 1.6399 15000 0.4514
0.5098 1.6453 15050 0.4518
0.2939 1.6508 15100 0.4508
0.1844 1.6563 15150 0.4499
0.4786 1.6617 15200 0.4492
0.4923 1.6672 15250 0.4468
0.3181 1.6727 15300 0.4489
0.5213 1.6781 15350 0.4454
0.3283 1.6836 15400 0.4445
0.2816 1.6891 15450 0.4455
0.3369 1.6945 15500 0.4443
0.3503 1.7000 15550 0.4429
0.359 1.7055 15600 0.4439
0.3685 1.7109 15650 0.4426
0.2218 1.7164 15700 0.4440
0.3213 1.7219 15750 0.4417
0.5844 1.7273 15800 0.4404
0.461 1.7328 15850 0.4432
0.4323 1.7383 15900 0.4414
0.3304 1.7437 15950 0.4409
0.2571 1.7492 16000 0.4392
0.3406 1.7547 16050 0.4371
0.7021 1.7601 16100 0.4365
0.2579 1.7656 16150 0.4350
0.3533 1.7711 16200 0.4336
0.5392 1.7765 16250 0.4312
0.4225 1.7820 16300 0.4321
0.5097 1.7875 16350 0.4322
0.2512 1.7929 16400 0.4305
0.2924 1.7984 16450 0.4317
0.5928 1.8039 16500 0.4312
0.4437 1.8093 16550 0.4299
0.3437 1.8148 16600 0.4307
0.3557 1.8203 16650 0.4311
0.276 1.8257 16700 0.4294
0.3173 1.8312 16750 0.4294
0.2305 1.8367 16800 0.4309
0.372 1.8421 16850 0.4265
0.6609 1.8476 16900 0.4274
0.4588 1.8531 16950 0.4247
0.3001 1.8585 17000 0.4229
0.3603 1.8640 17050 0.4228
0.4153 1.8695 17100 0.4214
0.4851 1.8749 17150 0.4230
0.3846 1.8804 17200 0.4240
0.2041 1.8859 17250 0.4234
0.5593 1.8913 17300 0.4239
0.4937 1.8968 17350 0.4228
0.5448 1.9023 17400 0.4224
0.2487 1.9077 17450 0.4208
0.3078 1.9132 17500 0.4202
0.2279 1.9187 17550 0.4196
0.5487 1.9241 17600 0.4198
0.4874 1.9296 17650 0.4175
0.3608 1.9351 17700 0.4165
0.2441 1.9405 17750 0.4166
0.2644 1.9460 17800 0.4134
0.3575 1.9515 17850 0.4115
0.721 1.9569 17900 0.4133
0.4024 1.9624 17950 0.4130
0.4279 1.9679 18000 0.4140
0.7236 1.9733 18050 0.4117
0.3854 1.9788 18100 0.4117
0.3183 1.9843 18150 0.4098
0.3771 1.9897 18200 0.4112
0.2921 1.9952 18250 0.4112
0.2658 2.0007 18300 0.4126
0.1989 2.0061 18350 0.4202
0.262 2.0116 18400 0.4232
0.2986 2.0171 18450 0.4209
0.2186 2.0225 18500 0.4226
0.3781 2.0280 18550 0.4204
0.4399 2.0335 18600 0.4186
0.2152 2.0389 18650 0.4219
0.2351 2.0444 18700 0.4242
0.2159 2.0499 18750 0.4195
0.3344 2.0553 18800 0.4177
0.3056 2.0608 18850 0.4182
0.1924 2.0663 18900 0.4204
0.315 2.0717 18950 0.4197
0.2728 2.0772 19000 0.4210
0.3754 2.0827 19050 0.4218
0.1673 2.0881 19100 0.4213
0.1615 2.0936 19150 0.4209
0.2859 2.0990 19200 0.4185
0.2025 2.1045 19250 0.4197
0.4237 2.1100 19300 0.4216
0.2878 2.1154 19350 0.4213
0.2416 2.1209 19400 0.4203
0.3055 2.1264 19450 0.4182
0.2704 2.1318 19500 0.4191
0.1831 2.1373 19550 0.4215
0.178 2.1428 19600 0.4202
0.2575 2.1482 19650 0.4139
0.1799 2.1537 19700 0.4171
0.3215 2.1592 19750 0.4153
0.2772 2.1646 19800 0.4154
0.2041 2.1701 19850 0.4142
0.2015 2.1756 19900 0.4148
0.2451 2.1810 19950 0.4198
0.1856 2.1865 20000 0.4192
0.2024 2.1920 20050 0.4145
0.2167 2.1974 20100 0.4138
0.2629 2.2029 20150 0.4120
0.1391 2.2084 20200 0.4168
0.2906 2.2138 20250 0.4154
0.3033 2.2193 20300 0.4154
0.3119 2.2248 20350 0.4149
0.2636 2.2302 20400 0.4179
0.1601 2.2357 20450 0.4148
0.1798 2.2412 20500 0.4145
0.2127 2.2466 20550 0.4147
0.3626 2.2521 20600 0.4174
0.3045 2.2576 20650 0.4130
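
Each row above is whitespace-separated in the order training loss, epoch, step, validation loss. The sketch below shows one way to parse such rows and compare the best checkpoint with the final one. It uses only four rows copied from the table rather than the full log, and `parse_rows` is a hypothetical helper name.

```python
# A few rows copied verbatim from the table above (not the full log).
rows_text = """
1.0648 0.0055 50 0.9501
0.3183 1.9843 18150 0.4098
0.2658 2.0007 18300 0.4126
0.3045 2.2576 20650 0.4130
"""


def parse_rows(text):
    """Turn whitespace-separated rows into dicts with typed values."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


rows = parse_rows(rows_text)
best = min(rows, key=lambda r: r["val_loss"])  # lowest validation loss
final = rows[-1]                               # last logged evaluation

print(f"best val loss {best['val_loss']:.4f} at step {best['step']}")
print(f"final val loss {final['val_loss']:.4f} at step {final['step']}")
```

In the logged rows, the lowest validation loss appears to be 0.4098 at step 18150, shortly before the end of the second epoch. The final row (0.4130 at step 20650) matches the loss reported at the top of this card.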

Framework versions

  • PEFT 0.12.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.0
  • Tokenizers 0.21.0
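
Since PEFT is listed among the framework versions, the weights are presumably published as an adapter that is loaded on top of the base model. The following is a minimal sketch of how that might look, not a tested recipe. It assumes the adapter lives under the repository id shown in the model tree on this card, that access to the base model has been granted on the Hub, and that bf16-capable hardware is available. The function names and the plain-text prompt format are illustrative guesses, because the card does not document a prompt template.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B"
# Assumption: the adapter is published under this Hub repository id.
ADAPTER_ID = "strongpear/llama3.0-8B_finetune_QA_EDU_36k_samples_r64"


def load_model_and_tokenizer():
    """Load the base model and attach the adapter weights with PEFT."""
    # Imported lazily so that defining this function does not require the
    # heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return model, tokenizer


def answer(question: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for a question (prompt format is a guess)."""
    model, tokenizer = load_model_and_tokenizer()
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `answer("...")` downloads the 8B base model on first use, so it needs substantial disk space and memory.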

Model tree for strongpear/llama3.0-8B_finetune_QA_EDU_36k_samples_r64

  • Base model: meta-llama/Meta-Llama-3-8B
  • Adapter: this model