
long-t5-local-base-finetuned-justification-v03

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0759
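
A minimal usage sketch (not part of the original card) for loading this checkpoint from the Hugging Face Hub with transformers; the input text and generation settings below are placeholders, since the card does not document the expected prompt format:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "satyanshu404/long-t5-local-base-finetuned-justification-v03"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the task's actual input format is undocumented.
inputs = tokenizer("Example input document ...", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```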

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-07
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
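
As a rough sketch, these settings map onto transformers' Seq2SeqTrainingArguments as shown below; output_dir and evaluation_strategy are assumptions (the card only reports per-epoch losses), and the Adam betas/epsilon listed above are the library's optimizer defaults:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v03",  # assumed name
    learning_rate=2e-07,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
    # default optimizer, so no extra optimizer arguments are needed.
    evaluation_strategy="epoch",  # assumption: losses are reported per epoch
)
```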

Training results

Training Loss Epoch Step Validation Loss
37.3451 1.0 676 30.7532
35.6146 2.0 1352 28.8261
32.3915 3.0 2028 26.9524
30.6984 4.0 2704 25.0794
29.1119 5.0 3380 23.2287
25.8833 6.0 4056 21.3800
24.6464 7.0 4732 19.5382
23.0238 8.0 5408 17.7077
20.1628 9.0 6084 15.7830
18.3835 10.0 6760 13.7916
17.0307 11.0 7436 11.7701
13.9093 12.0 8112 9.7176
12.3394 13.0 8788 7.7920
10.888 14.0 9464 6.6093
8.1438 15.0 10140 6.1657
7.1948 16.0 10816 5.9876
6.4311 17.0 11492 5.8574
5.2684 18.0 12168 5.7303
5.2309 19.0 12844 5.6090
4.6475 20.0 13520 5.4782
4.583 21.0 14196 5.3425
4.3645 22.0 14872 5.2021
4.1721 23.0 15548 5.0568
4.1423 24.0 16224 4.9156
3.983 25.0 16900 4.7832
3.9396 26.0 17576 4.6574
3.8342 27.0 18252 4.5455
3.651 28.0 18928 4.4371
3.663 29.0 19604 4.3453
3.5847 30.0 20280 4.2648
3.5013 31.0 20956 4.1942
3.5122 32.0 21632 4.1298
3.3473 33.0 22308 4.0700
3.3417 34.0 22984 4.0167
3.3881 35.0 23660 3.9700
3.2404 36.0 24336 3.9258
3.2232 37.0 25012 3.8830
3.2287 38.0 25688 3.8438
3.0759 39.0 26364 3.8066
3.053 40.0 27040 3.7711
3.0726 41.0 27716 3.7386
3.0198 42.0 28392 3.7072
3.0923 43.0 29068 3.6768
2.986 44.0 29744 3.6489
2.9184 45.0 30420 3.6221
2.9114 46.0 31096 3.5972
2.9585 47.0 31772 3.5736
2.91 48.0 32448 3.5477
2.8974 49.0 33124 3.5252
2.9211 50.0 33800 3.5020
2.785 51.0 34476 3.4795
2.8177 52.0 35152 3.4581
2.9204 53.0 35828 3.4392
2.7911 54.0 36504 3.4199
2.8178 55.0 37180 3.4013
2.8029 56.0 37856 3.3830
2.7654 57.0 38532 3.3655
2.7854 58.0 39208 3.3486
2.7322 59.0 39884 3.3314
2.722 60.0 40560 3.3161
2.6665 61.0 41236 3.3016
2.719 62.0 41912 3.2865
2.6758 63.0 42588 3.2720
2.6586 64.0 43264 3.2582
2.6443 65.0 43940 3.2451
2.6656 66.0 44616 3.2322
2.6183 67.0 45292 3.2206
2.6143 68.0 45968 3.2087
2.6117 69.0 46644 3.1989
2.6567 70.0 47320 3.1890
2.5946 71.0 47996 3.1797
2.5836 72.0 48672 3.1703
2.5928 73.0 49348 3.1627
2.6024 74.0 50024 3.1552
2.6117 75.0 50700 3.1469
2.5681 76.0 51376 3.1392
2.5731 77.0 52052 3.1329
2.5749 78.0 52728 3.1267
2.5788 79.0 53404 3.1208
2.5478 80.0 54080 3.1156
2.6048 81.0 54756 3.1104
2.5514 82.0 55432 3.1062
2.5705 83.0 56108 3.1027
2.5733 84.0 56784 3.0994
2.4704 85.0 57460 3.0965
2.558 86.0 58136 3.0928
2.5566 87.0 58812 3.0895
2.4822 88.0 59488 3.0869
2.5377 89.0 60164 3.0844
2.5173 90.0 60840 3.0826
2.5312 91.0 61516 3.0809
2.5038 92.0 62192 3.0799
2.5645 93.0 62868 3.0788
2.5612 94.0 63544 3.0778
2.4948 95.0 64220 3.0770
2.538 96.0 64896 3.0765
2.4701 97.0 65572 3.0762
2.5269 98.0 66248 3.0759
2.5265 99.0 66924 3.0759
2.4955 100.0 67600 3.0759
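
The validation loss falls monotonically from 30.75 to 3.0759 and is essentially flat over the final epochs, which suggests training had converged at this learning rate. A small, illustrative sketch (not from the card) for visualizing the curve, using a handful of (epoch, validation loss) pairs sampled from the table above:

```python
import matplotlib.pyplot as plt

# Sampled rows from the table above; fill in all 100 rows as needed.
history = [(1, 30.7532), (10, 13.7916), (20, 5.4782), (50, 3.5020), (100, 3.0759)]
epochs, val_loss = zip(*history)

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("long-t5-local-base-finetuned-justification-v03")
plt.show()
```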

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2