
long-t5-local-base-finetuned-justification-v04

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set (a loading sketch follows the result below):

  • Loss: 3.1250
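
A minimal sketch of loading this checkpoint for inference, assuming the repo id satyanshu404/long-t5-local-base-finetuned-justification-v04 from the model tree below and a standard seq2seq generation setup; the input string is a placeholder, since the expected prompt format is not documented:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "satyanshu404/long-t5-local-base-finetuned-justification-v04"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input: the card does not document the intended inputs.
inputs = tokenizer("Text to generate a justification for.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```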

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a TrainingArguments sketch mirroring them follows the list:

  • learning_rate: 2e-07
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
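
One way to reproduce these settings with the 🤗 Trainer, as a sketch only: output_dir and the evaluation cadence are assumptions (the per-epoch cadence is inferred from the results table below), and the Adam settings simply restate the values listed above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v04",  # assumed
    learning_rate=2e-07,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,               # matches the optimizer settings above
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table
)
```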

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 28.8458 | 1.0 | 676 | 22.6997 |
| 27.4586 | 2.0 | 1352 | 20.9991 |
| 24.6852 | 3.0 | 2028 | 19.3376 |
| 23.1391 | 4.0 | 2704 | 17.6632 |
| 21.9699 | 5.0 | 3380 | 15.9812 |
| 19.2778 | 6.0 | 4056 | 14.2373 |
| 17.8249 | 7.0 | 4732 | 12.4497 |
| 16.3797 | 8.0 | 5408 | 10.5990 |
| 13.5688 | 9.0 | 6084 | 8.7096 |
| 11.8803 | 10.0 | 6760 | 7.3735 |
| 10.6635 | 11.0 | 7436 | 6.9285 |
| 7.8818 | 12.0 | 8112 | 6.7818 |
| 7.0876 | 13.0 | 8788 | 6.7247 |
| 6.4869 | 14.0 | 9464 | 6.6927 |
| 5.4921 | 15.0 | 10140 | 6.6090 |
| 5.4028 | 16.0 | 10816 | 6.4377 |
| 5.1724 | 17.0 | 11492 | 6.1746 |
| 4.7436 | 18.0 | 12168 | 5.9188 |
| 4.8811 | 19.0 | 12844 | 5.6474 |
| 4.5173 | 20.0 | 13520 | 5.3848 |
| 4.463 | 21.0 | 14196 | 5.1572 |
| 4.2793 | 22.0 | 14872 | 4.9559 |
| 4.1378 | 23.0 | 15548 | 4.7922 |
| 4.1071 | 24.0 | 16224 | 4.6584 |
| 3.9575 | 25.0 | 16900 | 4.5451 |
| 3.913 | 26.0 | 17576 | 4.4492 |
| 3.8237 | 27.0 | 18252 | 4.3749 |
| 3.6779 | 28.0 | 18928 | 4.3044 |
| 3.7032 | 29.0 | 19604 | 4.2445 |
| 3.6205 | 30.0 | 20280 | 4.1905 |
| 3.5488 | 31.0 | 20956 | 4.1418 |
| 3.5511 | 32.0 | 21632 | 4.0970 |
| 3.4027 | 33.0 | 22308 | 4.0561 |
| 3.4178 | 34.0 | 22984 | 4.0172 |
| 3.4568 | 35.0 | 23660 | 3.9798 |
| 3.3257 | 36.0 | 24336 | 3.9451 |
| 3.3202 | 37.0 | 25012 | 3.9120 |
| 3.3266 | 38.0 | 25688 | 3.8796 |
| 3.1662 | 39.0 | 26364 | 3.8489 |
| 3.1681 | 40.0 | 27040 | 3.8190 |
| 3.194 | 41.0 | 27716 | 3.7912 |
| 3.138 | 42.0 | 28392 | 3.7628 |
| 3.2074 | 43.0 | 29068 | 3.7340 |
| 3.1001 | 44.0 | 29744 | 3.7068 |
| 3.0341 | 45.0 | 30420 | 3.6814 |
| 3.0394 | 46.0 | 31096 | 3.6553 |
| 3.0737 | 47.0 | 31772 | 3.6293 |
| 3.0273 | 48.0 | 32448 | 3.6053 |
| 3.0223 | 49.0 | 33124 | 3.5809 |
| 3.04 | 50.0 | 33800 | 3.5563 |
| 2.9182 | 51.0 | 34476 | 3.5335 |
| 2.9424 | 52.0 | 35152 | 3.5120 |
| 3.0383 | 53.0 | 35828 | 3.4895 |
| 2.9066 | 54.0 | 36504 | 3.4680 |
| 2.9488 | 55.0 | 37180 | 3.4477 |
| 2.9238 | 56.0 | 37856 | 3.4284 |
| 2.8921 | 57.0 | 38532 | 3.4095 |
| 2.9047 | 58.0 | 39208 | 3.3916 |
| 2.8593 | 59.0 | 39884 | 3.3749 |
| 2.8443 | 60.0 | 40560 | 3.3587 |
| 2.7948 | 61.0 | 41236 | 3.3435 |
| 2.8484 | 62.0 | 41912 | 3.3290 |
| 2.8094 | 63.0 | 42588 | 3.3143 |
| 2.7928 | 64.0 | 43264 | 3.3019 |
| 2.7722 | 65.0 | 43940 | 3.2888 |
| 2.7996 | 66.0 | 44616 | 3.2757 |
| 2.7521 | 67.0 | 45292 | 3.2646 |
| 2.7506 | 68.0 | 45968 | 3.2536 |
| 2.7454 | 69.0 | 46644 | 3.2433 |
| 2.782 | 70.0 | 47320 | 3.2333 |
| 2.733 | 71.0 | 47996 | 3.2244 |
| 2.7219 | 72.0 | 48672 | 3.2158 |
| 2.7289 | 73.0 | 49348 | 3.2075 |
| 2.7365 | 74.0 | 50024 | 3.2001 |
| 2.7487 | 75.0 | 50700 | 3.1934 |
| 2.6969 | 76.0 | 51376 | 3.1863 |
| 2.7079 | 77.0 | 52052 | 3.1803 |
| 2.717 | 78.0 | 52728 | 3.1741 |
| 2.7059 | 79.0 | 53404 | 3.1690 |
| 2.681 | 80.0 | 54080 | 3.1639 |
| 2.7309 | 81.0 | 54756 | 3.1592 |
| 2.6887 | 82.0 | 55432 | 3.1546 |
| 2.7021 | 83.0 | 56108 | 3.1506 |
| 2.7144 | 84.0 | 56784 | 3.1474 |
| 2.6032 | 85.0 | 57460 | 3.1443 |
| 2.6943 | 86.0 | 58136 | 3.1411 |
| 2.6888 | 87.0 | 58812 | 3.1382 |
| 2.6167 | 88.0 | 59488 | 3.1356 |
| 2.6672 | 89.0 | 60164 | 3.1333 |
| 2.6447 | 90.0 | 60840 | 3.1315 |
| 2.668 | 91.0 | 61516 | 3.1300 |
| 2.6378 | 92.0 | 62192 | 3.1287 |
| 2.7002 | 93.0 | 62868 | 3.1277 |
| 2.6958 | 94.0 | 63544 | 3.1269 |
| 2.6296 | 95.0 | 64220 | 3.1262 |
| 2.6784 | 96.0 | 64896 | 3.1257 |
| 2.6044 | 97.0 | 65572 | 3.1253 |
| 2.6682 | 98.0 | 66248 | 3.1251 |
| 2.6628 | 99.0 | 66924 | 3.1250 |
| 2.6305 | 100.0 | 67600 | 3.1250 |
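
Assuming the reported loss is mean token-level cross-entropy (the Trainer's default for seq2seq language modeling), the final validation loss of 3.1250 corresponds to a perplexity of roughly exp(3.1250) ≈ 22.8:

```python
import math

final_eval_loss = 3.1250          # final validation loss from the table above
print(math.exp(final_eval_loss))  # ≈ 22.76
```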

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Checkpoint details

  • Format: Safetensors
  • Model size: 248M params
  • Tensor type: F32

Model tree for satyanshu404/long-t5-local-base-finetuned-justification-v04

  • Fine-tuned from google/long-t5-local-base (one of 13 fine-tunes of the base model)