# t5-base-p-l-akk-en-20241125-151008
This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.4584
## Model description
More information needed
## Intended uses & limitations
More information needed
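The card does not document intended uses. As a minimal sketch (assuming the checkpoint is published under the name shown in the title and, as the `akk-en` suffix suggests, targets Akkadian-to-English translation), it can be loaded with the standard Transformers seq2seq classes:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repository id; replace with the actual Hub path of this checkpoint.
model_id = "t5-base-p-l-akk-en-20241125-151008"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The source/target languages are an assumption based on the model name (akk-en);
# the card itself does not document the task or any required input prefix.
inputs = tokenizer("transliterated Akkadian text goes here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```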
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 200
- eval_batch_size: 200
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2000
- num_epochs: 200
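
The training script itself is not included in this card. The sketch below shows one way these values could map onto `Seq2SeqTrainingArguments` in Transformers 4.45; `output_dir`, the evaluation strategy, and the optimizer handling are assumptions, not documented settings.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the hyperparameters listed above. Values not listed in
# the card (output_dir, eval/save strategy, number of devices) are placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-p-l-akk-en-20241125-151008",
    learning_rate=1e-4,
    per_device_train_batch_size=200,
    per_device_eval_batch_size=200,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=2000,
    num_train_epochs=200,
    eval_strategy="epoch",  # validation loss is reported once per epoch in the table below
)
# The optimizer noted above (Adam, betas=(0.9, 0.999), epsilon=1e-8) matches the
# Trainer's default AdamW configuration, so no explicit optimizer override is set here.
```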
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.9667 | 1.0 | 10362 | 0.9112 |
0.8355 | 2.0 | 20724 | 0.7909 |
0.772 | 3.0 | 31086 | 0.7263 |
0.7326 | 4.0 | 41448 | 0.6910 |
0.7033 | 5.0 | 51810 | 0.6666 |
0.6787 | 6.0 | 62172 | 0.6455 |
0.6633 | 7.0 | 72534 | 0.6329 |
0.652 | 8.0 | 82896 | 0.6206 |
0.6408 | 9.0 | 93258 | 0.6073 |
0.6315 | 10.0 | 103620 | 0.6015 |
0.6161 | 11.0 | 113982 | 0.5914 |
0.6211 | 12.0 | 124344 | 0.5857 |
0.6053 | 13.0 | 134706 | 0.5766 |
0.6043 | 14.0 | 145068 | 0.5727 |
0.5954 | 15.0 | 155430 | 0.5681 |
0.59 | 16.0 | 165792 | 0.5649 |
0.5844 | 17.0 | 176154 | 0.5628 |
0.579 | 18.0 | 186516 | 0.5564 |
0.5792 | 19.0 | 196878 | 0.5493 |
0.5739 | 20.0 | 207240 | 0.5479 |
0.567 | 21.0 | 217602 | 0.5435 |
0.5626 | 22.0 | 227964 | 0.5406 |
0.5591 | 23.0 | 238326 | 0.5375 |
0.5508 | 24.0 | 248688 | 0.5356 |
0.5548 | 25.0 | 259050 | 0.5329 |
0.5512 | 26.0 | 269412 | 0.5299 |
0.5473 | 27.0 | 279774 | 0.5267 |
0.5413 | 28.0 | 290136 | 0.5243 |
0.5433 | 29.0 | 300498 | 0.5246 |
0.5378 | 30.0 | 310860 | 0.5209 |
0.5375 | 31.0 | 321222 | 0.5206 |
0.5363 | 32.0 | 331584 | 0.5178 |
0.528 | 33.0 | 341946 | 0.5143 |
0.532 | 34.0 | 352308 | 0.5121 |
0.5279 | 35.0 | 362670 | 0.5137 |
0.5265 | 36.0 | 373032 | 0.5080 |
0.5231 | 37.0 | 383394 | 0.5077 |
0.5187 | 38.0 | 393756 | 0.5082 |
0.5191 | 39.0 | 404118 | 0.5047 |
0.5159 | 40.0 | 414480 | 0.5029 |
0.5159 | 41.0 | 424842 | 0.5014 |
0.5131 | 42.0 | 435204 | 0.4998 |
0.5137 | 43.0 | 445566 | 0.4973 |
0.5128 | 44.0 | 455928 | 0.4972 |
0.5101 | 45.0 | 466290 | 0.4985 |
0.505 | 46.0 | 476652 | 0.4969 |
0.5014 | 47.0 | 487014 | 0.4964 |
0.4988 | 48.0 | 497376 | 0.4938 |
0.5051 | 49.0 | 507738 | 0.4898 |
0.4974 | 50.0 | 518100 | 0.4928 |
0.4999 | 51.0 | 528462 | 0.4904 |
0.4973 | 52.0 | 538824 | 0.4884 |
0.4973 | 53.0 | 549186 | 0.4877 |
0.4913 | 54.0 | 559548 | 0.4879 |
0.4968 | 55.0 | 569910 | 0.4846 |
0.4916 | 56.0 | 580272 | 0.4838 |
0.4938 | 57.0 | 590634 | 0.4833 |
0.4866 | 58.0 | 600996 | 0.4819 |
0.4871 | 59.0 | 611358 | 0.4818 |
0.4837 | 60.0 | 621720 | 0.4792 |
0.4855 | 61.0 | 632082 | 0.4783 |
0.4828 | 62.0 | 642444 | 0.4781 |
0.4789 | 63.0 | 652806 | 0.4780 |
0.4781 | 64.0 | 663168 | 0.4785 |
0.4803 | 65.0 | 673530 | 0.4767 |
0.4791 | 66.0 | 683892 | 0.4755 |
0.4783 | 67.0 | 694254 | 0.4743 |
0.4772 | 68.0 | 704616 | 0.4739 |
0.4757 | 69.0 | 714978 | 0.4730 |
0.4708 | 70.0 | 725340 | 0.4711 |
0.4698 | 71.0 | 735702 | 0.4717 |
0.4719 | 72.0 | 746064 | 0.4733 |
0.4708 | 73.0 | 756426 | 0.4703 |
0.4717 | 74.0 | 766788 | 0.4700 |
0.4714 | 75.0 | 777150 | 0.4677 |
0.4641 | 76.0 | 787512 | 0.4688 |
0.4642 | 77.0 | 797874 | 0.4678 |
0.4656 | 78.0 | 808236 | 0.4666 |
0.4625 | 79.0 | 818598 | 0.4661 |
0.4623 | 80.0 | 828960 | 0.4664 |
0.4619 | 81.0 | 839322 | 0.4657 |
0.4574 | 82.0 | 849684 | 0.4635 |
0.4562 | 83.0 | 860046 | 0.4628 |
0.4593 | 84.0 | 870408 | 0.4613 |
0.4583 | 85.0 | 880770 | 0.4600 |
0.4573 | 86.0 | 891132 | 0.4598 |
0.4518 | 87.0 | 901494 | 0.4564 |
0.4599 | 88.0 | 911856 | 0.4577 |
0.4545 | 89.0 | 922218 | 0.4594 |
0.4534 | 90.0 | 932580 | 0.4564 |
0.449 | 91.0 | 942942 | 0.4564 |
0.4523 | 92.0 | 953304 | 0.4584 |
### Framework versions
- Transformers 4.45.2
- Pytorch 2.6.0.dev20241022+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1
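
A small sanity check for reproducing this environment (version strings taken from the list above; note that the PyTorch build is a CUDA 12.4 nightly, which generally has to be installed from the PyTorch nightly index):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare the local environment against the versions recorded in this card.
expected = {
    "transformers": "4.45.2",
    "datasets": "3.0.1",
    "tokenizers": "0.20.1",
    "torch": "2.6.0.dev20241022+cu124",  # nightly build with CUDA 12.4
}
found = {
    "transformers": transformers.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
    "torch": torch.__version__,
}
for name, version in expected.items():
    status = "OK" if found[name] == version else f"mismatch (found {found[name]})"
    print(f"{name} {version}: {status}")
```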