
roberta-large-sst-2-64-13-smoothed

This model is a fine-tuned version of roberta-large; the training dataset is not documented here, though the model name suggests a small SST-2 subset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.5741
  • Accuracy: 0.9375
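
The model is a binary sequence classifier. A minimal usage sketch with the transformers pipeline API, assuming the published repo id simonycl/roberta-large-sst-2-64-13-smoothed; since the label mapping is not documented, outputs may use the generic LABEL_0 / LABEL_1 ids rather than negative / positive:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a text-classification pipeline.
classifier = pipeline(
    "text-classification",
    model="simonycl/roberta-large-sst-2-64-13-smoothed",
)

# The id2label mapping is not documented on this card, so the result
# may carry the default LABEL_0 / LABEL_1 names.
print(classifier("A gripping, beautifully shot film."))
# e.g. [{'label': 'LABEL_1', 'score': 0.9...}]
```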

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
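
A minimal sketch of how these values map onto transformers TrainingArguments. Only the hyperparameters above are taken from this card; the dataset objects and tokenization step are not documented and are left as assumptions:

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-large", num_labels=2
)

# Every value below is copied from the hyperparameter list above.
# The listed Adam betas=(0.9, 0.999) and epsilon=1e-8 are the Trainer
# defaults, so no explicit optimizer setup is required.
args = TrainingArguments(
    output_dir="roberta-large-sst-2-64-13-smoothed",
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=75,
    label_smoothing_factor=0.45,  # unusually high; see the note after the results table
    evaluation_strategy="epoch",  # the results table shows one eval per epoch
)

# train_dataset / eval_dataset are not documented on this card and must
# be supplied by the user (the model name suggests a 64-shot SST-2 split):
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```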

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 4    | 0.6932          | 0.5      |
| No log        | 2.0   | 8    | 0.6930          | 0.5      |
| 0.6986        | 3.0   | 12   | 0.6928          | 0.5078   |
| 0.6986        | 4.0   | 16   | 0.6926          | 0.5078   |
| 0.7049        | 5.0   | 20   | 0.6926          | 0.5      |
| 0.7049        | 6.0   | 24   | 0.6924          | 0.5      |
| 0.7049        | 7.0   | 28   | 0.6922          | 0.5      |
| 0.6928        | 8.0   | 32   | 0.6918          | 0.5234   |
| 0.6928        | 9.0   | 36   | 0.6912          | 0.5312   |
| 0.6889        | 10.0  | 40   | 0.6905          | 0.5625   |
| 0.6889        | 11.0  | 44   | 0.6895          | 0.5078   |
| 0.6889        | 12.0  | 48   | 0.6880          | 0.5781   |
| 0.6855        | 13.0  | 52   | 0.6823          | 0.6875   |
| 0.6855        | 14.0  | 56   | 0.6590          | 0.8281   |
| 0.6346        | 15.0  | 60   | 0.6187          | 0.8672   |
| 0.6346        | 16.0  | 64   | 0.6192          | 0.8281   |
| 0.6346        | 17.0  | 68   | 0.5983          | 0.9062   |
| 0.5877        | 18.0  | 72   | 0.6030          | 0.875    |
| 0.5877        | 19.0  | 76   | 0.5942          | 0.9141   |
| 0.564         | 20.0  | 80   | 0.5918          | 0.8984   |
| 0.564         | 21.0  | 84   | 0.5860          | 0.9141   |
| 0.564         | 22.0  | 88   | 0.5761          | 0.9375   |
| 0.5505        | 23.0  | 92   | 0.5854          | 0.9297   |
| 0.5505        | 24.0  | 96   | 0.5750          | 0.9141   |
| 0.5462        | 25.0  | 100  | 0.5776          | 0.9141   |
| 0.5462        | 26.0  | 104  | 0.5713          | 0.9453   |
| 0.5462        | 27.0  | 108  | 0.5731          | 0.9375   |
| 0.5414        | 28.0  | 112  | 0.5770          | 0.9297   |
| 0.5414        | 29.0  | 116  | 0.5789          | 0.9141   |
| 0.5382        | 30.0  | 120  | 0.5871          | 0.9062   |
| 0.5382        | 31.0  | 124  | 0.5810          | 0.9141   |
| 0.5382        | 32.0  | 128  | 0.5765          | 0.9297   |
| 0.5383        | 33.0  | 132  | 0.5769          | 0.9297   |
| 0.5383        | 34.0  | 136  | 0.5718          | 0.9453   |
| 0.5385        | 35.0  | 140  | 0.5704          | 0.9453   |
| 0.5385        | 36.0  | 144  | 0.5728          | 0.9453   |
| 0.5385        | 37.0  | 148  | 0.5737          | 0.9297   |
| 0.5381        | 38.0  | 152  | 0.5749          | 0.9375   |
| 0.5381        | 39.0  | 156  | 0.5754          | 0.9375   |
| 0.5389        | 40.0  | 160  | 0.5742          | 0.9375   |
| 0.5389        | 41.0  | 164  | 0.5723          | 0.9375   |
| 0.5389        | 42.0  | 168  | 0.5720          | 0.9375   |
| 0.5372        | 43.0  | 172  | 0.5694          | 0.9453   |
| 0.5372        | 44.0  | 176  | 0.5723          | 0.9375   |
| 0.5384        | 45.0  | 180  | 0.5766          | 0.9375   |
| 0.5384        | 46.0  | 184  | 0.5715          | 0.9375   |
| 0.5384        | 47.0  | 188  | 0.5696          | 0.9453   |
| 0.5379        | 48.0  | 192  | 0.5709          | 0.9453   |
| 0.5379        | 49.0  | 196  | 0.5720          | 0.9453   |
| 0.5372        | 50.0  | 200  | 0.5717          | 0.9453   |
| 0.5372        | 51.0  | 204  | 0.5706          | 0.9453   |
| 0.5372        | 52.0  | 208  | 0.5697          | 0.9453   |
| 0.5371        | 53.0  | 212  | 0.5700          | 0.9453   |
| 0.5371        | 54.0  | 216  | 0.5706          | 0.9453   |
| 0.5368        | 55.0  | 220  | 0.5697          | 0.9453   |
| 0.5368        | 56.0  | 224  | 0.5702          | 0.9453   |
| 0.5368        | 57.0  | 228  | 0.5719          | 0.9453   |
| 0.5371        | 58.0  | 232  | 0.5728          | 0.9453   |
| 0.5371        | 59.0  | 236  | 0.5729          | 0.9375   |
| 0.5371        | 60.0  | 240  | 0.5734          | 0.9375   |
| 0.5371        | 61.0  | 244  | 0.5736          | 0.9375   |
| 0.5371        | 62.0  | 248  | 0.5745          | 0.9375   |
| 0.5369        | 63.0  | 252  | 0.5760          | 0.9375   |
| 0.5369        | 64.0  | 256  | 0.5772          | 0.9375   |
| 0.5365        | 65.0  | 260  | 0.5771          | 0.9375   |
| 0.5365        | 66.0  | 264  | 0.5763          | 0.9375   |
| 0.5365        | 67.0  | 268  | 0.5759          | 0.9375   |
| 0.5365        | 68.0  | 272  | 0.5753          | 0.9375   |
| 0.5365        | 69.0  | 276  | 0.5751          | 0.9375   |
| 0.5369        | 70.0  | 280  | 0.5746          | 0.9375   |
| 0.5369        | 71.0  | 284  | 0.5741          | 0.9375   |
| 0.5369        | 72.0  | 288  | 0.5742          | 0.9375   |
| 0.5367        | 73.0  | 292  | 0.5742          | 0.9375   |
| 0.5367        | 74.0  | 296  | 0.5741          | 0.9375   |
| 0.5368        | 75.0  | 300  | 0.5741          | 0.9375   |
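
The loss plateauing near 0.57 rather than approaching zero is expected with such heavy label smoothing: with factor ε = 0.45 and two classes, the Trainer's smoothed target puts 1 − ε + ε/2 = 0.775 on the true class, so the cross-entropy is bounded below by the entropy of that target distribution, about 0.533. The check below is a derivation from the documented label_smoothing_factor (following the transformers LabelSmoother formulation), not an output of this training run:

```python
import math

epsilon, num_labels = 0.45, 2

# Smoothed target used by transformers' LabelSmoother:
# (1 - epsilon) on the true class plus epsilon / num_labels on every class.
p_true = 1 - epsilon + epsilon / num_labels   # 0.775
p_other = epsilon / num_labels                # 0.225

# The loss is minimized when the model predicts the target exactly,
# leaving the target's own entropy as an irreducible floor.
floor = -(p_true * math.log(p_true) + p_other * math.log(p_other))
print(f"loss floor ~ {floor:.4f}")  # ~ 0.5332, close to the ~0.537 training loss above
```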

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3
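
When reproducing results, it may be worth checking that the local environment matches these pins. A small sketch, with version strings copied from the list above and the released 4.32.0 standing in for the 4.32.0.dev0 development build:

```python
import datasets
import tokenizers
import torch
import transformers

# Versions copied from the framework list above; 4.32.0.dev0 was a
# development build, so released 4.32.0 is the closest public match.
expected = {
    transformers: "4.32.0",
    torch: "2.0.1+cu118",
    datasets: "2.4.0",
    tokenizers: "0.13.3",
}

for module, version in expected.items():
    if module.__version__ != version:
        print(f"{module.__name__}: have {module.__version__}, card used {version}")
```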