metadata

license: mit
base_model: roberta-large
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: roberta-large-sst-2-32-13-smoothed
    results: []

roberta-large-sst-2-32-13-smoothed

This model is a fine-tuned version of roberta-large. The training dataset is not recorded in this card; the model name suggests SST-2, but that is not confirmed here. It achieves the following results on the evaluation set:

Loss: 0.5917
Accuracy: 0.8906

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 50
num_epochs: 75
label_smoothing_factor: 0.45
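
As a rough illustration of how the values above might map onto a training configuration, the sketch below collects them under the corresponding `transformers.TrainingArguments` parameter names and reproduces the linear warmup and decay schedule in plain Python. Several things here are assumptions and not taken from the card: the `output_dir` value, single-device training (so `train_batch_size` maps to `per_device_train_batch_size`), and the total of 150 optimizer steps, which is read off the last row of the training results table. The helper names `linear_lr` and `build_training_arguments` are hypothetical.

```python
# Hyperparameters copied from the list above; the keys follow the
# `transformers.TrainingArguments` parameter names.
TRAINING_KWARGS = {
    "output_dir": "roberta-large-sst-2-32-13-smoothed",  # assumed: reuses the model name
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 32,  # assumes a single device
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 50,
    "num_train_epochs": 75,
    "label_smoothing_factor": 0.45,
}

TOTAL_STEPS = 150  # assumed: final step shown in the training results table


def linear_lr(step, base_lr=1e-5, warmup_steps=50, total_steps=TOTAL_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


def build_training_arguments():
    """Build the TrainingArguments object; requires the `transformers` package."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


if __name__ == "__main__":
    # Print the learning rate at a few points in the run.
    for step in (0, 25, 50, 100, 150):
        print(step, linear_lr(step))
```

Under these assumptions the warmup covers the first 50 of 150 steps, so the learning rate would reach its peak of 1e-05 at the end of epoch 25 and decay linearly afterwards.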

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	2	0.7430	0.5
No log	2.0	4	0.7414	0.5
No log	3.0	6	0.7386	0.5
No log	4.0	8	0.7348	0.5
0.7439	5.0	10	0.7302	0.5
0.7439	6.0	12	0.7248	0.5
0.7439	7.0	14	0.7195	0.5
0.7439	8.0	16	0.7143	0.5
0.7439	9.0	18	0.7082	0.5
0.7171	10.0	20	0.7022	0.5
0.7171	11.0	22	0.6977	0.5
0.7171	12.0	24	0.6954	0.5312
0.7171	13.0	26	0.6936	0.5156
0.7171	14.0	28	0.6926	0.5156
0.7024	15.0	30	0.6922	0.5312
0.7024	16.0	32	0.6921	0.5469
0.7024	17.0	34	0.6927	0.5312
0.7024	18.0	36	0.6938	0.5312
0.7024	19.0	38	0.6958	0.5156
0.6826	20.0	40	0.6982	0.5156
0.6826	21.0	42	0.7138	0.5
0.6826	22.0	44	0.7064	0.5312
0.6826	23.0	46	0.6992	0.5625
0.6826	24.0	48	0.6926	0.5625
0.6474	25.0	50	0.6836	0.5781
0.6474	26.0	52	0.6617	0.7344
0.6474	27.0	54	0.6450	0.7656
0.6474	28.0	56	0.6392	0.7812
0.6474	29.0	58	0.6513	0.7344
0.5878	30.0	60	0.6481	0.7812
0.5878	31.0	62	0.6583	0.7969
0.5878	32.0	64	0.6649	0.7812
0.5878	33.0	66	0.6280	0.8125
0.5878	34.0	68	0.6212	0.8594
0.5602	35.0	70	0.6214	0.8281
0.5602	36.0	72	0.6534	0.75
0.5602	37.0	74	0.6334	0.8594
0.5602	38.0	76	0.6060	0.875
0.5602	39.0	78	0.6048	0.875
0.55	40.0	80	0.6064	0.8594
0.55	41.0	82	0.6095	0.8438
0.55	42.0	84	0.6161	0.8438
0.55	43.0	86	0.6068	0.8594
0.55	44.0	88	0.5929	0.875
0.5425	45.0	90	0.5918	0.8906
0.5425	46.0	92	0.5919	0.8906
0.5425	47.0	94	0.5921	0.875
0.5425	48.0	96	0.5925	0.875
0.5425	49.0	98	0.5970	0.8906
0.5415	50.0	100	0.6128	0.8438
0.5415	51.0	102	0.6187	0.8438
0.5415	52.0	104	0.6012	0.8906
0.5415	53.0	106	0.5981	0.8906
0.5415	54.0	108	0.6085	0.8125
0.5434	55.0	110	0.6028	0.8438
0.5434	56.0	112	0.5970	0.8594
0.5434	57.0	114	0.6013	0.8906
0.5434	58.0	116	0.6023	0.8906
0.5434	59.0	118	0.6002	0.8906
0.5397	60.0	120	0.5964	0.8906
0.5397	61.0	122	0.5940	0.8906
0.5397	62.0	124	0.5934	0.8906
0.5397	63.0	126	0.5936	0.8906
0.5397	64.0	128	0.5936	0.8906
0.5403	65.0	130	0.5939	0.8906
0.5403	66.0	132	0.5939	0.8906
0.5403	67.0	134	0.5933	0.8906
0.5403	68.0	136	0.5933	0.8906
0.5403	69.0	138	0.5934	0.8906
0.5394	70.0	140	0.5931	0.8906
0.5394	71.0	142	0.5926	0.8906
0.5394	72.0	144	0.5921	0.8906
0.5394	73.0	146	0.5919	0.8906
0.5394	74.0	148	0.5918	0.8906
0.5394	75.0	150	0.5917	0.8906
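
The training loss in the table levels off near 0.54 instead of approaching zero. One plausible reading is that this reflects the label_smoothing_factor of 0.45, since a smoothed target cannot be matched with zero cross-entropy. The sketch below computes that lower bound. It assumes two classes (the card does not state the number of labels) and the common label-smoothing form in which the true class receives weight 1 - ε + ε/K and every other class receives ε/K; the function name `smoothed_loss_floor` is hypothetical.

```python
import math


def smoothed_loss_floor(epsilon, num_classes):
    """Smallest achievable cross-entropy against a label-smoothed target.

    The minimum is reached when the predicted distribution equals the
    smoothed target, so the floor is the entropy of that target.
    """
    true_w = (1.0 - epsilon) + epsilon / num_classes
    other_w = epsilon / num_classes
    weights = [true_w] + [other_w] * (num_classes - 1)
    # Skip zero weights, which contribute nothing and would break log().
    return -sum(w * math.log(w) for w in weights if w > 0.0)


# Assumed setting: epsilon = 0.45 from the hyperparameters, two classes.
loss_floor = smoothed_loss_floor(0.45, 2)
print(f"loss floor: {loss_floor:.4f}")
```

With these assumptions the floor comes out at roughly 0.533, which is close to the final training loss of 0.5394. If the validation loss is computed with the same smoothed objective, a value of 0.5917 alongside an accuracy of 0.8906 is less surprising than it would be for an unsmoothed loss.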

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3