
bert-large-uncased-sst-2-64-13-smoothed

This model is a fine-tuned version of bert-large-uncased on an unspecified dataset (the model name suggests a small SST-2 subset trained with label smoothing). It achieves the following results on the evaluation set:

  • Loss: 0.6024
  • Accuracy: 0.8438
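
A minimal inference sketch, assuming the checkpoint is published as simonycl/bert-large-uncased-sst-2-64-13-smoothed with a standard two-label sequence-classification head (as the SST-2-style name suggests):

```python
from transformers import pipeline

# Hypothetical usage sketch; the exact label names depend on how the
# fine-tuned config maps class ids to labels.
classifier = pipeline(
    "text-classification",
    model="simonycl/bert-large-uncased-sst-2-64-13-smoothed",
)
print(classifier("A gripping, beautifully shot film."))
```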

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
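
These settings map directly onto Hugging Face TrainingArguments; a minimal sketch under that assumption (output_dir is a placeholder, and the Adam betas/epsilon listed above are the Trainer's optimizer defaults, so they need no explicit arguments):

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is a placeholder; everything else mirrors the list above.
training_args = TrainingArguments(
    output_dir="bert-large-uncased-sst-2-64-13-smoothed",
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=75,
    label_smoothing_factor=0.45,
)
```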

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 4 | 0.8114 | 0.5078 |
| No log | 2.0 | 8 | 0.7930 | 0.5078 |
| 0.8117 | 3.0 | 12 | 0.7630 | 0.5 |
| 0.8117 | 4.0 | 16 | 0.7257 | 0.5078 |
| 0.7546 | 5.0 | 20 | 0.6872 | 0.5938 |
| 0.7546 | 6.0 | 24 | 0.6706 | 0.6875 |
| 0.7546 | 7.0 | 28 | 0.6589 | 0.7578 |
| 0.6762 | 8.0 | 32 | 0.6473 | 0.7734 |
| 0.6762 | 9.0 | 36 | 0.6369 | 0.7812 |
| 0.6014 | 10.0 | 40 | 0.6282 | 0.7969 |
| 0.6014 | 11.0 | 44 | 0.6232 | 0.8125 |
| 0.6014 | 12.0 | 48 | 0.6226 | 0.8281 |
| 0.5545 | 13.0 | 52 | 0.6205 | 0.8281 |
| 0.5545 | 14.0 | 56 | 0.6191 | 0.7969 |
| 0.5486 | 15.0 | 60 | 0.6288 | 0.8047 |
| 0.5486 | 16.0 | 64 | 0.6184 | 0.8438 |
| 0.5486 | 17.0 | 68 | 0.6241 | 0.8203 |
| 0.5451 | 18.0 | 72 | 0.6098 | 0.8438 |
| 0.5451 | 19.0 | 76 | 0.6090 | 0.875 |
| 0.5418 | 20.0 | 80 | 0.6094 | 0.8672 |
| 0.5418 | 21.0 | 84 | 0.6092 | 0.8594 |
| 0.5418 | 22.0 | 88 | 0.6102 | 0.8594 |
| 0.5414 | 23.0 | 92 | 0.6107 | 0.8594 |
| 0.5414 | 24.0 | 96 | 0.6106 | 0.8281 |
| 0.5394 | 25.0 | 100 | 0.6104 | 0.8359 |
| 0.5394 | 26.0 | 104 | 0.6107 | 0.8359 |
| 0.5394 | 27.0 | 108 | 0.6125 | 0.8359 |
| 0.539 | 28.0 | 112 | 0.6144 | 0.8359 |
| 0.539 | 29.0 | 116 | 0.6139 | 0.8359 |
| 0.5398 | 30.0 | 120 | 0.6149 | 0.8281 |
| 0.5398 | 31.0 | 124 | 0.6174 | 0.8438 |
| 0.5398 | 32.0 | 128 | 0.6216 | 0.8359 |
| 0.5387 | 33.0 | 132 | 0.6200 | 0.8359 |
| 0.5387 | 34.0 | 136 | 0.6151 | 0.8438 |
| 0.5396 | 35.0 | 140 | 0.6138 | 0.8438 |
| 0.5396 | 36.0 | 144 | 0.6140 | 0.8438 |
| 0.5396 | 37.0 | 148 | 0.6147 | 0.8281 |
| 0.5388 | 38.0 | 152 | 0.6111 | 0.8516 |
| 0.5388 | 39.0 | 156 | 0.6097 | 0.8516 |
| 0.5391 | 40.0 | 160 | 0.6088 | 0.8594 |
| 0.5391 | 41.0 | 164 | 0.6090 | 0.8438 |
| 0.5391 | 42.0 | 168 | 0.6109 | 0.8438 |
| 0.5388 | 43.0 | 172 | 0.6102 | 0.8438 |
| 0.5388 | 44.0 | 176 | 0.6088 | 0.8438 |
| 0.5385 | 45.0 | 180 | 0.6091 | 0.8438 |
| 0.5385 | 46.0 | 184 | 0.6127 | 0.8438 |
| 0.5385 | 47.0 | 188 | 0.6167 | 0.8203 |
| 0.5391 | 48.0 | 192 | 0.6143 | 0.8359 |
| 0.5391 | 49.0 | 196 | 0.6071 | 0.8516 |
| 0.5387 | 50.0 | 200 | 0.6061 | 0.8516 |
| 0.5387 | 51.0 | 204 | 0.6054 | 0.8438 |
| 0.5387 | 52.0 | 208 | 0.6037 | 0.8516 |
| 0.5385 | 53.0 | 212 | 0.6019 | 0.8516 |
| 0.5385 | 54.0 | 216 | 0.6008 | 0.8438 |
| 0.5379 | 55.0 | 220 | 0.5998 | 0.8516 |
| 0.5379 | 56.0 | 224 | 0.5992 | 0.8516 |
| 0.5379 | 57.0 | 228 | 0.6001 | 0.8516 |
| 0.5382 | 58.0 | 232 | 0.6026 | 0.8438 |
| 0.5382 | 59.0 | 236 | 0.6039 | 0.8438 |
| 0.5381 | 60.0 | 240 | 0.6043 | 0.8438 |
| 0.5381 | 61.0 | 244 | 0.6032 | 0.8438 |
| 0.5381 | 62.0 | 248 | 0.6030 | 0.8438 |
| 0.5389 | 63.0 | 252 | 0.6023 | 0.8438 |
| 0.5389 | 64.0 | 256 | 0.6019 | 0.8438 |
| 0.5378 | 65.0 | 260 | 0.6024 | 0.8438 |
| 0.5378 | 66.0 | 264 | 0.6025 | 0.8438 |
| 0.5378 | 67.0 | 268 | 0.6020 | 0.8438 |
| 0.5374 | 68.0 | 272 | 0.6016 | 0.8438 |
| 0.5374 | 69.0 | 276 | 0.6017 | 0.8438 |
| 0.5378 | 70.0 | 280 | 0.6023 | 0.8438 |
| 0.5378 | 71.0 | 284 | 0.6025 | 0.8438 |
| 0.5378 | 72.0 | 288 | 0.6024 | 0.8438 |
| 0.5372 | 73.0 | 292 | 0.6023 | 0.8438 |
| 0.5372 | 74.0 | 296 | 0.6024 | 0.8438 |
| 0.5377 | 75.0 | 300 | 0.6024 | 0.8438 |
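
Note that with label_smoothing_factor 0.45 on a two-class task, the smoothed cross-entropy has a nonzero floor, which is why the training loss plateaus near 0.538 rather than approaching zero. A back-of-the-envelope check, assuming the usual Trainer smoothing formulation (the true class receives probability 1 - ε + ε/K and each other class ε/K):

```python
import math

eps, K = 0.45, 2
q_true = 1 - eps + eps / K   # 0.775
q_other = eps / K            # 0.225

# The loss floor is the entropy of the smoothed target distribution,
# reached when the model's predictions match it exactly.
floor = -(q_true * math.log(q_true) + q_other * math.log(q_other))
print(f"loss floor ≈ {floor:.4f}")  # ≈ 0.5332
```

The computed floor of about 0.533 sits just below the observed plateau, consistent with a model that has essentially fit the smoothed targets.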

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3