distilbert-finetuned-lr1e-05-epochs50

This model is a fine-tuned version of distilbert-base-cased-distilled-squad. The training dataset is not specified. It achieves the following results on the evaluation set:

Loss: 4.5477

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
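
As a rough illustration of what lr_scheduler_type: linear implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and takes the total of 500 steps from the training results table. The function name linear_lr and its parameters are illustrative and not part of the Transformers API.

```python
def linear_lr(step, base_lr=1e-5, total_steps=500, warmup_steps=0):
    """Learning rate under a linear decay schedule.

    Assumptions (not stated in the card): warmup_steps=0 and
    total_steps=500 (50 epochs x 10 steps per epoch).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the start, at the midpoint, and at the end of training.
    for step in (0, 250, 500):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 1e-05, halves by step 250 (epoch 25), and reaches zero at step 500.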

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	10	4.0110
No log	2.0	20	3.3394
No log	3.0	30	3.1992
No log	4.0	40	2.9902
No log	5.0	50	2.9628
No log	6.0	60	2.9346
No log	7.0	70	2.9844
No log	8.0	80	2.9660
No log	9.0	90	2.9239
No log	10.0	100	3.0764
No log	11.0	110	3.1964
No log	12.0	120	3.2409
No log	13.0	130	3.3191
No log	14.0	140	3.3747
No log	15.0	150	3.5559
No log	16.0	160	3.6678
No log	17.0	170	3.6692
No log	18.0	180	3.7116
No log	19.0	190	3.6768
No log	20.0	200	3.7929
No log	21.0	210	3.8766
No log	22.0	220	3.8967
No log	23.0	230	3.8982
No log	24.0	240	3.9140
No log	25.0	250	3.9563
No log	26.0	260	3.9702
No log	27.0	270	3.9615
No log	28.0	280	4.0481
No log	29.0	290	4.1172
No log	30.0	300	4.2297
No log	31.0	310	4.3585
No log	32.0	320	4.3186
No log	33.0	330	4.2844
No log	34.0	340	4.2662
No log	35.0	350	4.3037
No log	36.0	360	4.4106
No log	37.0	370	4.4208
No log	38.0	380	4.3877
No log	39.0	390	4.4133
No log	40.0	400	4.4798
No log	41.0	410	4.4925
No log	42.0	420	4.4595
No log	43.0	430	4.4402
No log	44.0	440	4.4379
No log	45.0	450	4.4711
No log	46.0	460	4.4953
No log	47.0	470	4.5282
No log	48.0	480	4.5400
No log	49.0	490	4.5472
0.4112	50.0	500	4.5477
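
The table shows validation loss bottoming out early and then climbing, while the only logged training loss (0.4112 at step 500) is far lower, a pattern consistent with overfitting. The short sketch below locates the best epoch from the validation losses copied from the table. The variable names are illustrative.

```python
# Validation loss per epoch, copied from the training results table (epochs 1..50).
val_loss = [
    4.0110, 3.3394, 3.1992, 2.9902, 2.9628, 2.9346, 2.9844, 2.9660, 2.9239, 3.0764,
    3.1964, 3.2409, 3.3191, 3.3747, 3.5559, 3.6678, 3.6692, 3.7116, 3.6768, 3.7929,
    3.8766, 3.8967, 3.8982, 3.9140, 3.9563, 3.9702, 3.9615, 4.0481, 4.1172, 4.2297,
    4.3585, 4.3186, 4.2844, 4.2662, 4.3037, 4.4106, 4.4208, 4.3877, 4.4133, 4.4798,
    4.4925, 4.4595, 4.4402, 4.4379, 4.4711, 4.4953, 4.5282, 4.5400, 4.5472, 4.5477,
]

# Epochs are 1-indexed in the table, list positions are 0-indexed.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_loss = val_loss[best_epoch - 1]
final_loss = val_loss[-1]

print(f"best epoch: {best_epoch}, validation loss: {best_loss:.4f}")
print(f"final epoch: {len(val_loss)}, validation loss: {final_loss:.4f}")
```

The minimum falls at epoch 9 (2.9239), well below the reported final loss of 4.5477, so a checkpoint from around that epoch would likely evaluate better than the final one.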

Framework versions

Transformers 4.28.1
Pytorch 2.0.0+cu118
Datasets 2.12.0
Tokenizers 0.13.3

