distilbert-finetuned-lr1e-06-epochs50

This model is a fine-tuned version of distilbert-base-cased-distilled-squad. The training dataset was not recorded when this card was generated. It achieves the following results on the evaluation set:

Loss: 3.1397

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
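
To make the schedule concrete, here is a minimal sketch of how a linear scheduler would decay the learning rate over this run. It assumes zero warmup steps and a decay to zero at the final step, since the card lists no warmup setting. The names `linear_lr` and `train_set_size_range` are hypothetical helpers for illustration, not part of any library. The step counts come from the training results table (500 steps over 50 epochs).

```python
# Sketch of a linear learning-rate schedule under stated assumptions:
# zero warmup steps (not listed on the card) and decay to zero at the last step.
BASE_LR = 1e-06        # learning_rate from the card
NUM_EPOCHS = 50        # num_epochs from the card
STEPS_PER_EPOCH = 10   # from the results table: step 10 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 500, matching the last table row
BATCH_SIZE = 16        # train_batch_size from the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def train_set_size_range(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that yield `steps_per_epoch` batches.

    Assumes a single device, no gradient accumulation, and that the last
    partial batch is kept rather than dropped.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    for step in (0, 250, 500):
        print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
    low, high = train_set_size_range(STEPS_PER_EPOCH, BATCH_SIZE)
    print(f"implied training set size: {low} to {high} examples")
```

Under these assumptions, 10 steps per epoch at batch size 16 implies a training set of roughly 145 to 160 examples, which is small for 50 epochs of fine-tuning.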

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	10	5.6380
No log	2.0	20	5.2148
No log	3.0	30	4.9729
No log	4.0	40	4.8036
No log	5.0	50	4.6566
No log	6.0	60	4.5248
No log	7.0	70	4.4054
No log	8.0	80	4.2868
No log	9.0	90	4.1864
No log	10.0	100	4.0935
No log	11.0	110	4.0126
No log	12.0	120	3.9390
No log	13.0	130	3.8698
No log	14.0	140	3.8036
No log	15.0	150	3.7400
No log	16.0	160	3.6834
No log	17.0	170	3.6343
No log	18.0	180	3.5871
No log	19.0	190	3.5456
No log	20.0	200	3.5103
No log	21.0	210	3.4753
No log	22.0	220	3.4419
No log	23.0	230	3.4087
No log	24.0	240	3.3805
No log	25.0	250	3.3562
No log	26.0	260	3.3345
No log	27.0	270	3.3151
No log	28.0	280	3.2957
No log	29.0	290	3.2772
No log	30.0	300	3.2620
No log	31.0	310	3.2497
No log	32.0	320	3.2358
No log	33.0	330	3.2254
No log	34.0	340	3.2158
No log	35.0	350	3.2057
No log	36.0	360	3.1972
No log	37.0	370	3.1877
No log	38.0	380	3.1800
No log	39.0	390	3.1722
No log	40.0	400	3.1664
No log	41.0	410	3.1630
No log	42.0	420	3.1585
No log	43.0	430	3.1538
No log	44.0	440	3.1488
No log	45.0	450	3.1454
No log	46.0	460	3.1422
No log	47.0	470	3.1414
No log	48.0	480	3.1407
No log	49.0	490	3.1399
2.8494	50.0	500	3.1397
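
The table can be summarized with a short script. The sketch below takes the validation loss at a few epochs, copied from the rows above, and measures how much it drops over each 10-epoch interval. It also computes the gap between validation and training loss at epoch 50, the only epoch with a logged training loss. The "No log" entries are probably a consequence of the trainer's logging interval being longer than the intervals between evaluations, but the card does not confirm this. The helper `interval_drops` is a hypothetical name used for illustration.

```python
# Validation loss at selected epochs, copied from the training results table.
VAL_LOSS = {1: 5.6380, 10: 4.0935, 20: 3.5103, 30: 3.2620, 40: 3.1664, 50: 3.1397}
FINAL_TRAIN_LOSS = 2.8494  # training loss logged at epoch 50


def interval_drops(losses):
    """Return (start_epoch, end_epoch, loss_drop) for consecutive recorded epochs."""
    epochs = sorted(losses)
    return [
        (start, end, round(losses[start] - losses[end], 4))
        for start, end in zip(epochs, epochs[1:])
    ]


drops = interval_drops(VAL_LOSS)
generalization_gap = round(VAL_LOSS[50] - FINAL_TRAIN_LOSS, 4)

if __name__ == "__main__":
    for start, end, drop in drops:
        print(f"epochs {start:2d} to {end:2d}: validation loss fell by {drop:.4f}")
    print(f"validation minus training loss at epoch 50: {generalization_gap:.4f}")
```

The drop per interval shrinks from about 1.54 over the first ten epochs to about 0.03 over the last ten, so the validation loss has largely plateaued by epoch 50 at this learning rate.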

Framework versions

Transformers 4.28.1
Pytorch 2.0.0+cu118
Datasets 2.12.0
Tokenizers 0.13.3

Model repository: gallyamovi/distilbert-finetuned-lr1e-06-epochs50
