digikala_products_parsbert_model

This model is a fine-tuned version of a base model that this card does not name, trained on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 2.7245

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
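
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step in plain Python. It assumes 25 optimizer steps per epoch (taken from the Step column of the training results table) and zero warmup steps, since the card lists no warmup. The constant and function names are hypothetical and are not taken from the original training script.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 2e-05
NUM_EPOCHS = 100

# Assumption: 25 optimizer steps per epoch, as the Step column of the
# training results table suggests (step 25 at epoch 1.0).
STEPS_PER_EPOCH = 25
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 2500

# Assumption: no warmup, because the card does not list any.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates with linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Linear decay from the base learning rate down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 500, 1250, 2000, 2500):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate falls to half its initial value at step 1250 (epoch 50) and reaches zero at step 2500 (epoch 100).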

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	25	7.8477
No log	2.0	50	7.0014
No log	3.0	75	6.3235
No log	4.0	100	5.6651
No log	5.0	125	4.9101
No log	6.0	150	4.2448
No log	7.0	175	3.8656
No log	8.0	200	3.4329
No log	9.0	225	3.3204
No log	10.0	250	3.0740
No log	11.0	275	2.9556
No log	12.0	300	2.9938
No log	13.0	325	2.8620
No log	14.0	350	2.7879
No log	15.0	375	2.8619
No log	16.0	400	2.8521
No log	17.0	425	2.7920
No log	18.0	450	2.8494
No log	19.0	475	2.8209
4.1477	20.0	500	2.8471
4.1477	21.0	525	2.8478
4.1477	22.0	550	2.7904
4.1477	23.0	575	2.7961
4.1477	24.0	600	2.7494
4.1477	25.0	625	2.8250
4.1477	26.0	650	2.7439
4.1477	27.0	675	2.7539
4.1477	28.0	700	2.7635
4.1477	29.0	725	2.7742
4.1477	30.0	750	2.7711
4.1477	31.0	775	2.8243
4.1477	32.0	800	2.7547
4.1477	33.0	825	2.7690
4.1477	34.0	850	2.7178
4.1477	35.0	875	2.7554
4.1477	36.0	900	2.7701
4.1477	37.0	925	2.7953
4.1477	38.0	950	2.8062
4.1477	39.0	975	2.7637
2.772	40.0	1000	2.7675
2.772	41.0	1025	2.7953
2.772	42.0	1050	2.8003
2.772	43.0	1075	2.7484
2.772	44.0	1100	2.7292
2.772	45.0	1125	2.7287
2.772	46.0	1150	2.6998
2.772	47.0	1175	2.7381
2.772	48.0	1200	2.7196
2.772	49.0	1225	2.7450
2.772	50.0	1250	2.7293
2.772	51.0	1275	2.7216
2.772	52.0	1300	2.7981
2.772	53.0	1325	2.7405
2.772	54.0	1350	2.7895
2.772	55.0	1375	2.7092
2.772	56.0	1400	2.7977
2.772	57.0	1425	2.7012
2.772	58.0	1450	2.7752
2.772	59.0	1475	2.7469
2.742	60.0	1500	2.7205
2.742	61.0	1525	2.7752
2.742	62.0	1550	2.6942
2.742	63.0	1575	2.6916
2.742	64.0	1600	2.8169
2.742	65.0	1625	2.7256
2.742	66.0	1650	2.6844
2.742	67.0	1675	2.7544
2.742	68.0	1700	2.7083
2.742	69.0	1725	2.7286
2.742	70.0	1750	2.7492
2.742	71.0	1775	2.6946
2.742	72.0	1800	2.7395
2.742	73.0	1825	2.7597
2.742	74.0	1850	2.7953
2.742	75.0	1875	2.7468
2.742	76.0	1900	2.7274
2.742	77.0	1925	2.7507
2.742	78.0	1950	2.7174
2.742	79.0	1975	2.7233
2.7185	80.0	2000	2.7405
2.7185	81.0	2025	2.7781
2.7185	82.0	2050	2.7534
2.7185	83.0	2075	2.7588
2.7185	84.0	2100	2.7469
2.7185	85.0	2125	2.6929
2.7185	86.0	2150	2.6785
2.7185	87.0	2175	2.7098
2.7185	88.0	2200	2.7622
2.7185	89.0	2225	2.7726
2.7185	90.0	2250	2.7144
2.7185	91.0	2275	2.7877
2.7185	92.0	2300	2.7665
2.7185	93.0	2325	2.7794
2.7185	94.0	2350	2.6788
2.7185	95.0	2375	2.7398
2.7185	96.0	2400	2.7277
2.7185	97.0	2425	2.8053
2.7185	98.0	2450	2.7537
2.7185	99.0	2475	2.7467
2.7057	100.0	2500	2.7191
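
The table can be scanned programmatically to find the epoch with the lowest validation loss. The sketch below parses a small excerpt of the rows above (tab-separated, with `No log` meaning no training loss had been logged yet) and picks the minimum. The `Row` type and `parse_row` helper are hypothetical names introduced for illustration, and only a few rows are included to keep the example short.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    train_loss: Optional[float]  # None where the table says "No log"
    epoch: float
    step: int
    val_loss: float


# Excerpt of the results table above, one tab-separated row per string.
EXCERPT = [
    "No log\t1.0\t25\t7.8477",
    "4.1477\t20.0\t500\t2.8471",
    "2.772\t46.0\t1150\t2.6998",
    "2.742\t66.0\t1650\t2.6844",
    "2.7185\t86.0\t2150\t2.6785",
    "2.7185\t94.0\t2350\t2.6788",
    "2.7057\t100.0\t2500\t2.7191",
]


def parse_row(line: str) -> Row:
    """Split one table row into typed fields."""
    train, epoch, step, val = line.split("\t")
    train_loss = None if train == "No log" else float(train)
    return Row(train_loss, float(epoch), int(step), float(val))


rows = [parse_row(line) for line in EXCERPT]
best = min(rows, key=lambda r: r.val_loss)  # row with the lowest validation loss
last = rows[-1]                             # final epoch

if __name__ == "__main__":
    print(f"best: epoch {best.epoch:.0f}, step {best.step}, val loss {best.val_loss}")
    print(f"last: epoch {last.epoch:.0f}, step {last.step}, val loss {last.val_loss}")
```

In the full table the lowest validation loss is 2.6785 at epoch 86 (step 2150), slightly below the final-epoch value of 2.7191. Validation loss stays roughly between 2.68 and 2.82 from about epoch 14 onward. The card does not say which checkpoint was published. Selecting by validation loss (for example with the Trainer's `load_best_model_at_end` option) is one possibility, not a confirmed detail of this training run.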

Framework versions

Transformers 4.26.0
PyTorch 1.13.1+cu116
Datasets 2.9.0
Tokenizers 0.13.2

danfarh2000/digikala_products_parsbert_model