
categorization-finetuned-20220721-164940-distilled-20220811-132317

This model is a fine-tuned version of carted-nlp/categorization-finetuned-20220721-164940 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1522
  • Accuracy: 0.8783
  • F1: 0.8779

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 64
  • eval_batch_size: 128
  • seed: 314
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 2000
  • num_epochs: 30.0
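The schedule above (linear warmup for 2000 steps, then cosine decay) can be sketched in plain Python. This is a minimal sketch mirroring the behavior of `transformers`' `get_cosine_schedule_with_warmup`, with the total step count taken from the last logged step in the results table; the function name `lr_at_step` is illustrative, not part of any library.

```python
import math

LEARNING_RATE = 4e-05
WARMUP_STEPS = 2000
TOTAL_STEPS = 132500  # last logged step in the training results below

def lr_at_step(step, base_lr=LEARNING_RATE,
               warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to base_lr, then cosine decay to zero
    (the shape produced by get_cosine_schedule_with_warmup)."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / (total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# The effective batch size is the per-step batch times the
# accumulation steps: 64 * 4 = 256 (total_train_batch_size).
effective_batch = 64 * 4
```

Note that with gradient accumulation, one optimizer step (and thus one scheduler step) corresponds to 4 forward/backward passes, which is why `total_train_batch_size` is 256 even though `train_batch_size` is 64.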

Training results

| Training Loss | Epoch | Step   | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:------:|:---------------:|:--------:|:------:|
| 0.5212        | 0.56  | 2500   | 0.2564          | 0.7953   | 0.7921 |
| 0.243         | 1.12  | 5000   | 0.2110          | 0.8270   | 0.8249 |
| 0.2105        | 1.69  | 7500   | 0.1925          | 0.8409   | 0.8391 |
| 0.1939        | 2.25  | 10000  | 0.1837          | 0.8476   | 0.8465 |
| 0.1838        | 2.81  | 12500  | 0.1771          | 0.8528   | 0.8517 |
| 0.1729        | 3.37  | 15000  | 0.1722          | 0.8564   | 0.8555 |
| 0.1687        | 3.94  | 17500  | 0.1684          | 0.8593   | 0.8576 |
| 0.1602        | 4.5   | 20000  | 0.1653          | 0.8614   | 0.8604 |
| 0.1572        | 5.06  | 22500  | 0.1629          | 0.8648   | 0.8638 |
| 0.1507        | 5.62  | 25000  | 0.1605          | 0.8654   | 0.8646 |
| 0.1483        | 6.19  | 27500  | 0.1602          | 0.8661   | 0.8653 |
| 0.1431        | 6.75  | 30000  | 0.1597          | 0.8669   | 0.8663 |
| 0.1393        | 7.31  | 32500  | 0.1581          | 0.8691   | 0.8687 |
| 0.1374        | 7.87  | 35000  | 0.1556          | 0.8704   | 0.8697 |
| 0.1321        | 8.43  | 37500  | 0.1558          | 0.8707   | 0.8700 |
| 0.1328        | 9.0   | 40000  | 0.1536          | 0.8719   | 0.8711 |
| 0.1261        | 9.56  | 42500  | 0.1544          | 0.8716   | 0.8708 |
| 0.1256        | 10.12 | 45000  | 0.1541          | 0.8731   | 0.8725 |
| 0.122         | 10.68 | 47500  | 0.1520          | 0.8741   | 0.8734 |
| 0.1196        | 11.25 | 50000  | 0.1529          | 0.8734   | 0.8728 |
| 0.1182        | 11.81 | 52500  | 0.1510          | 0.8758   | 0.8751 |
| 0.1145        | 12.37 | 55000  | 0.1526          | 0.8746   | 0.8737 |
| 0.1141        | 12.93 | 57500  | 0.1512          | 0.8765   | 0.8759 |
| 0.1094        | 13.5  | 60000  | 0.1517          | 0.8760   | 0.8753 |
| 0.1098        | 14.06 | 62500  | 0.1513          | 0.8771   | 0.8764 |
| 0.1058        | 14.62 | 65000  | 0.1506          | 0.8775   | 0.8768 |
| 0.1048        | 15.18 | 67500  | 0.1521          | 0.8774   | 0.8768 |
| 0.1028        | 15.74 | 70000  | 0.1520          | 0.8778   | 0.8773 |
| 0.1006        | 16.31 | 72500  | 0.1517          | 0.8780   | 0.8774 |
| 0.1001        | 16.87 | 75000  | 0.1505          | 0.8794   | 0.8790 |
| 0.0971        | 17.43 | 77500  | 0.1520          | 0.8784   | 0.8778 |
| 0.0973        | 17.99 | 80000  | 0.1514          | 0.8796   | 0.8790 |
| 0.0938        | 18.56 | 82500  | 0.1516          | 0.8795   | 0.8789 |
| 0.0942        | 19.12 | 85000  | 0.1522          | 0.8794   | 0.8789 |
| 0.0918        | 19.68 | 87500  | 0.1518          | 0.8799   | 0.8793 |
| 0.0909        | 20.24 | 90000  | 0.1528          | 0.8803   | 0.8796 |
| 0.0901        | 20.81 | 92500  | 0.1516          | 0.8799   | 0.8793 |
| 0.0882        | 21.37 | 95000  | 0.1519          | 0.8800   | 0.8794 |
| 0.088         | 21.93 | 97500  | 0.1517          | 0.8802   | 0.8798 |
| 0.086         | 22.49 | 100000 | 0.1530          | 0.8800   | 0.8795 |
| 0.0861        | 23.05 | 102500 | 0.1523          | 0.8806   | 0.8801 |
| 0.0846        | 23.62 | 105000 | 0.1524          | 0.8808   | 0.8802 |
| 0.0843        | 24.18 | 107500 | 0.1522          | 0.8805   | 0.8800 |
| 0.0836        | 24.74 | 110000 | 0.1525          | 0.8808   | 0.8803 |
| 0.083         | 25.3  | 112500 | 0.1528          | 0.8810   | 0.8803 |
| 0.0829        | 25.87 | 115000 | 0.1528          | 0.8808   | 0.8802 |
| 0.082         | 26.43 | 117500 | 0.1529          | 0.8808   | 0.8802 |
| 0.0818        | 26.99 | 120000 | 0.1525          | 0.8811   | 0.8805 |
| 0.0816        | 27.55 | 122500 | 0.1526          | 0.8811   | 0.8806 |
| 0.0809        | 28.12 | 125000 | 0.1528          | 0.8810   | 0.8805 |
| 0.0809        | 28.68 | 127500 | 0.1527          | 0.8810   | 0.8804 |
| 0.0814        | 29.24 | 130000 | 0.1528          | 0.8808   | 0.8802 |
| 0.0807        | 29.8  | 132500 | 0.1528          | 0.8808   | 0.8802 |

Framework versions

  • Transformers 4.17.0
  • PyTorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.11.6