categorization-finetuned-20220721-164940-distilled-20220810-185342

This model is a fine-tuned version of carted-nlp/categorization-finetuned-20220721-164940 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0639
  • Accuracy: 0.87
  • F1: 0.8690
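
Below is a minimal usage sketch for loading the checkpoint with the transformers text-classification pipeline. The repository id is inferred from the card title, and the task and label set are assumptions, since the card does not document the training data.

```python
# Hedged usage sketch: assumes this is a standard sequence-classification
# checkpoint hosted under the carted-nlp organization (id inferred from the
# card title). Label names depend on the unspecified training data.
from transformers import pipeline

model_id = "carted-nlp/categorization-finetuned-20220721-164940-distilled-20220810-185342"
classifier = pipeline("text-classification", model=model_id)

print(classifier("Stainless steel kitchen mixing bowl, 3 quart"))
# -> [{'label': '<category>', 'score': ...}]; labels depend on the dataset
```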

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 314
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1500
  • num_epochs: 30.0
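
For reproducibility, the list above maps directly onto transformers.TrainingArguments; the sketch below shows that mapping against the Transformers 4.17 API. output_dir is a placeholder, and any setting not listed above is left at its default.

```python
# Sketch of the hyperparameters above expressed as TrainingArguments.
# output_dir is a placeholder; unlisted options keep their defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="categorization-distilled",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=314,
    gradient_accumulation_steps=4,  # 64 * 4 = 256 total train batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1500,
    num_train_epochs=30.0,
)
```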

Training results

| Training Loss | Epoch | Step   | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:------:|:---------------:|:--------:|:------:|
| 0.269         | 0.56  | 2500   | 0.1280          | 0.7547   | 0.7461 |
| 0.125         | 1.12  | 5000   | 0.1052          | 0.7960   | 0.7916 |
| 0.1079        | 1.69  | 7500   | 0.0950          | 0.8132   | 0.8102 |
| 0.0992        | 2.25  | 10000  | 0.0898          | 0.8216   | 0.8188 |
| 0.0938        | 2.81  | 12500  | 0.0859          | 0.8294   | 0.8268 |
| 0.0891        | 3.37  | 15000  | 0.0828          | 0.8349   | 0.8329 |
| 0.0863        | 3.94  | 17500  | 0.0806          | 0.8391   | 0.8367 |
| 0.0834        | 4.5   | 20000  | 0.0788          | 0.8417   | 0.8400 |
| 0.081         | 5.06  | 22500  | 0.0774          | 0.8449   | 0.8430 |
| 0.0792        | 5.62  | 25000  | 0.0754          | 0.8475   | 0.8460 |
| 0.0778        | 6.19  | 27500  | 0.0749          | 0.8489   | 0.8474 |
| 0.0758        | 6.75  | 30000  | 0.0738          | 0.8517   | 0.8502 |
| 0.0745        | 7.31  | 32500  | 0.0729          | 0.8531   | 0.8519 |
| 0.0733        | 7.87  | 35000  | 0.0720          | 0.8544   | 0.8528 |
| 0.072         | 8.43  | 37500  | 0.0714          | 0.8559   | 0.8546 |
| 0.0716        | 9.0   | 40000  | 0.0707          | 0.8565   | 0.8554 |
| 0.0701        | 9.56  | 42500  | 0.0704          | 0.8574   | 0.8558 |
| 0.0693        | 10.12 | 45000  | 0.0700          | 0.8581   | 0.8569 |
| 0.0686        | 10.68 | 47500  | 0.0690          | 0.8600   | 0.8588 |
| 0.0675        | 11.25 | 50000  | 0.0690          | 0.8605   | 0.8593 |
| 0.0673        | 11.81 | 52500  | 0.0682          | 0.8614   | 0.8603 |
| 0.0663        | 12.37 | 55000  | 0.0682          | 0.8619   | 0.8606 |
| 0.0657        | 12.93 | 57500  | 0.0675          | 0.8634   | 0.8624 |
| 0.0648        | 13.5  | 60000  | 0.0674          | 0.8636   | 0.8625 |
| 0.0647        | 14.06 | 62500  | 0.0668          | 0.8644   | 0.8633 |
| 0.0638        | 14.62 | 65000  | 0.0669          | 0.8648   | 0.8635 |
| 0.0634        | 15.18 | 67500  | 0.0665          | 0.8654   | 0.8643 |
| 0.063         | 15.74 | 70000  | 0.0663          | 0.8664   | 0.8654 |
| 0.0623        | 16.31 | 72500  | 0.0662          | 0.8663   | 0.8652 |
| 0.0622        | 16.87 | 75000  | 0.0657          | 0.8669   | 0.8660 |
| 0.0615        | 17.43 | 77500  | 0.0658          | 0.8670   | 0.8660 |
| 0.0616        | 17.99 | 80000  | 0.0655          | 0.8676   | 0.8667 |
| 0.0608        | 18.56 | 82500  | 0.0653          | 0.8683   | 0.8672 |
| 0.0606        | 19.12 | 85000  | 0.0653          | 0.8679   | 0.8669 |
| 0.0602        | 19.68 | 87500  | 0.0648          | 0.8690   | 0.8680 |
| 0.0599        | 20.24 | 90000  | 0.0650          | 0.8688   | 0.8677 |
| 0.0598        | 20.81 | 92500  | 0.0647          | 0.8689   | 0.8680 |
| 0.0592        | 21.37 | 95000  | 0.0647          | 0.8692   | 0.8681 |
| 0.0591        | 21.93 | 97500  | 0.0646          | 0.8698   | 0.8688 |
| 0.0587        | 22.49 | 100000 | 0.0645          | 0.8699   | 0.8690 |
| 0.0586        | 23.05 | 102500 | 0.0644          | 0.8699   | 0.8690 |
| 0.0583        | 23.62 | 105000 | 0.0644          | 0.8699   | 0.8690 |
| 0.058         | 24.18 | 107500 | 0.0642          | 0.8703   | 0.8693 |
| 0.058         | 24.74 | 110000 | 0.0642          | 0.8704   | 0.8694 |
| 0.0578        | 25.3  | 112500 | 0.0641          | 0.8703   | 0.8693 |
| 0.0576        | 25.87 | 115000 | 0.0641          | 0.8708   | 0.8699 |
| 0.0573        | 26.43 | 117500 | 0.0641          | 0.8708   | 0.8698 |
| 0.0574        | 26.99 | 120000 | 0.0639          | 0.8711   | 0.8702 |
| 0.0571        | 27.55 | 122500 | 0.0640          | 0.8711   | 0.8701 |
| 0.0569        | 28.12 | 125000 | 0.0639          | 0.8711   | 0.8702 |
| 0.0569        | 28.68 | 127500 | 0.0639          | 0.8712   | 0.8703 |
| 0.057         | 29.24 | 130000 | 0.0639          | 0.8712   | 0.8703 |
| 0.0566        | 29.8  | 132500 | 0.0638          | 0.8713   | 0.8704 |
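
The Accuracy and F1 columns can be reproduced with a standard compute_metrics callback passed to the Trainer; a sketch follows. The F1 averaging mode is an assumption (weighted, chosen because F1 tracks accuracy closely here); the card does not state it.

```python
# Hedged sketch of a compute_metrics callback that would yield the accuracy
# and F1 columns above. average="weighted" is an assumption, not documented.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="weighted"),
    }
```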

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.11.6