loose_balanced_cf_seed-21_1e-3

This model was trained from scratch on an unknown dataset. On the evaluation set, the final checkpoint (step 29700) achieves the following results:

  • Loss: 3.1893
  • Accuracy: 0.4004

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 21
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
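
The batch-size and scheduler values above can be cross-checked with a little arithmetic. The sketch below is not taken from the training code. It assumes a single device, so that total_train_batch_size = train_batch_size × gradient_accumulation_steps. It also assumes that training stopped at step 29700, the last step logged in the results, and that the linear schedule has the usual shape: linear warmup to the peak rate, then linear decay to zero. The function names are hypothetical. Under these assumptions the 32000 warmup steps exceed the total step count, so the learning rate would still be ramping up when training ended and would never reach 0.001.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values from the hyperparameter list; TOTAL_STEPS is assumed from the last logged step.
PEAK_LR = 1e-3
WARMUP_STEPS = 32000
TOTAL_STEPS = 29700

print("effective batch size:", effective_batch_size(32, 8))
for step in (0, 1485, 14858, 29700):
    lr = linear_schedule_lr(step, PEAK_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>5}: lr = {lr:.6f}")
```

If the run did use more than one device, or was configured for more steps than were logged, the numbers change accordingly.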

Training results

Training Loss  Epoch    Step   Validation Loss  Accuracy
5.9852         0.9994   1485   4.4088           0.2942
4.3009         1.9995   2971   3.9064           0.3329
3.683          2.9996   4457   3.6296           0.3564
3.4929         3.9997   5943   3.4703           0.3715
3.2595         4.9997   7429   3.3734           0.3806
3.187          5.9998   8915   3.3128           0.3861
3.0814         6.9999   10401  3.2790           0.3895
3.0421         8.0      11887  3.2530           0.3919
2.9817         8.9994   13372  3.2345           0.3942
2.9585         9.9995   14858  3.2191           0.3958
2.9213         10.9996  16344  3.2119           0.3970
2.897          11.9997  17830  3.2048           0.3979
2.8782         12.9997  19316  3.1990           0.3987
2.8603         13.9998  20802  3.1978           0.3989
2.8496         14.9999  22288  3.1967           0.3992
2.8323         16.0     23774  3.1947           0.3995
2.8308         16.9994  25259  3.1966           0.3997
2.8119         17.9995  26745  3.1920           0.4000
2.8173         18.9996  28231  3.1855           0.4005
2.8007         19.9882  29700  3.1893           0.4004
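
To read the results more easily, the validation loss can be converted to perplexity and the checkpoint with the lowest validation loss picked out. This is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats. The card does not state this, though it is the usual convention for language-model losses. The names `perplexity` and `eval_log` are hypothetical, and the tuples copy the Step, Validation Loss, and Accuracy columns above.

```python
import math


def perplexity(cross_entropy_loss):
    """exp(loss); meaningful only if the loss is a mean cross-entropy in nats."""
    return math.exp(cross_entropy_loss)


# (step, validation loss, accuracy), copied from the training results.
eval_log = [
    (1485, 4.4088, 0.2942), (2971, 3.9064, 0.3329), (4457, 3.6296, 0.3564),
    (5943, 3.4703, 0.3715), (7429, 3.3734, 0.3806), (8915, 3.3128, 0.3861),
    (10401, 3.2790, 0.3895), (11887, 3.2530, 0.3919), (13372, 3.2345, 0.3942),
    (14858, 3.2191, 0.3958), (16344, 3.2119, 0.3970), (17830, 3.2048, 0.3979),
    (19316, 3.1990, 0.3987), (20802, 3.1978, 0.3989), (22288, 3.1967, 0.3992),
    (23774, 3.1947, 0.3995), (25259, 3.1966, 0.3997), (26745, 3.1920, 0.4000),
    (28231, 3.1855, 0.4005), (29700, 3.1893, 0.4004),
]

# Checkpoint with the lowest validation loss, and the final checkpoint.
best_step, best_loss, best_acc = min(eval_log, key=lambda row: row[1])
final_step, final_loss, final_acc = eval_log[-1]

print(f"best:  step {best_step}, loss {best_loss}, ppl {perplexity(best_loss):.2f}")
print(f"final: step {final_step}, loss {final_loss}, ppl {perplexity(final_loss):.2f}")
```

In this log the lowest validation loss falls at step 28231, one evaluation before the final checkpoint, although the difference from the final value is small.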

Framework versions

  • Transformers 4.46.2
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.20.0

Model size

  • Format: Safetensors
  • Parameters: 110M
  • Tensor type: F32