---
library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: counterfactual_seed-21_1e-3
    results: []
---

counterfactual_seed-21_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1823
  • Accuracy: 0.4006
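If the reported loss is token-level cross-entropy (typical for a `transformers` Trainer language-model evaluation; an assumption here, since the metric setup is not documented), the corresponding perplexity follows directly:

```python
import math

# Cross-entropy loss on the evaluation set, from the results above.
eval_loss = 3.1823

# Perplexity is exp(loss) when the loss is mean token-level cross-entropy.
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))  # ~24.1
```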

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 21
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
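The effective batch size above follows from train_batch_size × gradient_accumulation_steps, and the linear warmup ramps the learning rate from 0 to the peak over the warmup steps. A minimal sketch of that arithmetic (illustrative only, not the training script itself):

```python
# Effective batch size from the hyperparameters above.
train_batch_size = 32
gradient_accumulation_steps = 8
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 256

# Linear warmup: the learning rate grows linearly from 0 to peak_lr
# over warmup_steps, then (with a linear scheduler) would decay.
def warmup_lr(step, peak_lr=1e-3, warmup_steps=32000):
    """Learning rate during the warmup phase of a linear schedule."""
    return peak_lr * min(step, warmup_steps) / warmup_steps

print(total_train_batch_size)   # 256
print(warmup_lr(16000))         # 0.0005, halfway through warmup
print(warmup_lr(29720))         # LR at the last logged step, ~0.000929
```

Note that the last logged step in the results table (29,720) is below lr_scheduler_warmup_steps (32,000), so under this schedule the learning rate was still ramping up when training ended and never reached the peak value of 0.001.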

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|--------------:|--------:|------:|----------------:|---------:|
| 5.989         | 0.9994  | 1486  | 4.4159          | 0.2934   |
| 4.3089        | 1.9992  | 2972  | 3.9080          | 0.3322   |
| 3.7012        | 2.9991  | 4458  | 3.6264          | 0.3562   |
| 3.5317        | 3.9996  | 5945  | 3.4692          | 0.3708   |
| 3.3115        | 4.9994  | 7431  | 3.3732          | 0.3799   |
| 3.2381        | 5.9992  | 8917  | 3.3111          | 0.3859   |
| 3.1324        | 6.9991  | 10403 | 3.2706          | 0.3900   |
| 3.0931        | 7.9996  | 11890 | 3.2468          | 0.3921   |
| 3.0318        | 8.9994  | 13376 | 3.2260          | 0.3944   |
| 3.0079        | 9.9992  | 14862 | 3.2155          | 0.3960   |
| 2.9696        | 10.9991 | 16348 | 3.2068          | 0.3968   |
| 2.9483        | 11.9996 | 17835 | 3.2019          | 0.3973   |
| 2.9289        | 12.9994 | 19321 | 3.2017          | 0.3978   |
| 2.9089        | 13.9992 | 20807 | 3.1979          | 0.3988   |
| 2.9           | 14.9991 | 22293 | 3.1900          | 0.3990   |
| 2.8805        | 15.9996 | 23780 | 3.1883          | 0.3997   |
| 2.8807        | 16.9994 | 25266 | 3.1876          | 0.3999   |
| 2.8605        | 17.9992 | 26752 | 3.1893          | 0.3999   |
| 2.8688        | 18.9991 | 28238 | 3.1880          | 0.4002   |
| 2.8484        | 19.9962 | 29720 | 3.1823          | 0.4006   |

Framework versions

  • Transformers 4.46.2
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.20.0