getcode

This model is a fine-tuned version of facebook/opt-125m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3129
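
The reported loss can be easier to interpret as a perplexity. The minimal sketch below assumes the value is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card.

```python
import math

# Evaluation loss reported above.
# Assumption: mean per-token cross-entropy measured in nats.
eval_loss = 1.3129

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.2f}")  # roughly 3.72
```

Because the evaluation data is not documented, this figure is only comparable to other runs evaluated on the same set.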

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0004
  • train_batch_size: 16
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
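
The relationship between these values can be checked with a little arithmetic. The sketch below is illustrative, not the training script. It derives the implied device count from the batch settings and approximates a linear learning-rate schedule. Two assumptions are made: there are no warmup steps (none are listed in this card), and the scheduler horizon is 420 steps (the last step in the training results table). `linear_lr` is a hypothetical helper, not a library function.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 4e-4
TRAIN_BATCH_SIZE = 16           # per-device batch size
GRAD_ACCUM_STEPS = 8
TOTAL_TRAIN_BATCH_SIZE = 128

# Assumptions, not stated in the hyperparameter list:
TOTAL_STEPS = 420               # last step shown in the training results
WARMUP_STEPS = 0                # no warmup is listed in the card

# Effective batch = per-device batch * accumulation steps * number of devices,
# so the device count can be recovered by division.
num_devices = TOTAL_TRAIN_BATCH_SIZE // (TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


print("implied devices:", num_devices)
for step in (0, 210, 420):
    print(step, linear_lr(step))
```

Under these assumptions the settings are consistent with a single device, and the learning rate falls from 0.0004 at step 0 to zero at step 420.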

Training results

Training Loss   Epoch     Step   Validation Loss
No log          1.0       22     1.8588
No log          2.0       44     1.4885
No log          3.0       66     1.3922
No log          4.0       88     1.3830
No log          5.0       110    1.3547
No log          6.0       132    1.3556
No log          7.0       154    1.3672
No log          8.0       176    1.3509
No log          9.0       198    1.3414
No log          10.0      220    1.3428
No log          11.0      242    1.3361
No log          12.0      264    1.3332
No log          13.0      286    1.3333
No log          14.0      308    1.3280
No log          15.0      330    1.3232
No log          16.0      352    1.3215
No log          17.0      374    1.3185
No log          18.0      396    1.3142
No log          19.0      418    1.3129
No log          19.0947   420    1.3129
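
The validation losses above can be summarized programmatically. The following sketch copies the rows from the table and reports the best checkpoint, the relative improvement from the first to the last evaluation, and the epochs where validation loss rose compared with the previous evaluation. The variable names are illustrative.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
ROWS = [
    (1.0, 22, 1.8588), (2.0, 44, 1.4885), (3.0, 66, 1.3922),
    (4.0, 88, 1.3830), (5.0, 110, 1.3547), (6.0, 132, 1.3556),
    (7.0, 154, 1.3672), (8.0, 176, 1.3509), (9.0, 198, 1.3414),
    (10.0, 220, 1.3428), (11.0, 242, 1.3361), (12.0, 264, 1.3332),
    (13.0, 286, 1.3333), (14.0, 308, 1.3280), (15.0, 330, 1.3232),
    (16.0, 352, 1.3215), (17.0, 374, 1.3185), (18.0, 396, 1.3142),
    (19.0, 418, 1.3129), (19.0947, 420, 1.3129),
]

# Lowest validation loss; min() keeps the earliest row when values tie.
best_epoch, best_step, best_loss = min(ROWS, key=lambda row: row[2])

# Relative improvement between the first and the last evaluation.
first_loss, final_loss = ROWS[0][2], ROWS[-1][2]
relative_drop = (first_loss - final_loss) / first_loss

# Epochs whose validation loss is higher than the previous evaluation.
regressions = [cur[0] for prev, cur in zip(ROWS, ROWS[1:]) if cur[2] > prev[2]]

print(f"best: epoch {best_epoch}, step {best_step}, loss {best_loss}")
print(f"relative drop: {relative_drop:.1%}")
print("epochs with a higher loss than the previous one:", regressions)
```

The loss falls by roughly 29% overall, with most of the gain in the first five epochs. The later increases are small (at most about 0.012).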

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size: 125M params (Safetensors, tensor type F32)

Model tree for ccore/getcode

Base model: facebook/opt-125m (71 fine-tuned models, including this one)