collapse_gemma-2-2b_hs2_replace_iter19_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6313
  • Num Input Tokens Seen: 4586296

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 2
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
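
The effective batch size and warmup length follow directly from these settings; a minimal sketch of the arithmetic (the per-epoch step count is an estimate inferred from the training-results table below, not a logged value):

```python
import math

# Values from the hyperparameter list above
train_batch_size = 8              # per-device micro-batch
gradient_accumulation_steps = 16
warmup_ratio = 0.05

# Effective (total) train batch size, matching the reported 128
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128

# Approximate optimizer steps in one epoch, inferred from the results
# table (step 95 corresponds to epoch 0.9744) -- an estimate
steps_per_epoch = round(95 / 0.9744)
print(steps_per_epoch)  # 97

# Warmup steps under constant_with_warmup (Transformers rounds the
# ratio * max_steps product up)
warmup_steps = math.ceil(warmup_ratio * steps_per_epoch)
print(warmup_steps)  # 5
```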

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.6265        | 0.0513 | 5    | 1.2809          | 240576            |
| 0.8464        | 0.1026 | 10   | 1.3184          | 478736            |
| 0.464         | 0.1538 | 15   | 1.5648          | 714752            |
| 0.1587        | 0.2051 | 20   | 1.7575          | 949952            |
| 0.1115        | 0.2564 | 25   | 1.9885          | 1188848           |
| 0.0459        | 0.3077 | 30   | 2.2789          | 1425720           |
| 0.0281        | 0.3590 | 35   | 2.4173          | 1661440           |
| 0.0227        | 0.4103 | 40   | 2.5453          | 1883136           |
| 0.0234        | 0.4615 | 45   | 2.6143          | 2119680           |
| 0.0251        | 0.5128 | 50   | 2.6521          | 2357408           |
| 0.0212        | 0.5641 | 55   | 2.6451          | 2594336           |
| 0.0217        | 0.6154 | 60   | 2.6474          | 2834280           |
| 0.02          | 0.6667 | 65   | 2.6435          | 3067200           |
| 0.0212        | 0.7179 | 70   | 2.6357          | 3305472           |
| 0.0208        | 0.7692 | 75   | 2.6311          | 3547016           |
| 0.0198        | 0.8205 | 80   | 2.6427          | 3785376           |
| 0.0212        | 0.8718 | 85   | 2.6364          | 4022648           |
| 0.0194        | 0.9231 | 90   | 2.6294          | 4260808           |
| 0.0206        | 0.9744 | 95   | 2.6309          | 4491424           |
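Validation loss bottoms out early (1.2809 at step 5) and then climbs while training loss collapses toward 0.02, a typical overfitting signature. A small sketch for locating the lowest-validation-loss row, with the data transcribed from the table above:

```python
# (step, validation_loss) pairs transcribed from the training-results table
history = [
    (0, 1.3909), (5, 1.2809), (10, 1.3184), (15, 1.5648), (20, 1.7575),
    (25, 1.9885), (30, 2.2789), (35, 2.4173), (40, 2.5453), (45, 2.6143),
    (50, 2.6521), (55, 2.6451), (60, 2.6474), (65, 2.6435), (70, 2.6357),
    (75, 2.6311), (80, 2.6427), (85, 2.6364), (90, 2.6294), (95, 2.6309),
]

# Row with the lowest validation loss
best_step, best_loss = min(history, key=lambda row: row[1])
print(best_step, best_loss)  # 5 1.2809
```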

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
Model size

  • 2.61B params (safetensors, BF16)

Model tree for RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter19_sftsd2

Base model

  • google/gemma-2-2b