collapse_gemma-2-9b_hs2_replace_iter3_sftsd0

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3146
  • Num Input Tokens Seen: 4629248

Model description

More information needed

Intended uses & limitations

More information needed
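Pending fuller documentation, here is a minimal loading sketch. It assumes the checkpoint is hosted on the Hugging Face Hub under the repository id RylanSchaeffer/collapse_gemma-2-9b_hs2_replace_iter3_sftsd0 and uses the standard transformers AutoModelForCausalLM API; this is a sketch, not an endorsed usage recipe.

```python
def load_model(model_id="RylanSchaeffer/collapse_gemma-2-9b_hs2_replace_iter3_sftsd0"):
    """Load the fine-tuned checkpoint in BF16 (the tensor type reported for this model).

    Imports are deferred so the sketch can be defined without transformers
    installed; actually running it requires `pip install transformers torch
    accelerate` and enough memory for a 9.24B-parameter model.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the reported tensor type
        device_map="auto",           # requires accelerate
    )
    return tokenizer, model
```

The deferred imports are a deliberate choice: the function can be defined and inspected anywhere, and the heavy dependencies are only pulled in when a download is actually attempted.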

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
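The per-device batch size and gradient-accumulation steps above imply the reported total batch size; a quick arithmetic check (values copied from the list):

```python
# Effective batch size = per-device batch size x gradient accumulation steps.
# (Single-device training assumed; with N devices, multiply by N as well.)
train_batch_size = 4
gradient_accumulation_steps = 32

effective_batch_size = train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 128, matching the reported total_train_batch_size
```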

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.2335          | 0                 |
| 1.1505        | 0.0514 | 5    | 1.0838          | 237896            |
| 0.5751        | 0.1028 | 10   | 1.1218          | 477308            |
| 0.257         | 0.1541 | 15   | 1.1457          | 712540            |
| 0.0988        | 0.2055 | 20   | 1.2485          | 950820            |
| 0.073         | 0.2569 | 25   | 1.3308          | 1186016           |
| 0.0413        | 0.3083 | 30   | 1.2650          | 1434072           |
| 0.0789        | 0.3597 | 35   | 1.2347          | 1670144           |
| 0.0581        | 0.4110 | 40   | 1.1793          | 1906636           |
| 0.0455        | 0.4624 | 45   | 1.1711          | 2150924           |
| 0.0256        | 0.5138 | 50   | 1.2403          | 2385516           |
| 0.0448        | 0.5652 | 55   | 1.2306          | 2630840           |
| 0.0498        | 0.6166 | 60   | 1.2123          | 2870696           |
| 0.0334        | 0.6680 | 65   | 1.2054          | 3105528           |
| 0.0319        | 0.7193 | 70   | 1.2169          | 3337616           |
| 0.0269        | 0.7707 | 75   | 1.2902          | 3571224           |
| 0.0304        | 0.8221 | 80   | 1.2938          | 3813928           |
| 0.0335        | 0.8735 | 85   | 1.2884          | 4054540           |
| 0.0272        | 0.9249 | 90   | 1.2766          | 4299016           |
| 0.0282        | 0.9762 | 95   | 1.3041          | 4533368           |
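As a rough throughput sanity check on the log above, the average number of input tokens consumed per optimization step can be taken between the step-5 and step-95 checkpoints (values copied from the table):

```python
# Average input tokens per optimization step between two logged checkpoints.
tokens_at_step_5 = 237_896
tokens_at_step_95 = 4_533_368

tokens_per_step = (tokens_at_step_95 - tokens_at_step_5) / (95 - 5)
print(round(tokens_per_step))  # 47727
```

At an effective batch size of 128 sequences, that works out to roughly 370 tokens per sequence on average, which is only an inference from the logged counts, not a documented property of the dataset.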

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

  • 9.24B params (tensor type: BF16, stored as Safetensors)

Model tree for RylanSchaeffer/collapse_gemma-2-9b_hs2_replace_iter3_sftsd0

  • Base model: google/gemma-2-9b (this model is one of 226 fine-tunes of that base)