
gemma7_on_korean_events

This model is a fine-tuned version of beomi/gemma-ko-7b on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5534

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 10 (train_batch_size × gradient_accumulation_steps = 2 × 5)
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • training_steps: 760
  • mixed_precision_training: Native AMP
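
As a minimal sketch, the settings above correspond to the transformers TrainingArguments below. The actual training script is not included in this card, so the output_dir, the 20-step evaluation cadence (inferred from the results table), and the Trainer wiring are assumptions.

```python
# Illustrative only: reconstructs the documented hyperparameters as
# TrainingArguments; model and dataset setup are not shown in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma7_on_korean_events",  # assumption: matches the model name
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=5,  # effective train batch size: 2 * 5 = 10
    lr_scheduler_type="linear",
    warmup_steps=50,
    max_steps=760,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",          # assumption: eval every 20 steps, per the table
    eval_steps=20,
    logging_steps=20,
)
```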

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.0063        | 0.2632 | 20   | 0.6860          |
| 0.4681        | 0.5263 | 40   | 0.3754          |
| 0.3565        | 0.7895 | 60   | 0.3030          |
| 0.2872        | 1.0526 | 80   | 0.2880          |
| 0.2557        | 1.3158 | 100  | 0.2723          |
| 0.2213        | 1.5789 | 120  | 0.2674          |
| 0.2386        | 1.8421 | 140  | 0.2537          |
| 0.1885        | 2.1053 | 160  | 0.3032          |
| 0.1362        | 2.3684 | 180  | 0.2710          |
| 0.131         | 2.6316 | 200  | 0.2844          |
| 0.1366        | 2.8947 | 220  | 0.2626          |
| 0.0935        | 3.1579 | 240  | 0.3632          |
| 0.0618        | 3.4211 | 260  | 0.3315          |
| 0.0672        | 3.6842 | 280  | 0.3255          |
| 0.0684        | 3.9474 | 300  | 0.3238          |
| 0.0393        | 4.2105 | 320  | 0.4230          |
| 0.0311        | 4.4737 | 340  | 0.4180          |
| 0.0337        | 4.7368 | 360  | 0.3933          |
| 0.0402        | 5.0    | 380  | 0.3846          |
| 0.0179        | 5.2632 | 400  | 0.4478          |
| 0.0176        | 5.5263 | 420  | 0.4464          |
| 0.0274        | 5.7895 | 440  | 0.4003          |
| 0.017         | 6.0526 | 460  | 0.4284          |
| 0.0112        | 6.3158 | 480  | 0.4675          |
| 0.0107        | 6.5789 | 500  | 0.4715          |
| 0.0115        | 6.8421 | 520  | 0.4911          |
| 0.0107        | 7.1053 | 540  | 0.4776          |
| 0.0053        | 7.3684 | 560  | 0.4829          |
| 0.0049        | 7.6316 | 580  | 0.4962          |
| 0.0046        | 7.8947 | 600  | 0.5087          |
| 0.0039        | 8.1579 | 620  | 0.5240          |
| 0.0028        | 8.4211 | 640  | 0.5317          |
| 0.0035        | 8.6842 | 660  | 0.5351          |
| 0.0034        | 8.9474 | 680  | 0.5393          |
| 0.0013        | 9.2105 | 700  | 0.5445          |
| 0.0031        | 9.4737 | 720  | 0.5502          |
| 0.0018        | 9.7368 | 740  | 0.5523          |
| 0.002         | 10.0   | 760  | 0.5534          |

Framework versions

  • PEFT 0.11.2.dev0
  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.15.0
  • Tokenizers 0.19.1
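
Since this is a PEFT adapter rather than a full model, it must be loaded on top of the beomi/gemma-ko-7b base model. Below is a minimal loading sketch; "<this-adapter-repo-or-path>" is a placeholder, as the card does not state the exact Hub repo id.

```python
# Minimal sketch for loading this adapter onto its base model.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "beomi/gemma-ko-7b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Placeholder: replace with the actual adapter repo id or local path.
model = PeftModel.from_pretrained(base_model, "<this-adapter-repo-or-path>")
```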
