metadata
license: apache-2.0
base_model: Qwen/Qwen2-1.5B
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: Qwen2-1.5B-stepbasin-books-vN
    results: []


Qwen2-1.5B-stepbasin-books-vN

This model is a fine-tuned version of Qwen/Qwen2-1.5B on the BEE-spoke-data/stepbasin-books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.8110
  • Accuracy: 0.4298
  • Num Input Tokens Seen: 44040192
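The evaluation loss can be converted to perplexity by exponentiation. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language-model evaluation but is not stated in this card.

```python
import math

# Evaluation loss reported above.
# Assumption: mean cross-entropy in nats per token.
eval_loss = 2.8110

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the perplexity is roughly 16.6. Lower is better, and it is only comparable across models that share the same tokenizer and evaluation split.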

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 80085
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 3.0
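As a rough illustration of how these settings interact, the sketch below derives the effective batch size and a warmup-plus-cosine learning-rate curve. It is not the training code: the single-device setup (`world_size = 1`), the total of 84 optimizer steps (the last step reported in the training results), rounding the warmup up to whole steps, and the helper `lr_at` are all assumptions made for the example.

```python
import math

# Values taken from the hyperparameter list above.
learning_rate = 3e-5
train_batch_size = 1
gradient_accumulation_steps = 32
warmup_ratio = 0.05

# Assumption: total optimizer steps equals the last step in the results table.
total_steps = 84

# Assumption: a single device, so no data-parallel multiplier.
world_size = 1
effective_batch = train_batch_size * gradient_accumulation_steps * world_size

# Assumption: warmup length is the ratio times total steps, rounded up.
warmup_steps = math.ceil(warmup_ratio * total_steps)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:3d}: lr = {lr_at(step):.2e}")
```

The effective batch size of 32 matches the reported `total_train_batch_size`. With only 84 steps in total, the warmup covers just the first few updates and most of training runs on the cosine decay.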

Training results

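| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Input Tokens Seen |
|---------------|--------|------|-----------------|----------|-------------------|
| 2.7792        | 0.9967 | 28   | 2.8183          | 0.4287   | 14729216          |
| 2.6971        | 1.9933 | 56   | 2.8112          | 0.4297   | 29458432          |
| 2.7116        | 2.9900 | 84   | 2.8110          | 0.4298   | 44040192          |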
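The token counts in the results above give a back-of-the-envelope estimate of throughput per optimizer step. The sketch below uses only the reported numbers. Reading the per-sequence figure as a context length is an assumption (it presumes every sequence was packed or padded to the same length), since the card does not state one.

```python
# (step, input tokens seen) pairs from the training results.
checkpoints = [(28, 14729216), (56, 29458432), (84, 44040192)]

# Reported in the hyperparameter list.
total_train_batch_size = 32

final_step, final_tokens = checkpoints[-1]

# Average tokens consumed per optimizer step over the whole run.
tokens_per_step = final_tokens / final_step

# Assumption: all sequences have equal length, so this is tokens per sequence.
tokens_per_sequence = tokens_per_step / total_train_batch_size

print(f"tokens per optimizer step: {tokens_per_step:.0f}")
print(f"tokens per sequence (assumed uniform): {tokens_per_sequence:.0f}")
```

The result is 524288 tokens per step and 16384 tokens per sequence. That is consistent with, but does not confirm, training on fixed-length sequences of 16384 tokens.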

Framework versions

  • Transformers 4.42.4
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1