Built with Axolotl

a71e4e8f-2e2f-40e0-b810-4a7598f27713

This model is a fine-tuned version of fxmarty/really-tiny-falcon-testing; the training dataset is not named in this card. It achieves the following results on the evaluation set:

  • Loss: 10.9664

Model description

More information needed

Intended uses & limitations

More information needed
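
As an illustration only (the card documents no intended task), loading this PEFT adapter on top of its base model might look like the sketch below. The adapter id comes from this repo; the prompt and generation settings are placeholder assumptions.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "fxmarty/really-tiny-falcon-testing"
adapter_id = "lesso08/a71e4e8f-2e2f-40e0-b810-4a7598f27713"

# Load the base model and tokenizer, then attach the fine-tuned adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)

# Hypothetical prompt; this base model is a tiny testing checkpoint,
# so outputs are not expected to be meaningful.
inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```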

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the code sketch after this list):

  • learning_rate: 0.000208
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (bitsandbytes 8-bit, OptimizerNames.ADAMW_BNB) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 452
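
For reference, a minimal sketch of the same configuration expressed as Transformers TrainingArguments. Axolotl builds its own arguments internally, so this is an approximation rather than the exact training invocation; output_dir is a placeholder.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",            # placeholder path, not from the card
    learning_rate=2.08e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # 4 x 2 = total train batch size of 8
    seed=42,
    optim="adamw_bnb_8bit",          # OptimizerNames.ADAMW_BNB
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=452,
)
```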

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| No log        | 0.0022 | 1    | 11.0913         |
| 22.0457       | 0.1107 | 50   | 11.0146         |
| 21.9726       | 0.2215 | 100  | 10.9945         |
| 21.9355       | 0.3322 | 150  | 10.9843         |
| 21.9687       | 0.4430 | 200  | 10.9776         |
| 21.9515       | 0.5537 | 250  | 10.9733         |
| 21.9139       | 0.6645 | 300  | 10.9691         |
| 21.9329       | 0.7752 | 350  | 10.9677         |
| 21.9169       | 0.8859 | 400  | 10.9674         |
| 21.9063       | 0.9967 | 450  | 10.9664         |

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1