---
base_model: ybelkada/falcon-7b-sharded-bf16
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: falcon-7b-sharded-bf16-finetuned-mental-health-hf-plus-dsm5mistral
    results: []
---

falcon-7b-sharded-bf16-finetuned-mental-health-hf-plus-dsm5mistral

This model is a fine-tuned version of ybelkada/falcon-7b-sharded-bf16; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 1.6730
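
This repository holds a PEFT (LoRA-style) adapter rather than full model weights, so the base model is loaded first and the adapter is applied on top. The sketch below is illustrative only: the hub id is inferred from this card, and the prompt format and 4-bit loading are assumptions, not something the card specifies.

```python
# Minimal inference sketch. Assumptions: the adapter lives at the hub id below,
# and 4-bit loading is used only to fit the 7B base model in limited GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "ybelkada/falcon-7b-sharded-bf16"
adapter_id = "langtest/falcon-7b-sharded-bf16-finetuned-mental-health-hf-plus-dsm5mistral"  # adjust if the hub id differs

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# The prompt template is an assumption; the card does not document one.
prompt = "### Question: How can I manage feelings of anxiety?\n### Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```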

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 200
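
As a rough guide, these settings map onto the following transformers/peft configuration. This is a hedged reconstruction, not the original training script; in particular, the LoRA hyperparameters shown are placeholders, since the card does not record them.

```python
# Hedged reconstruction of the run configuration from the values listed above.
from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="falcon-7b-sharded-bf16-finetuned-mental-health-hf-plus-dsm5mistral",
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # effective batch size: 16 * 4 = 64
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_steps=200,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer setup.
)

# Placeholder adapter config -- these values are NOT from the card.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```

Per the trl and sft tags, training was presumably driven by trl's SFTTrainer with a configuration along these lines.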

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.6574        | 0.1003 | 10   | 1.7478          |
| 1.4519        | 0.2005 | 20   | 1.7755          |
| 1.5823        | 0.3008 | 30   | 1.7614          |
| 1.5633        | 0.4010 | 40   | 1.7620          |
| 1.3816        | 0.5013 | 50   | 2.0733          |
| 1.7500        | 0.6015 | 60   | 1.7332          |
| 1.3408        | 0.7018 | 70   | 1.7491          |
| 1.6436        | 0.8020 | 80   | 1.7197          |
| 1.4390        | 0.9023 | 90   | 1.7320          |
| 1.3755        | 1.0025 | 100  | 1.7057          |
| 1.5751        | 1.1028 | 110  | 1.7190          |
| 1.1649        | 1.2030 | 120  | 1.7603          |
| 1.4810        | 1.3033 | 130  | 1.6892          |
| 1.2769        | 1.4035 | 140  | 1.6977          |
| 1.2564        | 1.5038 | 150  | 1.7133          |
| 1.4486        | 1.6040 | 160  | 1.6711          |
| 1.0868        | 1.7043 | 170  | 1.6710          |
| 1.4892        | 1.8045 | 180  | 1.6721          |
| 1.1903        | 1.9048 | 190  | 1.6727          |
| 1.1577        | 2.0050 | 200  | 1.6730          |

Framework versions

  • PEFT 0.13.1.dev0
  • Transformers 4.45.1
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0