Training Details

Training Data

Training Hyperparameters

  • model: phi3
  • config: tv2o-medium
  • max-len: 2048
  • lr: 1e-4 initially, then 2e-5
  • weight-decay: 0.01
  • batch-size-per-gpu: 8
  • GPUs: 2× A100 80GB
  • precision: bfloat16
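
As a rough sketch of what the settings above imply per optimizer step, the snippet below computes the effective batch size and the upper bound on tokens per step. The `TrainConfig` class is hypothetical and not part of the training code. It assumes no gradient accumulation (the card does not state it) and that max-len is measured in tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container; values are copied from the list above."""

    max_len: int = 2048
    batch_size_per_gpu: int = 8
    num_gpus: int = 2
    weight_decay: float = 0.01
    # Assumption: no gradient accumulation, since the card does not mention any.
    grad_accum_steps: int = 1

    @property
    def effective_batch_size(self) -> int:
        # Sequences consumed per optimizer step across all GPUs.
        return self.batch_size_per_gpu * self.num_gpus * self.grad_accum_steps

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound, reached only if every sequence is max_len tokens long.
        return self.effective_batch_size * self.max_len


cfg = TrainConfig()
print(cfg.effective_batch_size)  # 16
print(cfg.max_tokens_per_step)   # 32768
```

If the run did use gradient accumulation, both numbers scale by that factor.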

Loss

  • validation loss: 0.2740
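
One way to read the validation loss, assuming it is a mean per-token cross-entropy in nats (the card does not say which loss is reported), is to convert it to perplexity. The `perplexity` helper below is illustrative only.

```python
import math


def perplexity(mean_nll_nats: float) -> float:
    # Perplexity is the exponential of the mean negative log-likelihood.
    # Assumption: the loss is a per-token cross-entropy in nats.
    return math.exp(mean_nll_nats)


val_loss = 0.2740
print(round(perplexity(val_loss), 4))  # 1.3152
```

Under a different loss definition, for example bits instead of nats or a sum instead of a mean, this conversion would not apply.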

Files

  • model.ckpt: latest model checkpoint
