Training
This model was trained on two datasets, both listed on this model page:
- Skylion007/openwebtext: 1,000,000 examples at a batch size of 32-4096 (1 epoch)
- Locutusque/TM-DATA: All examples at a batch size of 12288 (3 epochs)

Training took approximately 500 GPU hours on a single Titan V.
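To give a feel for what these figures imply, here is a rough back-of-the-envelope sketch. It assumes the batch size is counted in examples and that each batch corresponds to one optimizer step. The card does not state how the batch size moved between 32 and 4096, so only the two extremes are computed. The helper `steps_per_epoch` is a hypothetical name used for illustration, not part of the training code.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps needed to see every example once (assumes one step per batch)."""
    return math.ceil(num_examples / batch_size)


# Figures taken from the training description.
OPENWEBTEXT_EXAMPLES = 1_000_000
MIN_BATCH, MAX_BATCH = 32, 4096

# Bounds on the step count for the openwebtext epoch; the actual
# batch-size schedule is not stated, so the true count lies in between.
most_steps = steps_per_epoch(OPENWEBTEXT_EXAMPLES, MIN_BATCH)
fewest_steps = steps_per_epoch(OPENWEBTEXT_EXAMPLES, MAX_BATCH)

# 500 GPU hours on a single GPU is 500 wall-clock hours.
GPU_HOURS = 500
NUM_GPUS = 1
wall_clock_days = GPU_HOURS / NUM_GPUS / 24

print(f"openwebtext steps: between {fewest_steps} and {most_steps}")
print(f"wall-clock time: about {wall_clock_days:.1f} days")
```

Under these assumptions the openwebtext epoch takes somewhere between 245 and 31,250 steps, and the whole run amounts to roughly three weeks of continuous training on one card.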
Metrics
You can look at the training metrics here: https://wandb.ai/locutusque/TinyMistral-V2/runs/g0rvw6wc
🔥 This model performed excellently on TruthfulQA, outperforming models more than 720x its size. These models include: mistralai/Mixtral-8x7B-v0.1, tiiuae/falcon-180B, berkeley-nest/Starling-LM-7B-alpha, upstage/SOLAR-10.7B-v1.0, and more. 🔥