Mixtral-8x7b-v0.1-dpo / README.md

eelxpeng

Create README.md

9eb8bc4 verified 9 months ago

metadata

license: apache-2.0
datasets:
  - HuggingFaceH4/ultrafeedback_binarized
language:
  - en

Introduction

This model, vistagi/Mixtral-8x7b-v0.1-sft, was trained on the Ultrachat-200K dataset through supervised fine-tuning, using Mixtral-8x7b-v0.1 as the base model. Training was done in bfloat16 precision using LoRA.
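To give a rough sense of why LoRA keeps fine-tuning cheap, the sketch below counts the trainable parameters LoRA adds to a single weight matrix and compares that with updating the full matrix. The layer size and rank are illustrative assumptions only; this README does not state the LoRA rank or the target modules that were actually used.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the full (frozen) weight matrix W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical values for illustration; not taken from this model's training config.
d_in = 4096
d_out = 4096
rank = 16

lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
fraction = lora / full

print(f"LoRA trainable parameters: {lora}")
print(f"Full matrix parameters:    {full}")
print(f"Trainable fraction:        {fraction:.4%}")
```

Under these assumed values, the LoRA factors hold under 1% of the parameters of the matrix they adapt, which is why the base weights can stay frozen in bfloat16 while only the small adapter is trained.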

Details

Libraries Used

torch
deepspeed
pytorch lightning
transformers
peft