floleuerer/SausageLM-7b-Instruct-v0.01-dpo-qlora
PEFT
TensorBoard
Safetensors
HuggingFaceH4/ultrafeedback_binarized
mistral
alignment-handbook
Generated from Trainer
trl
dpo
4-bit precision
bitsandbytes
License: apache-2.0
Commit History
245644f  Training in progress, step 3200 (verified; floleuerer committed on Jan 15)
b77be72  Training in progress, step 2800 (verified; floleuerer committed on Jan 15)
d48020a  Training in progress, step 2400 (verified; floleuerer committed on Jan 15)
a959ac1  Training in progress, step 2000 (verified; floleuerer committed on Jan 15)
c556d33  Training in progress, step 1600 (verified; floleuerer committed on Jan 15)
cd94c48  Training in progress, step 1200 (verified; floleuerer committed on Jan 15)
a7fe1cd  Training in progress, step 800 (verified; floleuerer committed on Jan 15)
7057f8e  Training in progress, step 400 (verified; floleuerer committed on Jan 15)
bed5713  initial commit (verified; floleuerer committed on Jan 14)