bhavinjawade committed on
Commit
65efbe3
1 Parent(s): f71d0bd

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -7,7 +7,8 @@ datasets:
7
  ## SOLAR-10B-OrcaDPO-Jawade
8
 
9
  ### Overview
10
- This model is an instruction-finetuned version of the `upstage/SOLAR-10.7B-Instruct-v1.0` model, trained on the Intel DPO Orca dataset using LoRA.
 
11
 
12
  ## How to Use This Model
13
 
 
7
  ## SOLAR-10B-OrcaDPO-Jawade
8
 
9
  ### Overview
10
+ This model is an instruction-finetuned version of the `upstage/SOLAR-10.7B-Instruct-v1.0` model, trained on the Intel DPO Orca dataset using LoRA. Note, however, that the SOLAR-10.7B paper states that the
11
+ original model was already trained on the Intel Orca DPO pairs for alignment. Retraining with DPO and LoRA shows a slight (<1%) improvement on the OpenLLM Leaderboard benchmarks over `SOLAR 10.7B-Instruct` and a significant improvement over `SOLAR 10.7B`.
12
 
13
  ## How to Use This Model
14