InferenceIllusionist committed on
Commit: aa03379 (1 parent: 7cc57ac)

Adding previous model scores for comparison


Also slight clean-up and clarification in methodology language

Files changed (1): README.md +7 -2
README.md CHANGED
```diff
@@ -125,9 +125,14 @@ An initial foray into the world of fine-tuning. The goal of this release was to
 
 ## Notes & Methodology
 * [Excalibur-7b](https://huggingface.co/InferenceIllusionist/Excalibur-7b) fine-tuned with Direct Preference Optimization (DPO) using Intel/orca_dpo_pairs
-* This is a quick experiment to determine the impact of DPO finetuning on the original base model
+* This is a quick experiment to determine the impact of DPO finetuning on the Excelsior-7b base model
 * Ran for a little over an hour on a single A100
-* Internal benchmarks showed improvement over base model, awaiting final results
+* Fine-tuning succeeded in making model conversational and more well-rounded
+* Benchmark scores increased in the following categories versus base Excelsior-7b:
+  * ARC: 69.71 -> <b>70.9</b>
+  * HellaSwag: 87.56 -> <b>87.93</b>
+  * TruthfulQA: 67.24 -> <b>70.82</b>
+  * Average: 73.6 -> <b>73.84</b>
 * Precision: bfloat16
 
 
```
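The per-benchmark improvements in the diff can be double-checked with a quick sketch; the scores below are copied verbatim from the added lines (note the reported Average covers the full benchmark suite, not just the three categories listed):

```python
# Base -> DPO-tuned scores as reported in the README diff above.
scores = {
    "ARC": (69.71, 70.9),
    "HellaSwag": (87.56, 87.93),
    "TruthfulQA": (67.24, 70.82),
    "Average": (73.6, 73.84),
}

# Print each category with its delta, rounded to two decimals.
for name, (base, tuned) in scores.items():
    delta = round(tuned - base, 2)
    print(f"{name}: {base} -> {tuned} (+{delta})")
```

Running this shows TruthfulQA as the largest single-category gain (+3.58), consistent with the diff's claim that the DPO pass made the model more well-rounded.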