Update README.md
README.md
CHANGED
@@ -166,7 +166,7 @@ Dataset: [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets
 <br><br>
 
 ## Training Procedure
-Both Pre-Train and Finetuning used [our fork](https://github.com/Pints-AI/1.5-Pints) of the [LitGPT Framework](https://github.com/Lightning-AI/litgpt). For DPO, we used the methods set out in [The Alignment Handbook](https://github.com/huggingface/alignment-handbook/blob/main/scripts/run_dpo.py). More details can be found in our [paper](
+Both Pre-Train and Finetuning used [our fork](https://github.com/Pints-AI/1.5-Pints) of the [LitGPT Framework](https://github.com/Lightning-AI/litgpt). For DPO, we used the methods set out in [The Alignment Handbook](https://github.com/huggingface/alignment-handbook/blob/main/scripts/run_dpo.py). More details can be found in our [paper](https://arxiv.org/abs/2408.03506).
 
 ## Training Hyperparameters
 **Pre-Train**<br>