Ejafa committed on
Commit
f0d989e
1 Parent(s): 056e214

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -19,6 +19,14 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->
 
+ ## Description
+ This model was trained as part of the Reinforcement Learning - 24 project at Peking University, focusing on [dpo].
+
+ ## Authors
+ - Ejafa Bassam
+ - Yaroslav Ponomarenko
+
+
  # phi-3-mini-128k-instruct-dpo-lr-5e-07
 
  This model is a fine-tuned version of [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) on the princeton-nlp/llama3-ultrafeedback dataset.