Lichang-Chen committed
Commit
38eac53
1 Parent(s): 52c1c2f

Update README.md

Files changed (1)
  1. README.md +5 -7
README.md CHANGED
@@ -8,22 +8,20 @@ tags:
  - PPO
  ---

- <!-- Provide a quick summary of what the model is/does. -->
-
  ## Model Details
+ This is the official implementation of the ODIN-ppo-L230-7B model, a chat assistant trained by fine-tuning LLaMA on the Open-Assistant dataset via PPO.
+ L230 means that the output length on the LIMA test set is ~230; ODIN is the reward model used for training.

- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
+ ## Model Description

+ <!-- Provide a longer summary of what this model is. -->
  - **Developed by:** [Lichang-Chen](https://huggingface.co/Lichang-Chen) and [Chen Zhu](https://scholar.google.com/citations?hl=zh-CN&user=m-om5O8AAAAJ)
  - **Model type:** RLHF model.
  - **Language(s) (NLP):** English
  - **Finetuned from model:** [Vicuna-7b](https://huggingface.co/lmsys/vicuna-7b-v1.5)

- ### Model Sources [optional]
+ ### Model Sources

  <!-- Provide the basic links for the model. -->
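
Below is a minimal usage sketch for the model this card describes. It assumes the checkpoint is published under the repository id `Lichang-Chen/ODIN-ppo-L230-7B` and loads through the standard `transformers` causal-LM API; the repo id, the Vicuna-style prompt, and the generation settings are illustrative assumptions and are not part of the commit above.

```python
# Minimal usage sketch (illustrative; repo id and prompt format are assumptions,
# not taken from the commit above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Lichang-Chen/ODIN-ppo-L230-7B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",          # requires the `accelerate` package
)

# Vicuna-v1.5-style single-turn prompt, since the card lists Vicuna-7b as the base model.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: Explain RLHF in two sentences. ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
reply = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(reply)
```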