yuxin committed on
Commit
dcf8c54
1 Parent(s): 42c8b2c

6fd424449d2a16ad785e7289c4a376277b3c94c6a13af7c05c8c78bd00507d2f

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -33,6 +33,7 @@ with torch.no_grad():
 # reward: 0.76
 ```
 模型可以较为准确地判断文本重复,异常中断和不符合指令要求等低质量模型生成结果,并给出较低的奖励值。
+
 The model can more accurately determine low quality model generation results such as text repetition, interruptions and failure to meet instruction requirements, and give lower reward values.
 
 ```python
@@ -52,8 +53,11 @@ with torch.no_grad():
 print(reward.tolist())
 #reward: [0.76, -1.36, -2.99, -1.82]
 ```
+
 模型能够对比对同一指令的不同生成结果,并根据质量给出奖励值。
+
 The model is able to compare different generation results for the same instruction and give reward values based on quality.
+
 ```python
 prefix_user = "Human:"
 prefix_bot = "\n\nAssistant:"