danielpark committed
Commit 363cd2f
1 Parent(s): d754ecb

Update README.md

Files changed (1)
  1. README.md +83 -1
README.md CHANGED
@@ -27,6 +27,32 @@ KORANI is derived from GORANI, a project within llama2 that experiments with the
 <br>
 
 ## Template
+ Instruction model
+ ```
+ ### System:
+ {{ System prompt }}
+
+ ### User:
+ {{ New user input }}
+
+ ### Assistant:
+ {{ Assistant }}
+ ```
+
+ Chat model
+ ```
+ <s>[INST] <<SYS>>
+ {{ System prompt }}
+ <</SYS>>
+
+ {{ New user message }} [/INST]
+ ```
+
+ <details>
+ <summary> Templates </summary>
+
+
+ #### Instruct model template
 For safety, I used the default system message from Llama-2. But if a system message is specified in any dataset, I use that content.
 ```python
 ### System:
 
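As an aside (not part of the commit), the snippet below is a minimal Python sketch of how one sample could be rendered into the `### System / ### User / ### Assistant` instruction template shown above. The helper name and the dict keys (`system`, `user`, `assistant`) are assumptions for illustration, not the project's actual data schema.

```python
# Illustrative sketch only (not from this repo): render one example into the
# instruction template shown above. The dict keys are assumed field names.
def render_instruction(example: dict) -> str:
    parts = [
        "### System:",
        # Assumed default; the README states the Llama-2 default system message is used.
        example.get("system", "You are a helpful, respectful and honest assistant."),
        "",
        "### User:",
        example["user"],
        "",
        "### Assistant:",
        example.get("assistant", ""),
    ]
    return "\n".join(parts)

print(render_instruction({"user": "Introduce the KORANI project in one sentence."}))
```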
@@ -38,12 +64,24 @@ For safety, I used the default system message from Llama-2. But if a system mess
 ### Input:
 {{ Optional additional user input }}
 
- ### Response:
+ ### Assistant:
 {{ New assistant answer }}
 ```
 
 need to be converted to
 
+ ```
+ ### System:
+ {{ System prompt }}
+
+ ### User:
+ {{ New user input }}
+
+ ### Assistant:
+ {{ Assistant }}
+ ```
+
+ ## Chat model template
 
 ```
 # https://github.com/huggingface/blog/blob/main/llama2.md#how-to-prompt-llama-2
 
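The chat-model block above follows the Llama-2 prompt format described in the linked Hugging Face blog post. Below is a minimal sketch, assuming a single-turn conversation and a hypothetical helper name, of how such a prompt could be assembled:

```python
# Sketch only: assemble a single-turn Llama-2 chat prompt in the
# <s>[INST] <<SYS>> ... <</SYS>> ... [/INST] format referenced above.
# The helper name and example strings are illustrative assumptions.
def build_llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(build_llama2_chat_prompt(
    "You are a helpful, respectful and honest assistant.",
    "Explain what KORANI is in one sentence.",
))
```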
@@ -88,6 +126,50 @@ If the sentence does not fall within these categories, is safe and does not need
 """
 ```
 
+ The context dialogue presumably refers to the system prompt, and in this case we have N samples:
+
+ ```
+ SYSTEM: Act as if you were Napoleon who likes playing cricket.
+ USER: <user_msg_1>
+ ASSISTANT: <reply_1>
+ ```
+
+ ```
+ SYSTEM: Act as if you were Napoleon who likes playing cricket.
+ USER:
+ ASSISTANT:
+ USER:
+ ASSISTANT:
+ ```
+
+ ```
+ SYSTEM: Act as if you were Napoleon who likes playing cricket.
+ USER: <user_msg_1>
+ ASSISTANT: <reply_1>
+ USER: <user_msg_2>
+ ASSISTANT: <reply_2>
+ ...
+ USER: <user_msg_N>
+ ASSISTANT: <reply_N>
+ ```
+ For each sample S, at training time, we pass the context:
+
+ ```
+ SYSTEM: Act as if you were Napoleon who likes playing cricket.
+ USER: <user_msg_1>
+ ASSISTANT: <reply_1>
+ USER: <user_msg_2>
+ ASSISTANT: <reply_2>
+ ...
+ USER: <user_msg_S-1>
+ ASSISTANT: <reply_S-1>
+ USER: <user_msg_S>
+ ```
+ We then train on ASSISTANT: <reply_S>: the loss is zeroed out for all preceding tokens, and cross-entropy is computed between the ground truth (the actual <reply_S>) and the predicted tokens P(t_{>k} | t_0, ..., t_k), where k is the length of the context (see the sketch at the end of this page).
+
+ </details>
+
+
 ## Update
 - Since we cannot control resources, we will record the schedule retrospectively.
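Finally, a minimal sketch of the loss masking described in the collapsed Templates section above, assuming the usual PyTorch/Hugging Face convention that label -100 is ignored by the cross-entropy loss; the helper and the dummy token ids are illustrative, not code from this repository.

```python
# Sketch only: given already-tokenized context and final reply, build labels so
# the loss is zeroed out (-100) for every context token and cross-entropy is
# computed only on the tokens of <reply_S>. The -100 ignore index follows the
# PyTorch / Hugging Face convention.
def build_labels(context_ids: list[int], reply_ids: list[int]) -> tuple[list[int], list[int]]:
    input_ids = context_ids + reply_ids
    labels = [-100] * len(context_ids) + reply_ids
    return input_ids, labels

# Example with dummy token ids:
input_ids, labels = build_labels([11, 22, 33], [44, 55, 66, 77])
assert labels == [-100, -100, -100, 44, 55, 66, 77]
```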