danielpark committed
Commit 363cd2f · 1 Parent(s): d754ecb · Update README.md
README.md CHANGED
<br>

## Template

Instruction model
```
### System:
{{ System prompt }}

### User:
{{ New user input }}

### Assistant:
{{ Assistant }}
```

Chat model
```
<s>[INST] <<SYS>>
{{ System prompt }}
<</SYS>>

{{ New user message }} [/INST]
```
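
To make the chat template concrete, here is a minimal sketch (my illustration, not this repo's code) that assembles a multi-turn history into the Llama-2 chat format shown above; the function and variable names are assumptions:

```python
# Minimal sketch of assembling the Llama-2 chat format shown above.
# Not this repo's preprocessing code; names are illustrative.
def build_chat_prompt(system: str, turns: list[tuple[str, str | None]]) -> str:
    """turns is a list of (user_msg, assistant_reply); the last reply may be None."""
    prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    for i, (user, reply) in enumerate(turns):
        if i > 0:
            prompt += f"<s>[INST] {user} [/INST]"   # later turns open a new [INST] block
        else:
            prompt += f"{user} [/INST]"             # first user turn follows the system block
        if reply is not None:
            prompt += f" {reply} </s>"              # completed turns are closed with </s>
    return prompt

# Single-turn call: reproduces the template block above.
print(build_chat_prompt("{{ System prompt }}", [("{{ New user message }}", None)]))
```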

<details>
<summary> Templates </summary>

#### Instruct model template
For safety, I used the default system message from Llama-2, but if a system message is specified in a dataset, I use that content.
```python
### System:
[...]

### Input:
{{ Optional additional user input }}

### Assistant:
{{ New assistant answer }}
```

needs to be converted to

```
### System:
{{ System prompt }}

### User:
{{ New user input }}

### Assistant:
{{ Assistant }}
```
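
As a sketch of that conversion, assuming an Alpaca-style record with `instruction`, `input`, and `output` keys (the schema and the default message text are my assumptions, not confirmed by this repo):

```python
# Hypothetical sketch of the conversion described above; the record layout
# (instruction/input/output) is an assumption, not this repo's actual schema.
DEFAULT_SYSTEM = "You are a helpful, respectful and honest assistant."  # truncated stand-in for the Llama-2 default

def to_instruct_template(record: dict) -> str:
    system = record.get("system") or DEFAULT_SYSTEM  # fall back to the default system message
    user = record["instruction"]
    if record.get("input"):                          # fold the optional input into the user turn
        user += "\n" + record["input"]
    return (f"### System:\n{system}\n\n"
            f"### User:\n{user}\n\n"
            f"### Assistant:\n{record['output']}")

sample = {"instruction": "Translate to Korean.", "input": "Hello", "output": "안녕하세요"}
print(to_instruct_template(sample))
```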

#### Chat model template

```
# https://github.com/huggingface/blog/blob/main/llama2.md#how-to-prompt-llama-2
[...]
"""
```

The context-dialogue refers to the system prompt, I suppose, and in this case we have N samples:

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
```

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER:
ASSISTANT:
USER:
ASSISTANT:
```

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
...
USER: <user_msg_N>
ASSISTANT: <reply_N>
```
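
As an illustration of the sample layout above (my sketch, not the project's loader), one N-turn dialogue expands into N (context, target) pairs, one per assistant reply:

```python
# Illustrative sketch: expand one N-turn dialogue into N (context, target)
# training pairs matching the listings above. Not this repo's actual loader.
def expand_dialogue(system: str, turns: list[tuple[str, str]]) -> list[tuple[str, str]]:
    samples = []
    for s in range(len(turns)):
        lines = [f"SYSTEM: {system}"]
        for user, reply in turns[:s]:               # all earlier turns, in full
            lines += [f"USER: {user}", f"ASSISTANT: {reply}"]
        lines.append(f"USER: {turns[s][0]}")        # the current user message
        context = "\n".join(lines)
        samples.append((context, f"ASSISTANT: {turns[s][1]}"))  # train on this reply
    return samples

pairs = expand_dialogue("Act as if you were Napoleon, who likes playing cricket.",
                        [("<user_msg_1>", "<reply_1>"), ("<user_msg_2>", "<reply_2>")])
print(pairs[1][0])   # context ends with "USER: <user_msg_2>"
```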

For each sample S, at training time, we pass the context:

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
...
USER: <user_msg_S-1>
ASSISTANT: <reply_S-1>
USER: <user_msg_S>
```

And we train only on the `ASSISTANT: <reply_S>` tokens: the loss is zeroed out for all tokens before them, then we compute the cross-entropy between the ground truth (the actual `<reply_S>`) and the predicted tokens P(t>k | t0, ..., tk), where k is the length of the context.
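
A minimal sketch of that loss masking, using the Hugging Face/PyTorch convention that label `-100` is ignored by cross-entropy (the helper is my illustration, not this repo's trainer):

```python
import torch

# Sketch of the masking described above: context tokens get label -100
# (ignored by PyTorch's CrossEntropyLoss), so only <reply_S> contributes loss.
def make_inputs_and_labels(context_ids: list[int], reply_ids: list[int]):
    input_ids = torch.tensor(context_ids + reply_ids)
    labels = input_ids.clone()
    labels[: len(context_ids)] = -100   # zero out the loss on the context
    return input_ids, labels

input_ids, labels = make_inputs_and_labels([1, 2, 3], [4, 5])
print(labels)   # tensor([-100, -100, -100, 4, 5])
```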

</details>

## Update
- Since we cannot control resources, we will record the schedule retrospectively.