danielpark committed
Commit 363cd2f · 1 Parent(s): d754ecb · Update README.md
README.md CHANGED
<br>

## Template

Instruction model
```
### System:
{{ System prompt }}

### User:
{{ New user input }}

### Assistant:
{{ Assistant }}
```

Chat model
```
<s>[INST] <<SYS>>
{{ System prompt }}
<</SYS>>

{{ New user message }} [/INST]
```
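
To make the chat template concrete, here is a minimal sketch (my illustration, not this repo's code) that assembles a multi-turn history into the Llama-2 chat format shown above; the function and variable names are assumptions:

```python
# Minimal sketch of assembling the Llama-2 chat format shown above.
# Not this repo's preprocessing code; names are illustrative.
def build_chat_prompt(system: str, turns: list[tuple[str, str | None]]) -> str:
    """turns is a list of (user_msg, assistant_reply); the last reply may be None."""
    prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    for i, (user, reply) in enumerate(turns):
        if i > 0:
            prompt += f"<s>[INST] {user} [/INST]"   # later turns open a new [INST] block
        else:
            prompt += f"{user} [/INST]"             # first user turn follows the system block
        if reply is not None:
            prompt += f" {reply} </s>"              # completed turns are closed with </s>
    return prompt

# Single-turn call: reproduces the template block above.
print(build_chat_prompt("{{ System prompt }}", [("{{ New user message }}", None)]))
```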

<details>
<summary> Templates </summary>

#### Instruct model template
For safety, I used the default system message from Llama-2, but if a system message is specified in a dataset, I use that content.
```python
### System:
[...]

### Input:
{{ Optional additional user input }}

### Assistant:
{{ New assistant answer }}
```

needs to be converted to

```
### System:
{{ System prompt }}

### User:
{{ New user input }}

### Assistant:
{{ Assistant }}
```
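
As a sketch of that conversion, assuming an Alpaca-style record with `instruction`, `input`, and `output` keys (the schema and the default message text are my assumptions, not confirmed by this repo):

```python
# Hypothetical sketch of the conversion described above; the record layout
# (instruction/input/output) is an assumption, not this repo's actual schema.
DEFAULT_SYSTEM = "You are a helpful, respectful and honest assistant."  # truncated stand-in for the Llama-2 default

def to_instruct_template(record: dict) -> str:
    system = record.get("system") or DEFAULT_SYSTEM  # fall back to the default system message
    user = record["instruction"]
    if record.get("input"):                          # fold the optional input into the user turn
        user += "\n" + record["input"]
    return (f"### System:\n{system}\n\n"
            f"### User:\n{user}\n\n"
            f"### Assistant:\n{record['output']}")

sample = {"instruction": "Translate to Korean.", "input": "Hello", "output": "안녕하세요"}
print(to_instruct_template(sample))
```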

#### Chat model template

```
# https://github.com/huggingface/blog/blob/main/llama2.md#how-to-prompt-llama-2
[...]
"""
```

The context-dialogue refers to the system prompt, I suppose, and in this case we have N samples:

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
```

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER:
ASSISTANT:
USER:
ASSISTANT:
```

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
...
USER: <user_msg_N>
ASSISTANT: <reply_N>
```
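
As an illustration of the sample layout above (my sketch, not the project's loader), one N-turn dialogue expands into N (context, target) pairs, one per assistant reply:

```python
# Illustrative sketch: expand one N-turn dialogue into N (context, target)
# training pairs matching the listings above. Not this repo's actual loader.
def expand_dialogue(system: str, turns: list[tuple[str, str]]) -> list[tuple[str, str]]:
    samples = []
    for s in range(len(turns)):
        lines = [f"SYSTEM: {system}"]
        for user, reply in turns[:s]:               # all earlier turns, in full
            lines += [f"USER: {user}", f"ASSISTANT: {reply}"]
        lines.append(f"USER: {turns[s][0]}")        # the current user message
        context = "\n".join(lines)
        samples.append((context, f"ASSISTANT: {turns[s][1]}"))  # train on this reply
    return samples

pairs = expand_dialogue("Act as if you were Napoleon, who likes playing cricket.",
                        [("<user_msg_1>", "<reply_1>"), ("<user_msg_2>", "<reply_2>")])
print(pairs[1][0])   # context ends with "USER: <user_msg_2>"
```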

For each sample S, at training time, we pass the context:

```
SYSTEM: Act as if you were Napoleon, who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
...
USER: <user_msg_S-1>
ASSISTANT: <reply_S-1>
USER: <user_msg_S>
```

And we train only on the `ASSISTANT: <reply_S>` tokens: the loss is zeroed out for all tokens before them, then we compute the cross-entropy between the ground truth (the actual `<reply_S>`) and the predicted tokens P(t>k | t0, ..., tk), where k is the length of the context.
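
A minimal sketch of that loss masking, using the Hugging Face/PyTorch convention that label `-100` is ignored by cross-entropy (the helper is my illustration, not this repo's trainer):

```python
import torch

# Sketch of the masking described above: context tokens get label -100
# (ignored by PyTorch's CrossEntropyLoss), so only <reply_S> contributes loss.
def make_inputs_and_labels(context_ids: list[int], reply_ids: list[int]):
    input_ids = torch.tensor(context_ids + reply_ids)
    labels = input_ids.clone()
    labels[: len(context_ids)] = -100   # zero out the loss on the context
    return input_ids, labels

input_ids, labels = make_inputs_and_labels([1, 2, 3], [4, 5])
print(labels)   # tensor([-100, -100, -100, 4, 5])
```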

</details>

## Update
- Since we cannot control resources, we will record the schedule retrospectively.