Commit 3dfe921
Parent(s): 99e005c
Update README.md

README.md CHANGED
@@ -10,9 +10,9 @@ pipeline_tag: conversational
 It is a chat Large Language model finetuned with pretrained [Falcon-1B model](https://huggingface.co/tiiuae/falcon-rw-1b)
 and trained on [chat-bot-instructions prompts dataset](https://huggingface.co/datasets/ayoolaolafenwa/sft-data).
 ChatLM was trained on a dataset containing normal day to day human conversations, due to limited data used in training
-it
+it does not generalize well for tasks like coding and current affairs.
 
-## Load Model in
+## Load Model in bfloat16
 ``` python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
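The retitled heading above advertises loading the model in bfloat16. As an illustration of why that matters (separate from the README's own snippet, which needs the actual Falcon checkpoint), this sketch uses a hypothetical stand-in `nn.Linear` to show that casting to `torch.bfloat16` halves parameter memory relative to float32:

``` python
import copy

import torch
from torch import nn

# Hypothetical stand-in for the Falcon weights; the layer size is arbitrary,
# not taken from the real model.
layer_fp32 = nn.Linear(1024, 1024)

# Cast a copy to bfloat16, as torch_dtype=torch.bfloat16 would do at load time.
# (deepcopy first: Module.to() converts the module in place.)
layer_bf16 = copy.deepcopy(layer_fp32).to(torch.bfloat16)

def param_bytes(module: nn.Module) -> int:
    # Sum numel * bytes-per-element over all parameters
    return sum(p.numel() * p.element_size() for p in module.parameters())

print(param_bytes(layer_fp32) // param_bytes(layer_bf16))  # 2
```

bfloat16 keeps float32's exponent range (unlike float16), which is why it is a common choice for inference on recent GPUs.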
@@ -73,10 +73,12 @@ output_text = output_text.replace("<|endoftext|>", "")
 
 print(output_text)
 ```
-
+# Training procedure for Supervised Finetuning
+
+## Dataset Preparation
 
 Chatbot Instructions prompts dataset from https://huggingface.co/datasets/alespalla/chatbot_instruction_prompts/viewer/alespalla--chatbot_instruction_prompts
-was processed into a supervised finetuning for training a user prompt and corresponding response.
+was processed into a supervised finetuning format for training a user prompt and a corresponding response.
 
 ##### Download Data
 ``` python
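The hunk's context line shows the README stripping Falcon's `<|endoftext|>` token before printing. A minimal sketch of that post-processing on a made-up raw generation (the sample text and the split on the `<chatbot>` tag are illustrative assumptions, not the README's exact code):

``` python
# Hypothetical raw generation: the model echoes the tagged prompt, then the
# tagged response, then Falcon's end-of-text token.
raw = ("<user>: What is photosynthesis?\n"
       "<chatbot>: It is how plants make food from light.<|endoftext|>")

# Strip the end-of-text marker, as in output_text.replace("<|endoftext|>", "")
text = raw.replace("<|endoftext|>", "")

# Keep only the chatbot's reply by splitting on its tag
response = text.split("<chatbot>:", 1)[1].strip()
print(response)  # It is how plants make food from light.
```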
@@ -110,7 +112,7 @@ for i in range(len(text_data)):
 
     # Add the message to the prompts list with <user> tag
     prompts.append("<user>: " + prompt)
-
+
     # Add the message to the responses list with <chatbot> tag
     responses.append("<chatbot>: " + response)
 
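The loop in this hunk tags each prompt/response pair for supervised finetuning. A self-contained toy version, with made-up rows standing in for the downloaded chatbot_instruction_prompts split, behaves like this:

``` python
# Toy stand-in for text_data; the real rows come from the downloaded dataset.
text_data = [
    {"prompt": "Name a primary color.", "response": "Red is a primary color."},
    {"prompt": "What is 2 + 2?", "response": "2 + 2 equals 4."},
]

prompts, responses = [], []
for row in text_data:
    # Add the message to the prompts list with <user> tag
    prompts.append("<user>: " + row["prompt"])
    # Add the message to the responses list with <chatbot> tag
    responses.append("<chatbot>: " + row["response"])

print(prompts[0])    # <user>: Name a primary color.
print(responses[1])  # <chatbot>: 2 + 2 equals 4.
```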
@@ -121,8 +123,12 @@ new_data = pd.DataFrame({"prompt": prompts, "response": responses})
 # Write the new dataframe to a csv file
 new_data.to_csv("MyData/chatbot_instruction_prompts_train.csv", index=False)
 ```
-
+The user's prompts in the dataset are appended with the tag <user> and the corresponding responses with the tag <chatbot>.
 Check the the modified dataset https://huggingface.co/datasets/ayoolaolafenwa/sft-data .
 
-
-
+### Training
+
+ChatLM was supervised finetuned with pretrained [Falcon 1-Billion parameters model](https://huggingface.co/tiiuae/falcon-rw-1b) trained on 350-Billion tokens
+of RefinedWeb. It was trained with a single H100 GPU for 1 epoch. Check the full code for supervised finetune
+training on its github repository https://github.com/ayoolaolafenwa/ChatLM/tree/main
+
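The hunk's context builds `new_data = pd.DataFrame({"prompt": prompts, "response": responses})` and writes it to csv. A round-trip sketch of that packaging step, using toy tagged lists and a temporary directory in place of the README's `MyData/` folder:

``` python
import tempfile
from pathlib import Path

import pandas as pd

# Toy tagged lists standing in for the full processed dataset
prompts = ["<user>: Name a primary color."]
responses = ["<chatbot>: Red is a primary color."]

new_data = pd.DataFrame({"prompt": prompts, "response": responses})

# Write the new dataframe to a csv file (temporary dir, not MyData/)
out = Path(tempfile.mkdtemp()) / "chatbot_instruction_prompts_train.csv"
new_data.to_csv(out, index=False)

# Round-trip check: the csv preserves the tagged columns
back = pd.read_csv(out)
print(back.loc[0, "prompt"])  # <user>: Name a primary color.
```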