# Training

This is a 10k-step English supervised fine-tuning (SFT) model of GPT-J, trained on the SODA dataset for the Chai Competition.
4
+
5
+ - **Language:** English
6
+ - **Finetuned from:** [EleutherAI / GPT-J](https://huggingface.co/EleutherAI/gpt-j-6b)
7
+ - **Code:** [Open-Assistant/model/model_training](https://github.com/LAION-AI/Open-Assistant/tree/main/model/model_training)
8
+ - **Dataset:** 10 percent from [SODA dataset](https://huggingface.co/datasets/allenai/soda)
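
For a quick look at what a 10% slice of SODA contains, here is a minimal, illustrative sketch using the `datasets` library. This is only for inspection; the training pipeline does its own subsampling via the `fraction` setting shown below.

```python
from datasets import load_dataset

# Minimal sketch: pull the first 10% of SODA's training split for inspection.
# This mirrors the `fraction: 0.1` setting in the training config below,
# though the Open-Assistant loader handles subsampling itself.
soda = load_dataset("allenai/soda", split="train[:10%]")
print(soda)                  # number of rows and column names
print(soda[0]["dialogue"])   # one multi-turn conversation
```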

# Why the OpenAssistant framework:

- Easy to set up training: changing the dataset and model in the config is all you need
- Data processing is already implemented for most popular conversation datasets: SODA, Vicuna, OpenAssistant, ...

# Configuration:

Add the following sections to the default config file `configs/config.yaml`.

The dataset section:

```yaml
soda-only:
  datasets:
    - soda:
        fraction: 0.1
        input_max_length: 1024
```
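
Here `fraction: 0.1` tells the data loader to subsample 10% of SODA, and `input_max_length: 1024` caps the tokenized length of each training example (field meanings per the Open-Assistant data-config conventions).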

The model/training section:

```yaml
gptj-chai:
  dtype: fp16
  log_dir: gptj-soda
  model_name: EleutherAI/gpt-j-6b
  output_dir: output/gptj-soda-chai
  max_length: 1024
  warmup_steps: 100
  gradient_checkpointing: true
  gradient_accumulation_steps: 1
  per_device_train_batch_size: 8
  per_device_eval_batch_size: 8
  eval_steps: 5000
  save_steps: 5000
  num_train_epochs: 1
  save_total_limit: 1
  use_flash_attention: false
```
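
The named sections are combined at launch time by listing them after `--configs`. As a rough mental model (a simplified sketch, not Open-Assistant's actual config loader), the merge looks like this:

```python
import yaml

# Simplified sketch of config merging, NOT the actual Open-Assistant loader:
# each named section from configs/config.yaml is applied left to right,
# so later sections override keys set by earlier ones.
def merge_configs(path, section_names):
    with open(path) as f:
        sections = yaml.safe_load(f)
    merged = {}
    for name in section_names:
        merged.update(sections[name])
    return merged

cfg = merge_configs("configs/config.yaml", ["defaults", "gptj-chai", "soda-only"])
print(cfg["model_name"], cfg["datasets"])
```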

# Command to train:

```bash
deepspeed trainer_sft.py --local_rank=0 --configs defaults gptj-chai soda-only --cache_dir data_cache --deepspeed
```
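
Here `--configs defaults gptj-chai soda-only` selects the named sections of `configs/config.yaml` (merged roughly as sketched above), `--cache_dir` sets where processed datasets are cached, and `--deepspeed` enables the DeepSpeed integration.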

# Demo code:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM


class ChatBot:
    def __init__(self, path="/mnt/hdd/duyphung/gptj-soda-chai/checkpoint-10000/"):
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        # Load in fp16 on the GPU for inference.
        self.model = AutoModelForCausalLM.from_pretrained(path).half().cuda().eval()
        # GPT-J has no pad token; reuse the EOS token for padding.
        self.model.config.pad_token_id = self.tokenizer.eos_token_id
        self.tokenizer.pad_token_id = self.tokenizer.eos_token_id

    def chat(self, message):
        enc_dict = self.tokenizer(message, return_tensors="pt")
        for key in enc_dict:
            enc_dict[key] = enc_dict[key].cuda()
        # Pure nucleus sampling: top_k=0 disables top-k filtering.
        chat_history_ids = self.model.generate(
            input_ids=enc_dict["input_ids"],
            attention_mask=enc_dict["attention_mask"],
            max_new_tokens=64,
            temperature=0.7,
            do_sample=True,
            top_k=0,
            top_p=0.95,
        )
        # Keep only the newly generated tokens, dropping the prompt.
        out = chat_history_ids[:, enc_dict["input_ids"].shape[-1]:][0]
        return self.tokenizer.decode(out, skip_special_tokens=True)


if __name__ == "__main__":
    bot_name = "Bot:"
    prompt = "<|prompter|>"

    bot = ChatBot()
    while True:
        message = input("Me: ")
        # Append the user turn and ask the model to continue as the assistant.
        prompt = prompt + message + "<|endoftext|><|assistant|>"
        response = bot.chat(prompt)
        print(f"{bot_name} {response}")
        prompt = prompt + response + "<|endoftext|><|prompter|>"
```
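
For reference, the loop above folds every turn into one growing prompt string in the Open-Assistant SFT format. After two user turns the prompt looks like this (the user and assistant text here is invented for illustration):

```python
# Illustrative prompt after two turns; the messages are made up.
prompt = (
    "<|prompter|>Hi there!<|endoftext|>"
    "<|assistant|>Hello! How are you today?<|endoftext|>"
    "<|prompter|>Tell me about SODA.<|endoftext|><|assistant|>"
)
```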