picocreator committed on
Commit 254e2d1 · 1 Parent(s): d621032

README wip update

README.md CHANGED
@@ -1,174 +1,10 @@
- ### Run Huggingface RWKV6 World Model
 
- > origin pth weight from https://huggingface.co/BlinkDL/rwkv-6-world/blob/main/RWKV-x060-World-7B-v2.1-20240507-ctx4096.pth .
 
- #### CPU
 
- ```python
- import torch
- from transformers import AutoModelForCausalLM, AutoTokenizer
 
- def generate_prompt(instruction, input=""):
-     instruction = instruction.strip().replace('\r\n','\n').replace('\n\n','\n')
-     input = input.strip().replace('\r\n','\n').replace('\n\n','\n')
-     if input:
-         return f"""Instruction: {instruction}
-
- Input: {input}
-
- Response:"""
-     else:
-         return f"""User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: {instruction}
-
- Assistant:"""
-
-
- model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-6-world-7b", trust_remote_code=True).to(torch.float32)
- tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-6-world-7b", trust_remote_code=True, padding_side='left', pad_token="<s>")
-
- text = "请介绍北京的旅游景点"
- prompt = generate_prompt(text)
-
- inputs = tokenizer(prompt, return_tensors="pt")
- output = model.generate(inputs["input_ids"], max_new_tokens=333, do_sample=True, temperature=1.0, top_p=0.3, top_k=0, )
- print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
- ```
-
- output:
-
- ```shell
- User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: 请介绍北京的旅游景点
-
- Assistant: 北京是中国的首都,拥有众多的旅游景点,以下是其中一些著名的景点:
- 1. 故宫:位于北京市中心,是明清两代的皇宫,内有大量的文物和艺术品。
- 2. 天安门广场:是中国最著名的广场之一,是中国人民政治协商会议的旧址,也是中国人民政治协商会议的中心。
- 3. 颐和园:是中国古代皇家园林之一,有着悠久的历史和丰富的文化内涵。
- 4. 长城:是中国古代的一道长城,全长约万里,是中国最著名的旅游景点之一。
- 5. 北京大学:是中国著名的高等教育机构之一,有着悠久的历史和丰富的文化内涵。
- 6. 北京动物园:是中国最大的动物园之一,有着丰富的动物资源和丰富的文化内涵。
- 7. 故宫博物院:是中国最著名的博物馆之一,收藏了大量的文物和艺术品,是中国最重要的文化遗产之一。
- 8. 天坛:是中国古代皇家
- ```
-
- #### GPU
-
- ```python
- import torch
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- def generate_prompt(instruction, input=""):
-     instruction = instruction.strip().replace('\r\n','\n').replace('\n\n','\n')
-     input = input.strip().replace('\r\n','\n').replace('\n\n','\n')
-     if input:
-         return f"""Instruction: {instruction}
-
- Input: {input}
-
- Response:"""
-     else:
-         return f"""User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: {instruction}
-
- Assistant:"""
-
-
- model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-6-world-7b", trust_remote_code=True, torch_dtype=torch.float16).to(0)
- tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-6-world-7b", trust_remote_code=True, padding_side='left', pad_token="<s>")
-
- text = "介绍一下大熊猫"
- prompt = generate_prompt(text)
-
- inputs = tokenizer(prompt, return_tensors="pt").to(0)
- output = model.generate(inputs["input_ids"], max_new_tokens=128, do_sample=True, temperature=1.0, top_p=0.3, top_k=0, )
- print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
- ```
-
- output:
-
- ```shell
- User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: 介绍一下大熊猫
-
- Assistant: 大熊猫是一种中国特有的哺乳动物,也是中国的国宝之一。它们的外貌特征是圆形的黑白相间的身体,有着黑色的毛发和白色的耳朵。大熊猫的食物主要是竹子,它们会在竹林中寻找竹子,并且会将竹子放在竹笼中进行储存。大熊猫的寿命约为20至30年,但由于栖息地的丧失和人类活动的
- ```
-
- #### Batch Inference
-
- ```python
- import torch
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- def generate_prompt(instruction, input=""):
-     instruction = instruction.strip().replace('\r\n', '\n').replace('\n\n', '\n')
-     input = input.strip().replace('\r\n', '\n').replace('\n\n', '\n')
-     if input:
-         return f"""Instruction: {instruction}
-
- Input: {input}
-
- Response:"""
-     else:
-         return f"""User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: {instruction}
-
- Assistant:"""
-
- model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-6-world-7b", trust_remote_code=True).to(torch.float32)
- tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-6-world-7b", trust_remote_code=True, padding_side='left', pad_token="<s>")
-
- texts = ["请介绍北京的旅游景点", "介绍一下大熊猫", "乌兰察布"]
- prompts = [generate_prompt(text) for text in texts]
-
- inputs = tokenizer(prompts, return_tensors="pt", padding=True)
- outputs = model.generate(inputs["input_ids"], max_new_tokens=128, do_sample=True, temperature=1.0, top_p=0.3, top_k=0, )
-
- for output in outputs:
-     print(tokenizer.decode(output.tolist(), skip_special_tokens=True))
-
- ```
-
- output:
-
- ```shell
- User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: 请介绍北京的旅游景点
-
- Assistant: 北京是中国的首都,拥有丰富的旅游资源和历史文化遗产。以下是一些北京的旅游景点:
- 1. 故宫:位于北京市中心,是明清两代的皇宫,是中国最大的古代宫殿建筑群之一。
- 2. 天安门广场:位于北京市中心,是中国最著名的城市广场之一,也是中国最大的城市广场。
- 3. 颐和
- User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: 介绍一下大熊猫
-
- Assistant: 大熊猫是一种生活在中国中部地区的哺乳动物,也是中国的国宝之一。它们的外貌特征是圆形的黑白相间的身体,有着黑色的毛发和圆圆的眼睛。大熊猫是一种濒危物种,目前只有在野外的几个保护区才能看到它们的身影。大熊猫的食物主要是竹子,它们会在竹子上寻找食物,并且可以通
- User: hi
-
- Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
-
- User: 乌兰察布
-
- Assistant: 乌兰察布是中国新疆维吾尔自治区的一个县级市,位于新疆维吾尔自治区中部,是新疆的第二大城市。乌兰察布市是新疆的第一大城市,也是新疆的重要城市之一。乌兰察布市是新疆的经济中心,也是新疆的重要交通枢纽之一。乌兰察布市的人口约为2.5万人,其中汉族占绝大多数。乌
- ```
 
+ ### v6-Finch-14B-HF
 
+ > HF compatible model for Finch-14B.
+ > This is an early preview for benchmarking
 
+ ![Crimson Finch Bird](./imgs/crimson-finch-unsplash-david-clode.jpg)
 
+ > origin pth weight at https://huggingface.co/BlinkDL/rwkv-6-world/blob/main/ .
 
+ More details to be done.

imgs/crimson-finch-unsplash-david-clode.jpg ADDED