x54-729 committed on
Commit
caaa9dd
1 Parent(s): 2815e26

Add stream_chat example; Add torch_dtype to example

Files changed (1)
  1. README.md +67 -29
README.md CHANGED
@@ -59,22 +59,41 @@ We conducted a comprehensive evaluation of InternLM using the open-source evalua
 ### Import from Transformers
 To load the InternLM 7B Chat model using Transformers, use the following code:
 ```python
- >>> from transformers import AutoTokenizer, AutoModelForCausalLM
- >>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
- >>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
- >>> model = model.eval()
- >>> response, history = model.chat(tokenizer, "hello", history=[])
- >>> print(response)
- Hello! How can I help you today?
- >>> response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
- >>> print(response)
- Sure, here are three tips for effective time management:
-
- 1. Prioritize tasks based on importance and urgency: Make a list of all your tasks and categorize them into "important and urgent," "important but not urgent," and "not important but urgent." Focus on completing the tasks in the first category before moving on to the others.
- 2. Use a calendar or planner: Write down deadlines and appointments in a calendar or planner so you don't forget them. This will also help you schedule your time more effectively and avoid overbooking yourself.
- 3. Minimize distractions: Try to eliminate any potential distractions when working on important tasks. Turn off notifications on your phone, close unnecessary tabs on your computer, and find a quiet place to work if possible.
-
- Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
 ```

 ### Dialogue
@@ -126,19 +145,38 @@ InternLM ,即书生·浦语大模型,包含面向实用场景的70亿参数
 ### 通过 Transformers 加载
 通过以下的代码加载 InternLM 7B Chat 模型
 ```python
- >>> from transformers import AutoTokenizer, AutoModelForCausalLM
- >>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
- >>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True).cuda()
- >>> model = model.eval()
- >>> response, history = model.chat(tokenizer, "你好", history=[])
- >>> print(response)
- 你好!有什么我可以帮助你的吗?
- >>> response, history = model.chat(tokenizer, "请提供三个管理时间的建议。", history=history)
- >>> print(response)
- 当然可以!以下是三个管理时间的建议:
- 1. 制定计划:制定一个详细的计划,包括每天要完成的任务和活动。这将有助于您更好地组织时间,并确保您能够按时完成任务。
- 2. 优先级:将任务按照优先级排序,先完成最重要的任务。这将确保您能够在最短的时间内完成最重要的任务,从而节省时间。
- 3. 集中注意力:避免分心,集中注意力完成任务。关闭社交媒体和电子邮件通知,专注于任务,这将帮助您更快地完成任务,并减少错误的可能性。
 ```

 ### 通过前端网页对话
 
 ### Import from Transformers
 To load the InternLM 7B Chat model using Transformers, use the following code:
 ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
+ # Set `torch_dtype=torch.float16` to load the model in float16; otherwise it will be loaded as float32 and may cause an out-of-memory (OOM) error.
+ model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
+ model = model.eval()
+ response, history = model.chat(tokenizer, "hello", history=[])
+ print(response)
+ # Hello! How can I help you today?
+ response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
+ print(response)
+ # Sure, here are three tips for effective time management:
+ #
+ # 1. Prioritize tasks based on importance and urgency: Make a list of all your tasks and categorize them into "important and urgent," "important but not urgent," and "not important but urgent." Focus on completing the tasks in the first category before moving on to the others.
+ # 2. Use a calendar or planner: Write down deadlines and appointments in a calendar or planner so you don't forget them. This will also help you schedule your time more effectively and avoid overbooking yourself.
+ # 3. Minimize distractions: Try to eliminate any potential distractions when working on important tasks. Turn off notifications on your phone, close unnecessary tabs on your computer, and find a quiet place to work if possible.
+ #
+ # Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
+ ```
+
+ The responses can be streamed using `stream_chat`:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_path = "internlm/internlm-chat-7b"
+ model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+ model = model.eval()
+ length = 0
+ for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
+     print(response[length:], flush=True, end="")
+     length = len(response)
 ```

 ### Dialogue
 ### 通过 Transformers 加载
 通过以下的代码加载 InternLM 7B Chat 模型
 ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
+ # `torch_dtype=torch.float16` 可以令模型以 float16 精度加载,否则 transformers 会将模型加载为 float32,导致显存不足
+ model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
+ model = model.eval()
+ response, history = model.chat(tokenizer, "你好", history=[])
+ print(response)
+ # 你好!有什么我可以帮助你的吗?
+ response, history = model.chat(tokenizer, "请提供三个管理时间的建议。", history=history)
+ print(response)
+ # 当然可以!以下是三个管理时间的建议:
+ # 1. 制定计划:制定一个详细的计划,包括每天要完成的任务和活动。这将有助于您更好地组织时间,并确保您能够按时完成任务。
+ # 2. 优先级:将任务按照优先级排序,先完成最重要的任务。这将确保您能够在最短的时间内完成最重要的任务,从而节省时间。
+ # 3. 集中注意力:避免分心,集中注意力完成任务。关闭社交媒体和电子邮件通知,专注于任务,这将帮助您更快地完成任务,并减少错误的可能性。
+ ```
+
+ 如果想进行流式生成,则可以使用 `stream_chat` 接口:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_path = "internlm/internlm-chat-7b"
+ model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+ model = model.eval()
+ length = 0
+ for response, history in model.stream_chat(tokenizer, "你好", history=[]):
+     print(response[length:], flush=True, end="")
+     length = len(response)
 ```

 ### 通过前端网页对话
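The `torch_dtype=torch.float16` comment in both examples can be checked with quick arithmetic: weight memory scales with bytes per parameter, so halving the precision halves the footprint. A rough back-of-envelope sketch for a 7B-parameter model (weights only; activations and KV cache excluded):

```python
params = 7_000_000_000      # approximate parameter count of a 7B model
bytes_per = {"float32": 4, "float16": 2}
gib = 1024 ** 3
for dtype, nbytes in bytes_per.items():
    print(f"{dtype}: {params * nbytes / gib:.1f} GiB")
# float32: 26.1 GiB
# float16: 13.0 GiB
```

The float32 figure alone exceeds the memory of many common GPUs, which is why the examples pass `torch_dtype=torch.float16` explicitly.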