</details>

## Model Download and Inference

We take Apollo-MoE-0.5B as an example.

1. Log in to Hugging Face:

```
huggingface-cli login --token $HUGGINGFACE_TOKEN
```

2. Download the model to a local directory:

```
from huggingface_hub import snapshot_download
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')
snapshot_download(repo_id="FreedomIntelligence/Apollo-MoE-0.5B", local_dir=local_model_dir)
```
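
A download can be interrupted, so it can help to sanity-check that the key snapshot files actually landed before loading the model. `snapshot_complete` below is a hypothetical helper (not part of `huggingface_hub`), and the file list is an assumption about what a complete snapshot contains:

```python
import os

# Hypothetical helper (not part of huggingface_hub): returns True only when
# every listed file exists under the downloaded snapshot directory.
def snapshot_complete(local_model_dir, required=("config.json", "tokenizer_config.json")):
    return all(os.path.isfile(os.path.join(local_model_dir, f)) for f in required)
```

If it returns `False`, re-running `snapshot_download` resumes fetching any missing files.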

3. Inference example:

```
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')

model = AutoModelForCausalLM.from_pretrained(local_model_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(local_model_dir, trust_remote_code=True)
generation_config = GenerationConfig.from_pretrained(
    local_model_dir,
    pad_token_id=tokenizer.pad_token_id,
    num_return_sequences=1,
    max_new_tokens=7,
    min_new_tokens=2,
    do_sample=False,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
)

inputs = tokenizer('Answer directly.\nThe capital of Mongolia is Ulaanbaatar.\nThe capital of Iceland is Reykjavik.\nThe capital of Australia is', return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
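
Note that for decoder-only models, `model.generate` returns the prompt tokens followed by the continuation, so decoding `pred[0]` prints the prompt as well. If you only want the newly generated text, slice the prompt off before decoding; `strip_prompt` is a hypothetical helper sketching the idea on plain token-id sequences:

```python
# Hypothetical helper: generate() output begins with the prompt tokens,
# so keep only the ids that come after the prompt before decoding.
def strip_prompt(generated_ids, prompt_len):
    return generated_ids[prompt_len:]

# Usage with the example above (sketch):
#   new_ids = strip_prompt(pred[0], inputs['input_ids'].shape[1])
#   print(tokenizer.decode(new_ids, skip_special_tokens=True))
```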
## Results reproduction
<details><summary>Click to expand</summary>