zhao1iang committed
Commit f061a0c
1 Parent(s): 1bb95a2

Update README.md

Files changed (1):
README.md +57 -4
README.md CHANGED
@@ -1,4 +1,3 @@
-
 <!-- <div align="center">
 <h1>
 ✨Skywork
@@ -7,7 +6,7 @@
 <div align="center"><img src="misc/skywork_logo.jpeg" width="550"/></div>
 
 <p align="center">
- 🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a> • 🤖 <a href="https://modelscope.cn/organization/Skywork" target="_blank">ModelScope</a> • 👾 <a href="https://wisemodel.cn/organization/Skywork" target="_blank">Wisemodel</a> • 💬 <a href="https://github.com/SkyworkAI/Skywork/blob/main/misc/wechat.png?raw=true" target="_blank">WeChat</a>• 📜<a href="https://github.com/SkyworkAI/Skywork-MoE/blob/main/skywork-moe-tech-report.pdf" target="_blank">Tech Report</a>
+ 🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a> • 🤖 <a href="https://modelscope.cn/organization/Skywork" target="_blank">ModelScope</a> • 👾 <a href="https://wisemodel.cn/organization/Skywork" target="_blank">Wisemodel</a> • 💬 <a href="https://github.com/SkyworkAI/Skywork/blob/main/misc/wechat.png?raw=true" target="_blank">WeChat</a>• 📜<a href="https://arxiv.org/pdf/2406.06563" target="_blank">Tech Report</a>
 </p>
 
 <div align="center">
@@ -34,7 +33,7 @@ Skywork-MoE demonstrates comparable or superior performance to models with more
 
 # Table of contents
 
-
+ - [☁️Download URL](#Download-URL)
 - [👨‍💻Benchmark Results](#Benchmark-Results)
 - [🏆Demonstration of Hugging Face Model Inference](#Demonstration-of-HuggingFace-Model-Inference)
 - [📕Demonstration of vLLM Model Inference](#Demonstration-of-vLLM-Model-Inference)
@@ -42,10 +41,54 @@ Skywork-MoE demonstrates comparable or superior performance to models with more
 - [🤝Contact Us and Citation](#Contact-Us-and-Citation)
 
 
+ # Download URL
+
+ | | HuggingFace Model | ModelScope Model | Wisemodel Model |
+ |:-------:|:------------------------------------------------------------------------------:|:-----------------------------:|:-----------------------------:|
+ | **Skywork-MoE-Base** | 🤗 [Skywork-MoE-Base](https://huggingface.co/Skywork/Skywork-MoE-Base) | 🤖[Skywork-MoE-Base](https://www.modelscope.cn/models/skywork/Skywork-MoE-base) | 👾[Skywork-MoE-Base](https://wisemodel.cn/models/Skywork/Skywork-MoE-base) |
+ | **Skywork-MoE-Base-FP8** | 🤗 [Skywork-MoE-Base-FP8](https://huggingface.co/Skywork/Skywork-MoE-Base-FP8) | 🤖[Skywork-MoE-Base-FP8](https://www.modelscope.cn/models/skywork/Skywork-MoE-Base-FP8) | 👾[Skywork-MoE-Base-FP8](https://wisemodel.cn/models/Skywork/Skywork-MoE-Base-FP8) |
+ | **Skywork-MoE-Chat** | 😊 [Coming Soon]() | 🤖 | 👾 |
+
 # Benchmark Results
+
 We evaluated Skywork-MoE-Base model on various popular benchmarks, including C-Eval, MMLU, CMMLU, GSM8K, MATH and HumanEval.
 <img src="misc/skywork_moe_base_evaluation.png" alt="Image" width="600" height="280">
 
+ # Demonstration of Hugging Face Model Inference
+
+ ## Base Model Inference
+
+ We can perform inference for the Skywork-MoE-Base (16x13B size) model using HuggingFace on 8xA100/A800 or higher GPU hardware configurations.
+
+ ```python
+
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("Skywork/Skywork-MoE-Base", trust_remote_code=True, device_map='auto')
+ tokenizer = AutoTokenizer.from_pretrained("Skywork/Skywork-MoE-Base", trust_remote_code=True)
+
+ inputs = tokenizer('陕西的省会是西安', return_tensors='pt').to(model.device)
+ response = model.generate(inputs.input_ids, max_length=128)
+ print(tokenizer.decode(response.cpu()[0], skip_special_tokens=True))
+ """
+ 陕西的省会是西安。
+ 西安,古称长安、镐京,是陕西省会、副省级市、关中平原城市群核心城市、丝绸之路起点城市、“一带一路”核心区、中国西部地区重要的中心城市,国家重要的科研、教育、工业基地。
+ 西安是中国四大古都之一,联合国科教文组织于1981年确定的“世界历史名城”,美媒评选的世界十大古都之一。地处关中平原中部,北濒渭河,南依秦岭,八水润长安。下辖11区2县并代管西
+ """
+
+ inputs = tokenizer('陕西的省会是西安,甘肃的省会是兰州,河南的省会是郑州', return_tensors='pt').to(model.device)
+ response = model.generate(inputs.input_ids, max_length=128)
+ print(tokenizer.decode(response.cpu()[0], skip_special_tokens=True))
+ """
+ 陕西的省会是西安,甘肃的省会是兰州,河南的省会是郑州,湖北的省会是武汉,湖南的省会是长沙,安徽的省会是合肥,江西的省会是南昌,江苏的省会是南京,浙江的省会是杭州,福建的省会是福州,广东的省会是广州,广西的省会是南宁,四川的省会是成都,贵州的省会是贵阳,云南的省会是昆明,山西的省会是太原,山东的省会是济南,河北的省会是石家庄,辽宁的省会是沈阳,吉林的省会是长春,黑龙江的
+ """
+
+ ```
+
+ ## Chat Model Inference
+
+ coming soon...
+
 
 # Demonstration of vLLM Model Inference
 
@@ -179,10 +222,20 @@ If you find our work helpful, please feel free to cite our paper~
 ```
 @misc{wei2024skywork,
 title={Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models},
- author={Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou},
+ author={Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou},
+ url={https://arxiv.org/pdf/2406.06563},
 year={2024},
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
 ```
 
+ ```
+ @article{zhao2024longskywork,
+ title={LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models},
+ author={Zhao, Liang and Wei, Tianwen and Zeng, Liang and Cheng, Cheng and Yang, Liu and Cheng, Peng and Wang, Lijie and Li, Chenxia and Wu, Xuejie and Zhu, Bo and others},
+ journal={arXiv preprint arXiv:2406.00605},
+ url={https://arxiv.org/abs/2406.00605},
+ year={2024}
+ }
+ ```
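For readers reproducing the Hugging Face example added in this commit, a minimal variant is sketched below; it mirrors the committed snippet but pins the weights to bfloat16 while sharding the 16x13B checkpoint across the GPUs the README mentions. The explicit `torch_dtype`, the `max_new_tokens` value, and the English prompt are illustrative assumptions, not part of the committed README.

```python
# Minimal sketch (not from the README): load Skywork-MoE-Base sharded across
# visible GPUs, keeping the weights in bfloat16 to reduce the memory footprint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Skywork/Skywork-MoE-Base",
    trust_remote_code=True,      # custom MoE modeling code ships with the checkpoint
    device_map="auto",           # shard the experts across all available GPUs
    torch_dtype=torch.bfloat16,  # assumption: bf16 loading, as commonly used for large checkpoints
)
tokenizer = AutoTokenizer.from_pretrained("Skywork/Skywork-MoE-Base", trust_remote_code=True)

# Assumed prompt and generation length, for illustration only.
inputs = tokenizer("The capital of Shaanxi province is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```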