Commit 4dd9f3f by WizardLM • 1 Parent(s): d210fc2

Update README.md

Files changed (1)
  1. README.md +68 -14
README.md CHANGED
@@ -9,30 +9,73 @@ license: llama2
9
  🏠 <a href="https://wizardlm.github.io/" target="_blank">Home Page</a> </p>
10
  <p align="center">
11
  <p align="center">
12
- 🤗 <a href="https://huggingface.co/WizardLM" target="_blank">HF Repo</a> • 🐱 <a href="https://github.com/nlpxucan/WizardLM" target="_blank">Github Repo</a> • 🐦 <a href="https://twitter.com/WizardLM_AI" target="_blank">Twitter</a> • 📃 <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> • 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> • 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> <br>
13
  </p>
14
  <p align="center">
15
  👋 Join our <a href="https://discord.gg/VZjjHtWrKs" target="_blank">Discord</a>
16
  </p>
17
 
18
- | Model | Checkpoint | Paper | HumanEval | MBPP | Demo | License |
19
- | ----- |------| ---- |------|-------| ----- | ----- |
20
- | WizardCoder-Python-34B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 73.2 | 61.2 | [Demo](http://47.103.63.15:50085/) | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
21
- | WizardCoder-15B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-15B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 59.8 |50.6 | -- | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
22
- | WizardCoder-Python-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-13B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 64.0 | 55.6 | -- | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
23
- | WizardCoder-Python-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 55.5 | 51.6 | [Demo](http://47.103.63.15:50088/) | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
24
- | WizardCoder-3B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-3B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 34.8 |37.4 | -- | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
25
- | WizardCoder-1B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-1B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 23.8 |28.6 | -- | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
26
 
 
27
 
28
- | Model | Checkpoint | Paper | GSM8k | MATH |Online Demo| License|
29
- | ----- |------| ---- |------|-------| ----- | ----- |
30
- | WizardMath-70B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-70B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **81.6** | **22.7** |[Demo](http://47.103.63.15:50083/)| <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 </a> |
31
- | WizardMath-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-13B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **63.9** | **14.0** |[Demo](http://47.103.63.15:50082/)| <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 </a> |
32
- | WizardMath-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **54.9** | **10.7** | [Demo](http://47.103.63.15:50080/)| <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama 2 </a>|
33
 
34
 
35
 
36
  <font size=4>
37
 
38
  | <sup>Model</sup> | <sup>Checkpoint</sup> | <sup>Paper</sup> |<sup>MT-Bench</sup> | <sup>AlpacaEval</sup> | <sup>GSM8k</sup> | <sup>HumanEval</sup> | <sup>License</sup>|
@@ -45,6 +88,17 @@ license: llama2
45
  | <sup>WizardLM-7B-V1.0 </sup>| <sup>🤗 <a href="https://huggingface.co/WizardLM/WizardLM-7B-V1.0" target="_blank">HF Link</a> </sup> |<sup> 📃 <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> </sup>| | | |<sup>19.1 pass@1 </sup>|<sup> Non-commercial</sup>|
46
  </font>
47
 
48
  **Github Repo**: https://github.com/nlpxucan/WizardLM/tree/main/WizardMath
49
 
50
  **Twitter**: https://twitter.com/WizardLM_AI/status/1689998428200112128
 
9
  🏠 <a href="https://wizardlm.github.io/" target="_blank">Home Page</a> </p>
10
  <p align="center">
11
  <p align="center">
12
+ 🤗 <a href="https://huggingface.co/WizardLM" target="_blank">HF Repo</a> • 🐱 <a href="https://github.com/nlpxucan/WizardLM" target="_blank">Github Repo</a> • 🐦 <a href="https://twitter.com/WizardLM_AI" target="_blank">Twitter</a> </p>
13
+ <p align="center">
14
+ 📃 <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> • 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> • 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a> <br>
15
  </p>
16
  <p align="center">
17
  👋 Join our <a href="https://discord.gg/VZjjHtWrKs" target="_blank">Discord</a>
18
  </p>
19
 
20
+ ## News
21
 
22
+ [12/19/2023] 🔥 We released **WizardMath-7B-V1.1**, trained from Mistral-7B. It is the **SOTA 7B math LLM**, achieving **83.2 pass@1** on GSM8k and **33.0 pass@1** on MATH.
23
 
24
+ [12/19/2023] 🔥 **WizardMath-7B-V1.1** outperforms **ChatGPT 3.5**, **Gemini Pro**, **Mixtral MoE**, and **Claude Instant** on GSM8k pass@1.
25
 
26
+ [12/19/2023] 🔥 **WizardMath-7B-V1.1** is comparable with **ChatGPT 3.5** and **Gemini Pro**, and surpasses **Mixtral MoE**, on MATH pass@1.
27
+
28
+ | Model | Checkpoint | Paper | GSM8k | MATH |
29
+ | ----- |------| ---- |------|-------|
30
+ | **WizardMath-7B-V1.1** | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-7B-V1.1" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **83.2** | **33.0** |
31
+ | WizardMath-70B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-70B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **81.6** | **22.7** |
32
+ | WizardMath-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-13B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **63.9** | **14.0** |
33
+ | WizardMath-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardMath-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2308.09583" target="_blank">[WizardMath]</a>| **54.9** | **10.7** |
34
 
35
 
36
+
37
+
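The GSM8k and MATH figures in this README are reported as pass@1. As a rough illustration only, and assuming pass@1 here means one generated answer per problem scored by exact match on the final answer (the README does not state the decoding or answer-extraction setup), the metric could be computed as below. The function name `pass_at_1` and the string-comparison rule are hypothetical, not the authors' evaluation code.

```python
def pass_at_1(predictions, references):
    """Percentage of problems whose single generated answer equals the reference.

    Assumption: each problem has exactly one prediction, and answers are
    compared as whitespace-stripped strings.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one problem")
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    # Toy data, not taken from GSM8k or MATH.
    print(pass_at_1(["18", "3", "70"], ["18", "5", "70"]))
```

A real evaluation would also need a step that extracts the final answer from the model's full solution text, which this sketch leaves out.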
38
+ ## [12/19/2023] Comparing WizardMath-7B-V1.1 with other open-source 7B math LLMs.
39
+
40
+ | Model | GSM8k Pass@1 | MATH Pass@1 |
41
+ | ----- |------| ---- |
42
+ | MPT-7B | 6.8 | 3.0 |
43
+ |Llama 1-7B | 11.0 | 2.9 |
44
+ |Llama 2-7B|12.3 |2.8 |
45
+ | Yi-6B | 32.6 | 5.8 |
46
+ |Mistral-7B|37.8 |9.1 |
47
+ | Qwen-7B | 47.8 | 9.3 |
48
+ | RFT-7B | 50.3 | -- |
49
+ | MAmmoTH-7B (COT) | 50.5 | 10.4 |
50
+ | WizardMath-7B-V1.0 | 54.9 | 10.7 |
51
+ |Abel-7B-001 |59.7 |13 |
52
+ | MetaMath-7B | 66.5 | 19.8 |
53
+ | Arithmo-Mistral-7B | 74.7 | 25.3 |
54
+ |MetaMath-Mistral-7B|77.7 |28.2 |
55
+ |Abel-7B-002 | 80.4 | 29.5 |
56
+ | **WizardMath-7B-V1.1** | **83.2** | **33.0** |
57
+
58
+
59
+ ## [12/19/2023] Comparing WizardMath-7B-V1.1 with large open-source (30B~70B) LLMs.
60
+
61
+ | Model | GSM8k Pass@1 | MATH Pass@1 |
62
+ | ----- |------| ---- |
63
+ | Llemma-34B | 51.5 | 25.0 |
64
+ | Minerva-62B | 52.4 | 27.6 |
65
+ | Llama 2-70B | 56.8 | 13.5 |
66
+ | DeepSeek 67B | 63.4 | -- |
67
+ | Grok 33B | 62.9 | 23.9 |
68
+ | MAmmoTH-70B | 72.4 | 21.1 |
69
+ | Yi-34B | 67.9 | 15.9 |
70
+ | Mixtral 8x7B | 74.4 | 28.4 |
71
+ | MetaMath-70B | 82.3 | 26.6 |
72
+ | **WizardMath-7B-V1.1** | **83.2** | **33.0** |
73
+
74
+
75
+ ## ❗ Data Contamination Check:
76
+
77
+ Before model training, we rigorously checked all the training data and used multiple deduplication methods to detect and prevent data leakage from the GSM8k and MATH test sets.
78
+
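The README does not say which deduplication methods were used. As a minimal sketch of one common approach, the check below flags training samples that share a word-level n-gram with any test sample. The function names, the choice of `n=8`, and the tokenization rule are all illustrative assumptions, not the authors' pipeline.

```python
import re


def ngrams(text, n=8):
    """Return the set of word-level n-grams in `text` (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def find_contaminated(train_samples, test_samples, n=8):
    """Return indices of training samples that share at least one n-gram with a test sample."""
    test_grams = set()
    for sample in test_samples:
        test_grams |= ngrams(sample, n)
    return [
        i for i, sample in enumerate(train_samples)
        if ngrams(sample, n) & test_grams
    ]


if __name__ == "__main__":
    # Toy strings for illustration only.
    train = [
        "Natalia sold clips to 48 of her friends in April and then half as many in May",
        "A farmer has 12 cows and buys 7 more cows at the market on Sunday morning",
    ]
    test = [
        "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May.",
    ]
    print(find_contaminated(train, test))
```

Exact n-gram overlap misses paraphrased leaks, which is one reason to combine several methods, as the paragraph above describes.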
79
  <font size=4>
80
 
81
  | <sup>Model</sup> | <sup>Checkpoint</sup> | <sup>Paper</sup> |<sup>MT-Bench</sup> | <sup>AlpacaEval</sup> | <sup>GSM8k</sup> | <sup>HumanEval</sup> | <sup>License</sup>|
 
88
  | <sup>WizardLM-7B-V1.0 </sup>| <sup>🤗 <a href="https://huggingface.co/WizardLM/WizardLM-7B-V1.0" target="_blank">HF Link</a> </sup> |<sup> 📃 <a href="https://arxiv.org/abs/2304.12244" target="_blank">[WizardLM]</a> </sup>| | | |<sup>19.1 pass@1 </sup>|<sup> Non-commercial</sup>|
89
  </font>
90
 
91
+
92
+ | Model | Checkpoint | Paper | HumanEval | MBPP | Demo | License |
93
+ | ----- |------| ---- |------|-------| ----- | ----- |
94
+ | WizardCoder-Python-34B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 73.2 | 61.2 | [Demo](http://47.103.63.15:50085/) | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
95
+ | WizardCoder-15B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-15B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 59.8 |50.6 | -- | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
96
+ | WizardCoder-Python-13B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-13B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 64.0 | 55.6 | -- | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
97
+ | WizardCoder-Python-7B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-Python-7B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 55.5 | 51.6 | [Demo](http://47.103.63.15:50088/) | <a href="https://ai.meta.com/resources/models-and-libraries/llama-downloads/" target="_blank">Llama2</a> |
98
+ | WizardCoder-3B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-3B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 34.8 |37.4 | -- | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
99
+ | WizardCoder-1B-V1.0 | 🤗 <a href="https://huggingface.co/WizardLM/WizardCoder-1B-V1.0" target="_blank">HF Link</a> | 📃 <a href="https://arxiv.org/abs/2306.08568" target="_blank">[WizardCoder]</a> | 23.8 |28.6 | -- | <a href="https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement" target="_blank">OpenRAIL-M</a> |
100
+
101
+
102
  **Github Repo**: https://github.com/nlpxucan/WizardLM/tree/main/WizardMath
103
 
104
  **Twitter**: https://twitter.com/WizardLM_AI/status/1689998428200112128