Wanfq committed on
Commit 81f3f34 • 1 Parent(s): dc892da

Update README.md

Files changed (1)
  1. README.md +20 -4
README.md CHANGED
@@ -20,8 +20,8 @@ library_name: transformers
 
 
 <h4> |<a href="https://arxiv.org/abs/2402.16107"> 📑 Paper </a> |
- <a href="https://huggingface.co/FuseAI"> 🤗 Huggingface Repo </a> |
- <a href="https://github.com/fanqiwan/FuseLLM"> 🐱 Github Repo </a> |
+ <a href="https://huggingface.co/FuseAI"> 🤗 HuggingFace Repo </a> |
+ <a href="https://github.com/fanqiwan/FuseLLM"> 🐱 GitHub Repo </a> |
 </h4>
 
 <!-- **Authors:** -->
@@ -38,12 +38,28 @@ _Sun Yat-sen University_
 <img src="./assets/fig_0.png" width="70%"> <br>
 </p>
 
+ | Proprietary Models | #Params | MT-Bench | Open Source Models | #Params | MT-Bench |
+ |-----------------------------------------------------------------------|---------|----------|-----------------------------------------------------------------------|---------|----------|
+ | GPT-4-1106-preview | - | 9.32 | Qwen1.5-72B-Chat | 72B | 8.61 |
+ | GPT-4-0613 | - | 9.18 | Nous-Hermes-2-Mixtral-8x7B-DPO | 8x7B | 8.33 |
+ | GPT-4-0314 | - | 8.96 | Mixtral-8x7B-Instruct-v0.1 | 8x7B | 8.30 |
+ | Mistral Medium | - | 8.61 | 🤗 [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM) | 7B | 8.22 |
+ | GPT-3.5-Turbo-0613 | - | 8.39 | Starling-LM-7B-alpha | 7B | 8.09 |
+ | GPT-3.5-Turbo-1106 | - | 8.32 | Tulu-2-DPO-70B | 70B | 7.89 |
+ | 🤗 [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM) | 7B | 8.22 | OpenChat-3.5 | 7B | 7.81 |
+ | Claude-2.1 | - | 8.18 | OpenChat-3.5-0106 | 7B | 7.80 |
+ | Claude-2.0 | - | 8.06 | WizardLM-70B-v1.0 | 70B | 7.71 |
+ | GPT-3.5-Turbo-0314 | - | 7.94 | Yi-34B-Chat | 34B | 7.67 |
+ | Claude-1 | - | 7.90 | Nous-Hermes-2-SOLAR-10.7B | 10.7B | 7.66 |
+
+
 </div>
 
 
 ## News
 - **Feb 26, 2024:** 🔥 We release [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM), which is the fusion of three prominent chat LLMs with diverse architectures and scales, namely [NH2-Mixtral-8x7B](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), and [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5). FuseChat-7B-VaRM achieves an average performance of **8.22** on MT-Bench, outperforming various powerful chat LLMs at 7B and 34B scales like [Starling-7B](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha) and [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat), even surpassing [GPT-3.5 (March)](https://platform.openai.com/docs/models/gpt-3-5-turbo) and [Claude-2.1](https://www.anthropic.com/news/claude-2-1), and approaching [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
+ - **Feb 25, 2024:** 🔥 We release [FuseChat-Mixture](https://huggingface.co/datasets/FuseAI/FuseChat-Mixture), a comprehensive training dataset that covers different styles and capabilities, featuring both human-written and model-generated data and spanning general instruction-following and specific skills.
 
 ## Contents
 
@@ -70,7 +86,7 @@ Moreover, we argue that the concept of knowledge fusion adopted by both FuseChat
 
 ## Model Release
 
- We release [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM), which is the fusion of three prominent chat LLMs with diverse architectures and scales, namely [NH2-Mixtral-8x7B](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), and [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5).
+ We release [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM), which is the fusion of three prominent chat LLMs with diverse architectures and scales, namely [NH2-Mixtral-8x7B](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), and [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5). FuseChat-7B-VaRM achieves an average performance of **8.22** on MT-Bench, outperforming various powerful chat LLMs at 7B and 34B scales like [Starling-7B](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha) and [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat), even surpassing [GPT-3.5 (March)](https://platform.openai.com/docs/models/gpt-3-5-turbo) and [Claude-2.1](https://www.anthropic.com/news/claude-2-1), and approaching [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
 
 To support a plug-and-play fusion of new source LLMs, we release our target LLMs: [OpenChat-3.5-7B-Solar](https://huggingface.co/FuseAI/OpenChat-3.5-7B-Solar) and [OpenChat-3.5-7B-Mixtral](https://huggingface.co/FuseAI/OpenChat-3.5-7B-Mixtral), which are obtained from pair-wise knowledge fusion. Integrating a new source LLM at any scale requires only obtaining a target LLM from the new source LLM and merging it with the existing target LLMs.
 
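As a hedged illustration only (not part of this commit), the two target LLMs linked above can be pulled from the Hugging Face Hub before the merging step, for example with the `huggingface_hub` CLI; the local directory names below are arbitrary placeholders.

```bash
# Illustrative sketch only: fetch the released target LLMs for local merging.
pip install -U "huggingface_hub[cli]"
huggingface-cli download FuseAI/OpenChat-3.5-7B-Solar --local-dir models/OpenChat-3.5-7B-Solar
huggingface-cli download FuseAI/OpenChat-3.5-7B-Mixtral --local-dir models/OpenChat-3.5-7B-Mixtral
```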
 
@@ -333,7 +349,7 @@ torchrun --nproc_per_node=8 --master_port=20001 /train/train.py \
 We show the scripts to obtain the final FuseChat using different merging methods.
 
 ```bash
- # For "slerp", "ta", "ties", and "dare" methods
+ # For "slerp", "ta", "ties", and "dare" methods (please install "mergekit" first)
 export CUDA_VISIBLE_DEVICES=0
 mergekit-yaml merge/mergekit_configs/fusechat-slerp.yml "<path_to_save_fusechat_7b_slerp>"
 mergekit-yaml merge/mergekit_configs/fusechat-ta.yml "<path_to_save_fusechat_7b_ta>"
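# --- Hypothetical sketch, not part of this commit: the shipped configs live in
# merge/mergekit_configs/, but a minimal SLERP config merging two target LLMs
# might look roughly like this (assumes a 32-layer 7B backbone and t=0.5).
cat > fusechat-slerp-sketch.yml <<'EOF'
slices:
  - sources:
      - model: FuseAI/OpenChat-3.5-7B-Solar
        layer_range: [0, 32]
      - model: FuseAI/OpenChat-3.5-7B-Mixtral
        layer_range: [0, 32]
merge_method: slerp
base_model: FuseAI/OpenChat-3.5-7B-Solar
parameters:
  t: 0.5
dtype: bfloat16
EOF
mergekit-yaml fusechat-slerp-sketch.yml "<path_to_save_sketch_merge>"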
 