Skylaude committed f2ea86b (1 parent: 53e1c6f)

Upload README.md

Files changed (1): README.md (+25 -0)
---
license: apache-2.0
tags:
- MoE
- merge
- mergekit
- Mistral
- Microsoft/WizardLM-2-7B
---

# WizardLM-2-4x7B-MoE

This is an experimental MoE model made with Mergekit. It was created by combining four copies of WizardLM-2-7B using the random gate mode.
Be sure to set experts per token to 4 for the best results!
The context length should be the same as Mistral-7B-Instruct-v0.1 (8k tokens).
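
For reference, below is a minimal inference sketch using Hugging Face Transformers. The repo id and prompt format are assumptions (the merged model should load as a Mixtral-style MoE, with the recommended experts-per-token value stored in its config), so treat this as a starting point rather than a verified recipe:

```python
# Minimal inference sketch (untested). The repo id below is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Skylaude/WizardLM-2-4x7B-MoE"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# For a Mixtral-style MoE, the experts-per-token setting lives in the
# model config as num_experts_per_tok; it should already be 4 here.
print("experts per token:", model.config.num_experts_per_tok)

# WizardLM-2 uses a Vicuna-style prompt format.
prompt = "USER: Briefly explain what a mixture-of-experts model is. ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```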

Mergekit config:
```yaml
base_model: models/WizardLM-2-7B
gate_mode: random
dtype: float16
experts_per_token: 4
experts:
- source_model: models/WizardLM-2-7B
- source_model: models/WizardLM-2-7B
- source_model: models/WizardLM-2-7B
- source_model: models/WizardLM-2-7B
```
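
Assuming the config above is saved as `moe-config.yaml`, a merge like this is typically produced with Mergekit's MoE script, e.g. `mergekit-moe moe-config.yaml ./WizardLM-2-4x7B-MoE` (the exact invocation may vary between Mergekit versions, so check its documentation). With `gate_mode: random`, the router weights are randomly initialized rather than derived from prompt hidden states, which is part of why this merge is labeled experimental.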