RDson committed fb8377d (parent 7166ddc): Create README.md

---
tags:
- moe
- llama
- '3'
- llama 3
- 4x8b
---
# GGUF files of [Llama-3-Magenta-Instruct-4x8B-MoE](https://huggingface.co/RDson/Llama-3-Magenta-Instruct-4x8B-MoE)
<img src="https://i.imgur.com/c1Mv8cy.png" width="640"/>

# Llama-3-Magenta-Instruct-4x8B-MoE
This is an experimental MoE created from [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), [nvidia/Llama3-ChatQA-1.5-8B](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B), [Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R](https://huggingface.co/Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R) and [Muhammad2003/Llama3-8B-OpenHermes-DPO](https://huggingface.co/Muhammad2003/Llama3-8B-OpenHermes-DPO) using Mergekit.

Mergekit yaml file:
```yaml
base_model: Meta-Llama-3-8B-Instruct
experts:
- source_model: Meta-Llama-3-8B-Instruct
  positive_prompts:
  - "explain"
  - "chat"
  - "assistant"
  - "think"
  - "roleplay"
  - "versatile"
  - "helpful"
  - "factual"
  - "integrated"
  - "adaptive"
  - "comprehensive"
  - "balanced"
  negative_prompts:
  - "specialized"
  - "narrow"
  - "focused"
  - "limited"
  - "specific"
- source_model: ChatQA-1.5-8B
  positive_prompts:
  - "python"
  - "math"
  - "solve"
  - "code"
  - "programming"
  negative_prompts:
  - "sorry"
  - "cannot"
  - "factual"
  - "concise"
  - "straightforward"
  - "objective"
  - "dry"
- source_model: SFR-Iterative-DPO-LLaMA-3-8B-R
  positive_prompts:
  - "chat"
  - "assistant"
  - "AI"
  - "instructive"
  - "clear"
  - "directive"
  - "helpful"
  - "informative"
- source_model: Llama3-8B-OpenHermes-DPO
  positive_prompts:
  - "analytical"
  - "accurate"
  - "logical"
  - "knowledgeable"
  - "precise"
  - "calculate"
  - "compute"
  - "solve"
  - "work"
  - "python"
  - "code"
  - "javascript"
  - "programming"
  - "algorithm"
  - "tell me"
  - "assistant"
  negative_prompts:
  - "creative"
  - "abstract"
  - "imaginative"
  - "artistic"
  - "emotional"
  - "mistake"
  - "inaccurate"
gate_mode: hidden
dtype: float16
```
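Assuming mergekit is installed, a config like the one above is typically applied with the `mergekit-moe` command; the file and output paths below are placeholders, not part of this repo:

```shell
# Install mergekit (provides the mergekit-moe entry point)
pip install mergekit

# Build the merged MoE from the config; adjust paths as needed
mergekit-moe config.yaml ./Llama-3-Magenta-Instruct-4x8B-MoE
```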
Some inspiration for the Mergekit yaml file came from [LoneStriker/Umbra-MoE-4x10.7-2.4bpw-h6-exl2](https://huggingface.co/LoneStriker/Umbra-MoE-4x10.7-2.4bpw-h6-exl2).
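The `gate_mode: hidden` setting above tells mergekit to initialize each expert's router weights from hidden-state representations of its positive (and negative) prompts. A minimal numpy sketch of the general routing idea — scoring a token's hidden state against one gate vector per expert and softmax-weighting the top experts — might look like this (names and shapes are illustrative, not mergekit's actual implementation):

```python
import numpy as np

def route(hidden, expert_vectors, top_k=2):
    """Score a hidden state against one gate vector per expert and
    return indices and normalized weights of the top_k experts.

    hidden:         (d,) hidden state for the current token
    expert_vectors: (n_experts, d) gate vectors, e.g. derived from
                    each expert's positive-prompt representations
    """
    logits = expert_vectors @ hidden               # (n_experts,) scores
    top = np.argsort(logits)[::-1][:top_k]         # best-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                       # softmax over the top_k
    return top, weights

# Toy example: 4 experts, 8-dimensional hidden states
rng = np.random.default_rng(0)
gates = rng.normal(size=(4, 8))
h = rng.normal(size=8)
experts, weights = route(h, gates)
```

At inference time the MoE runs only the selected experts per token and mixes their outputs with these weights, which is why a 4x8B merge is cheaper to run than four full 8B passes.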