jsfs11 committed on
Commit 8505d94
1 Parent(s): 6d097d2

Create README.md

Files changed (1)
  1. README.md +100 -0
README.md ADDED
---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES
- senseable/WestLake-7B-v2
- mlabonne/OmniBeagle-7B
- vanillaOVO/supermario_v3
base_model:
- jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES
- senseable/WestLake-7B-v2
- mlabonne/OmniBeagle-7B
- vanillaOVO/supermario_v3
---

# MixtureofMerges-MoE-4x7b-v3

MixtureofMerges-MoE-4x7b-v3 is a Mixture of Experts (MoE) model built from the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES](https://huggingface.co/jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES)
* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
* [mlabonne/OmniBeagle-7B](https://huggingface.co/mlabonne/OmniBeagle-7B)
* [vanillaOVO/supermario_v3](https://huggingface.co/vanillaOVO/supermario_v3)

## 🧩 Configuration

```yaml
base_model: senseable/WestLake-7B-v2
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: jsfs11/RandomMergeNoNormWEIGHTED-7B-DARETIES
    positive_prompts:
      - "Answer this question from the ARC (Argument Reasoning Comprehension)."
      - "Use common sense and logical reasoning skills."
    negative_prompts:
      - "nonsense"
      - "irrational"
      - "math"
      - "code"
  - source_model: senseable/WestLake-7B-v2
    positive_prompts:
      - "Answer this question from the Winogrande test."
      - "Use advanced knowledge of culture and humanity"
    negative_prompts:
      - "ignorance"
      - "uninformed"
      - "creativity"
  - source_model: mlabonne/OmniBeagle-7B
    positive_prompts:
      - "Calculate the answer to this math problem"
      - "My mathematical capabilities are strong, allowing me to handle complex mathematical queries"
      - "solve for"
    negative_prompts:
      - "incorrect"
      - "inaccurate"
      - "creativity"
  - source_model: vanillaOVO/supermario_v3
    positive_prompts:
      - "Predict the most plausible continuation for this scenario."
      - "Demonstrate understanding of everyday commonsense in your response."
      - "Use contextual clues to determine the most likely outcome."
      - "Apply logical reasoning to complete the given narrative."
      - "Infer the most realistic action or event that follows."
    negative_prompts:
      - "guesswork"
      - "irrelevant information"
      - "contradictory response"
      - "illogical conclusion"
      - "ignoring context"
```
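The `gate_mode: hidden` setting asks mergekit to initialize each expert's router weights from hidden-state representations of its positive and negative prompts, so the prompts above steer which expert handles which kind of request. As a rough sketch of how a config like this is typically turned into a checkpoint (the repository URL, the `config.yaml` filename, the `merge` output directory, and the flag are assumptions drawn from the LazyMergekit workflow and may differ across mergekit versions):

```python
# Hedged sketch: install mergekit and run its MoE merge on the config above.
# Repo URL, config filename, output directory, and flags are assumptions based
# on the LazyMergekit workflow, not part of this model card.
!git clone -q https://github.com/arcee-ai/mergekit.git
!cd mergekit && pip install -qe .

# config.yaml holds the YAML shown in the Configuration section;
# the merged MoE is written to ./merge.
!mergekit-moe config.yaml merge --copy-tokenizer
```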

## 💻 Usage

```python
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "jsfs11/MixtureofMerges-MoE-4x7b-v3"

# Build a 4-bit text-generation pipeline for the merged model.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the chat with the model's chat template and generate a reply.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
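Passing `load_in_4bit` through `model_kwargs` relies on older `transformers` behaviour; recent releases expect quantization options to go through `BitsAndBytesConfig`. A minimal sketch of an equivalent 4-bit load under that assumption (requires a CUDA GPU and `bitsandbytes`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "jsfs11/MixtureofMerges-MoE-4x7b-v3"

# 4-bit NF4 quantization with float16 compute, equivalent in spirit to
# the load_in_4bit=True shortcut used in the pipeline example above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Same prompt as above, formatted with the model's chat template.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```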