sethuiyer committed
Commit 22db51c
1 Parent(s): 57bfa7b

Update README.md

Files changed (1)
  1. README.md +25 -4
README.md CHANGED
@@ -1,12 +1,16 @@
  ---
- license: apache-2.0
+ license: cc-by-nc-nd-4.0
  tags:
  - moe
  - merge
+ - medical
  - mergekit
- - lazymergekit
  - sethuiyer/Dr_Samantha_7b_mistral
  - fblgit/UNA-TheBeagle-7b-v1
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
  ---
  
  # MedleyMD
@@ -46,11 +50,28 @@ tokenizer = AutoTokenizer.from_pretrained(model)
  pipeline = transformers.pipeline(
      "text-generation",
      model=model,
-     model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
+     model_kwargs={"torch_dtype": torch.bfloat16, "load_in_4bit": True},
  )
  
- messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
+ generation_kwargs = {
+     "max_new_tokens": 512,
+     "do_sample": True,
+     "temperature": 0.7,
+     "top_k": 50,
+     "top_p": 0.95,
+ }
+ 
+ messages = [{"role": "system", "content": "You are a helpful AI assistant. Please use </s> when you want to end the answer."},
+             {"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
  prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
  outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
  print(outputs[0]["generated_text"])
  ```
+ 
+ ```text
+ A Mixture of Experts (Mixout) is a neural network architecture that combines the strengths of multiple expert networks to make a more accurate and robust prediction.
+ It is composed of a topmost gating network that assigns weights to each expert network based on their performance on past input samples.
+ The expert networks are trained independently, and the gating network learns to choose the best combination of these experts to make the final prediction.
+ Mixout demonstrates a stronger ability to handle complex data distributions and is more efficient in terms of training time and memory usage compared to a
+ traditional ensemble approach.
+ ```
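
The sample generation added to the README describes the core MoE idea: independent expert networks whose outputs are combined by a gating network that assigns them weights. A minimal sketch of that weighting scheme in plain Python — toy experts and a toy gate, purely illustrative, not the actual MedleyMD routing code:

```python
import math

def softmax(logits):
    # numerically stable softmax: turns gate logits into weights summing to 1
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def moe_predict(x, experts, gate):
    # gate(x) returns one logit per expert; the prediction is the
    # gate-weighted combination of the experts' individual outputs
    weights = softmax(gate(x))
    return sum(w * expert(x) for w, expert in zip(weights, experts))

# hypothetical two-expert setup (stand-ins for a "medical" and a general expert)
experts = [lambda x: 2.0 * x, lambda x: x + 1.0]
gate = lambda x: [x, 1.0 - x]  # toy gating logits

print(round(moe_predict(0.5, experts, gate), 3))  # → 1.25
```

With equal logits the gate splits weight 50/50, so the result is the plain average of the two expert outputs; a confident gate drives one weight toward 1 and effectively routes the input to a single expert, which is what makes sparse MoE inference cheap relative to running a full ensemble.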