
MixtureofMerges-MoE-4x7b-v10-MIXTRAL3

MixtureofMerges-MoE-4x7b-v10-MIXTRAL3 is a Mixture of Experts (MoE) model built from the following models using LazyMergekit:

- mistralai/Mistral-7B-Instruct-v0.3 (also the base model)
- Kukedlc/NeuralSynthesis-7B-v0.1
- mlabonne/AlphaMonarch-7B
- s3nh/SeverusWestLake-7B-DPO

🧩 Configuration

base_model: mistralai/Mistral-7B-Instruct-v0.3
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: mistralai/Mistral-7B-Instruct-v0.3
    positive_prompts:
      - "Analyze the ARC (Argument Reasoning Comprehension) question."
      - "Use logical reasoning and common sense."
      - "Identify the assumptions in this argument."
      - "Evaluate the validity of these assumptions."
      - "Provide an alternative explanation for this argument."
      - "Identify weaknesses in this argument."
      - "Detect any logical fallacies in this argument and specify them."
    negative_prompts:
      - "ignores key evidence"
      - "too general"
      - "focuses on irrelevant details"
      - "assumes unprovided information"
      - "relies on stereotypes"
  - source_model: Kukedlc/NeuralSynthesis-7B-v0.1
    positive_prompts:
      - "Answer with commonsense understanding and relevant general knowledge."
      - "Summarize this passage and explain the importance of the highlighted section."
      - "Compare two articles with different viewpoints and list their key arguments."
      - "Paraphrase this statement, altering the emotional tone but retaining the core meaning."
      - "Create an analogy to illustrate the main concept of this article."
    negative_prompts:
      - "overly simplistic"
      - "understates important points"
      - "ignores critical details"
      - "misses the question's nuance"
      - "takes the statement too literally"
  - source_model: mlabonne/AlphaMonarch-7B
    positive_prompts:
      - "Solve this math problem."
      - "Demonstrate strong mathematical capabilities."
      - "Solve for the given variable."
      - "Calculate the total cost for 12 apples at $0.50 each."
      - "Isolate the variable in the equation: 2x + 5 = 17."
      - "Show your work in solving this equation."
      - "Explain the formula used to solve the problem."
      - "Discuss why dividing by zero is impossible."
    negative_prompts:
      - "incorrect calculation"
      - "inaccurate answer"
      - "lacks creativity"
      - "assumes without proof"
      - "rushed calculation"
      - "confuses concepts"
      - "draws illogical conclusions"
      - "circular reasoning"
  - source_model: s3nh/SeverusWestLake-7B-DPO
    positive_prompts:
      - "Generate possible continuations for this scenario."
      - "Show understanding of everyday commonsense."
      - "Use contextual clues to predict the outcome."
      - "Continue the scenario in a cool and informal style."
      - "Introduce an unexpected yet plausible twist to the narrative."
      - "Depict a character's angry outburst in this scenario."
    negative_prompts:
      - "repetitive phrases"
      - "overuse of words"
      - "contradicts previous statements"
      - "unnatural dialogue"
      - "awkward phrasing"
      - "mismatched genre"

💻 Usage

!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "jsfs11/MixtureofMerges-MoE-4x7b-v10-MIXTRAL3"

tokenizer = AutoTokenizer.from_pretrained(model)

# Build a text-generation pipeline that loads the model in 4-bit precision.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template, then sample a completion.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
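
If your transformers version warns about passing load_in_4bit through model_kwargs, an explicit BitsAndBytesConfig is the equivalent, more current way to request 4-bit loading. The snippet below is a sketch of that loading path, reusing the same prompt and generation settings as the pipeline above:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "jsfs11/MixtureofMerges-MoE-4x7b-v10-MIXTRAL3"

# Explicit 4-bit quantization config instead of load_in_4bit in model_kwargs.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))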