---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- moe
- mergekit
- MoErges
base_model:
- mistralai/Mistral-7B-v0.3
pipeline_tag: text-classification
model-index:
- name: MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 16.97
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 8.87
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 0.3
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 1.23
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 7.85
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 4.21
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial
      name: Open LLM Leaderboard
---

Model Name: Marsouuu/MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial - Mixture of Experts (MoE)

Description:

This is a Mixture of Experts (MoE) model designed with 24-bit precision and built for four domains: mathematics, coding, storytelling, and general chat. Its expert layers route each input to the most relevant expert network, which lets the model adapt to different tasks efficiently.

Key Features

• Mathematics Expert: Fine-tuned for mathematical reasoning. This expert targets complex mathematical problems, numerical computations, and detailed explanations of mathematical concepts.
• Coding Expert: Trained on various programming languages and software development paradigms. This expert can help generate, debug, and explain code snippets.
• Storytelling Expert: Designed to assist in creative writing. This expert generates narratives, constructs dialogues, and offers story-building support across genres.
• General Chat Expert: Engages in everyday conversations with accurate, contextually appropriate responses. This expert adapts to different conversational tones, whether it’s casual chit-chat or formal assistance.
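
The routing idea described above can be illustrated with a minimal sketch. It assumes top-k softmax gating, a scheme commonly used in MoE layers; whether this model routes exactly this way is an assumption, and the expert names, gate scores, and function names below are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical labels for the four expert domains named in this card.
EXPERTS = ["mathematics", "coding", "storytelling", "general_chat"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k=2):
    """Select the k highest-scoring experts and renormalize their weights."""
    if not 1 <= k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def mix_expert_outputs(expert_outputs, routing):
    """Weighted sum of the output vectors of the selected experts."""
    mixed = [0.0] * len(expert_outputs[0])
    for idx, weight in routing:
        for d, value in enumerate(expert_outputs[idx]):
            mixed[d] += weight * value
    return mixed


# Made-up gate scores for a single token (illustration only).
routing = route_top_k([0.2, 2.1, -0.7, 1.0], k=2)
for idx, weight in routing:
    print(f"{EXPERTS[idx]}: {weight:.3f}")
```

Only the selected experts contribute to the mixed output, which is why a gated MoE layer does not need to evaluate every expert for every token.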

Technical Specifications

• Model Architecture: Mixture of Experts (MoE) with a gating mechanism that routes inputs to the most relevant expert networks.
• Domains:
  • Mathematics: Advanced reasoning and problem-solving.
  • Coding: Programming support across multiple languages.
  • Storytelling: Creative writing and narrative generation.
  • General Chat: Versatile dialogue handling for various conversational contexts.
• Training Data: The model was trained on diverse datasets that cover each expert domain, to support robustness and versatility.
• Framework: Developed using [Framework name, e.g. PyTorch, TensorFlow], optimized for the MoE architecture with gated routing.

Usage

This model can be used for a wide range of applications:

• Educational Tools: Assisting with mathematical problems, coding exercises, and creative writing tasks.
• Software Development: Providing coding suggestions, code completion, and debugging support.
• Creative Writing: Generating stories, dialogues, and narrative content.
• Conversational Agents: Implementing chatbots with versatile conversational abilities.

Limitations

• The model may occasionally generate responses that are not entirely contextually appropriate, especially in cases requiring highly specialized domain knowledge.
• Despite its 24-bit precision, the model may not perform well on extremely large datasets or on tasks that require higher precision levels.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Marsouuu__MistralBase-4x7B-MoE-ECE-PRYMMAL-Martial)

|      Metric       |Value|
|-------------------|----:|
|Avg.               | 6.57|
|IFEval (0-Shot)    |16.97|
|BBH (3-Shot)       | 8.87|
|MATH Lvl 5 (4-Shot)| 0.30|
|GPQA (0-shot)      | 1.23|
|MuSR (0-shot)      | 7.85|
|MMLU-PRO (5-shot)  | 4.21|
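
As a quick sanity check on the table, the short sketch below recomputes the "Avg." row. It assumes the average is the unweighted mean of the six benchmark scores, which is consistent with the reported value; the variable names are illustrative.

```python
# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 16.97,
    "BBH (3-Shot)": 8.87,
    "MATH Lvl 5 (4-Shot)": 0.30,
    "GPQA (0-shot)": 1.23,
    "MuSR (0-shot)": 7.85,
    "MMLU-PRO (5-shot)": 4.21,
}

# Assumption: "Avg." is the plain arithmetic mean of the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 6.57"
```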