---
language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - merge
  - mergekit
  - lazymergekit
  - model_stock
  - ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model:
  - Pedro13543/mega_blend_model
  - Skywork/Skywork-o1-Open-Llama-3.1-8B
  - Undi95/Meta-Llama-3.1-8B-Claude
  - mergekit-community/good_mix_model_Stock
  - mergekit-community/L3.1-Athena-d-8B
pipeline_tag: text-generation
model-index:
  - name: Llama-3.1-8B-AthenaSky-MegaMix
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: HuggingFaceH4/ifeval
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 63.01
            name: strict accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: BBH
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 31.39
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: hendrycks/competition_math
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 27.95
            name: exact match
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 3.69
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 6.9
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 27.82
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
          name: Open LLM Leaderboard
---

ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix

Overview

ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix is an 8B-parameter language model built through model_stock merging with MergeKit. It combines several well-regarded Llama 3.1 fine-tunes from Hugging Face, targeting strong performance across a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction following.

Model Fusion

This model was created by merging a shared base model with several high-quality fine-tuned variants, producing a blended set of weights that retains the strengths of each contributing model.

Merge Details

Models Merged

The following models contributed to this fusion:

  • Pedro13543/mega_blend_model
  • Skywork/Skywork-o1-Open-Llama-3.1-8B
  • Undi95/Meta-Llama-3.1-8B-Claude
  • mergekit-community/good_mix_model_Stock
  • mergekit-community/L3.1-Athena-d-8B (merge base)

Configuration

name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model: mergekit-community/L3.1-Athena-d-8B
dtype: bfloat16
merge_method: model_stock
models:
  - model: Pedro13543/mega_blend_model
  - model: Skywork/Skywork-o1-Open-Llama-3.1-8B
  - model: Undi95/Meta-Llama-3.1-8B-Claude
  - model: mergekit-community/good_mix_model_Stock
tokenizer_source: mergekit-community/L3.1-Athena-d-8B
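The model_stock method blends each fine-tune's weights back toward the base model. As a rough illustration only (plain delta averaging with a toy `alpha`, not MergeKit's actual geometry-aware algorithm; `toy_model_stock` is a hypothetical name):

```python
import numpy as np

def toy_model_stock(base, models, alpha=0.5):
    """Simplified sketch: average the fine-tuned models' weights per
    tensor, then interpolate toward the base. MergeKit's real
    model_stock chooses the interpolation factor from weight-space
    angles; here alpha is fixed for illustration."""
    merged = {}
    for name, base_w in base.items():
        avg = np.mean([m[name] for m in models], axis=0)
        merged[name] = (1 - alpha) * base_w + alpha * avg
    return merged

base = {"w": np.array([1.0, 1.0])}
models = [{"w": np.array([2.0, 0.0])}, {"w": np.array([0.0, 2.0])}]
print(toy_model_stock(base, models)["w"])  # blend of base and model average
```

The real merge operates over every tensor in the checkpoints, which is what the YAML config above drives.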

Features & Improvements

🔹 Advanced Reasoning & Thoughtfulness - Thanks to Skywork-o1 integration, this model excels in logical thinking and problem-solving.

🔹 Enhanced Conversational Depth - The inclusion of Meta-Llama-3.1-8B-Claude adds better response structuring, making it more engaging in dialogue.

🔹 Versatile Roleplay & Creativity - Leveraging mega_blend_model and good_mix_model_Stock, the model supports immersive roleplaying and storytelling.

🔹 Strong Instruction Following - Inherits instruction-following behavior from its instruction-tuned parent models, yielding clear, informative, and helpful responses.

Use Cases

  • Chat & Roleplay - Supports natural, engaging, and dynamic conversational flow.
  • Programming & Code Generation - Provides reliable code completions and debugging suggestions.
  • Creative Writing - Generates compelling stories, character dialogues, and immersive text.
  • Educational Assistance - Helps explain complex topics and answer academic questions.
  • Logic & Problem-Solving - Can handle reasoning-based and structured thought processes.

🛠 How to Use

🔥 Ollama (Quick Inference)

You can run the model using Ollama for direct testing:

ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix

🤗 Hugging Face Transformers (Python)

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"

# Load tokenizer & model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)

# Initialize text generation pipeline
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example prompt
prompt = "Describe the significance of AI ethics in modern technology."

# Generate output
outputs = text_generator(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

print(outputs[0]["generated_text"])
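The `temperature`, `top_k`, and `top_p` settings above control how sharply sampling concentrates on high-probability tokens. A self-contained sketch of the temperature effect on a toy logit distribution (no model download needed; the function name is illustrative):

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature before softmax: lower temperature
    sharpens the distribution, higher temperature flattens it."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.7)
flat = softmax_with_temperature(logits, temperature=1.5)
print(sharp.round(3), flat.round(3))
```

With `temperature=0.7` as in the pipeline call, the top token gets noticeably more probability mass than at higher temperatures; `top_k` and `top_p` then truncate the tail of this distribution before sampling.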

Model Alignment & Ethics

⚠️ Uncensored Use: This model does not apply strict moderation. Users should implement appropriate safety filters before deployment.

⚠️ Responsibility Notice: You are responsible for the outputs generated by this model. It is recommended to apply ethical safeguards and content moderation when integrating this model into applications.

📜 License: Governed by the Meta Llama 3.1 Community License Agreement.

Feedback & Contributions

We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.


**ZeroXClem Team | 2025** ZXC

Open LLM Leaderboard Evaluation Results

Detailed results can be found on the Open LLM Leaderboard: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix

| Metric              | Value |
|---------------------|------:|
| Avg.                | 26.79 |
| IFEval (0-Shot)     | 63.01 |
| BBH (3-Shot)        | 31.39 |
| MATH Lvl 5 (4-Shot) | 27.95 |
| GPQA (0-shot)       |  3.69 |
| MuSR (0-shot)       |  6.90 |
| MMLU-PRO (5-shot)   | 27.82 |
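The reported average is the plain mean of the six benchmark scores, which can be checked directly:

```python
# Scores as listed in the table above
scores = {
    "IFEval": 63.01, "BBH": 31.39, "MATH Lvl 5": 27.95,
    "GPQA": 3.69, "MuSR": 6.90, "MMLU-PRO": 27.82,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # → 26.79
```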