ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix

Overview

ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix is an 8B-parameter language model built with MergeKit using the Model Stock merge method. It brings together several well-regarded Llama 3.1 8B models from Hugging Face, aiming for strong performance across a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction following.

Model Fusion

This model merges high-quality foundation and fine-tuned models into a single blended checkpoint designed to retain the strengths of each contributing model.

Merge Details

Models Merged

The following models contributed to this fusion:

  • mergekit-community/L3.1-Athena-d-8B (base model and tokenizer source)
  • Pedro13543/mega_blend_model
  • Skywork/Skywork-o1-Open-Llama-3.1-8B
  • Undi95/Meta-Llama-3.1-8B-Claude
  • mergekit-community/good_mix_model_Stock

Configuration

name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model: mergekit-community/L3.1-Athena-d-8B
dtype: bfloat16
merge_method: model_stock
models:
  - model: Pedro13543/mega_blend_model
  - model: Skywork/Skywork-o1-Open-Llama-3.1-8B
  - model: Undi95/Meta-Llama-3.1-8B-Claude
  - model: mergekit-community/good_mix_model_Stock
tokenizer_source: mergekit-community/L3.1-Athena-d-8B
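
The same recipe can be reproduced locally with MergeKit. The snippet below is a minimal sketch that assumes the YAML above has been saved as config.yaml and uses MergeKit's documented Python entry points (MergeConfiguration, MergeOptions, run_merge); exact option names can vary between MergeKit versions.

import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the merge recipe shown above (assumed to be saved as config.yaml)
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the model_stock merge and write the merged weights to ./AthenaSky-MegaMix
run_merge(
    merge_config,
    out_path="./AthenaSky-MegaMix",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is available
        copy_tokenizer=True,             # copy the tokenizer from tokenizer_source
        low_cpu_memory=False,
    ),
)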

Features & Improvements

🔹 Advanced Reasoning & Thoughtfulness - Thanks to Skywork-o1 integration, this model excels in logical thinking and problem-solving.

🔹 Enhanced Conversational Depth - The inclusion of Meta-Llama-3.1-8B-Claude adds better response structuring, making it more engaging in dialogue.

🔹 Versatile Roleplay & Creativity - Leveraging mega_blend_model and good_mix_model_Stock, the model supports immersive roleplaying and storytelling.

🔹 Strong Instruction Following - Inherits instruction-following behavior from contributors fine-tuned on diverse instruction datasets, producing clear, informative, and helpful responses.

Use Cases

  • Chat & Roleplay - Supports natural, engaging, and dynamic conversational flow.
  • Programming & Code Generation - Provides reliable code completions and debugging suggestions.
  • Creative Writing - Generates compelling stories, character dialogues, and immersive text.
  • Educational Assistance - Helps explain complex topics and answer academic questions.
  • Logic & Problem-Solving - Can handle reasoning-based and structured thought processes.

🛠 How to Use

🔥 Ollama (Quick Inference)

You can run the model using Ollama for direct testing:

ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
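
If you prefer calling Ollama from code rather than the CLI, the official ollama Python client exposes a chat API. This is a minimal sketch, assuming Ollama is running locally and the model tag above has already been pulled:

import ollama

# Chat with the locally served model (same tag as the CLI command above)
response = ollama.chat(
    model="hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix",
    messages=[{"role": "user", "content": "Suggest three creative story hooks."}],
)

print(response["message"]["content"])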

🤗 Hugging Face Transformers (Python)

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"

# Load tokenizer & model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, 
    torch_dtype=torch.bfloat16, 
    device_map="auto"
)

# Initialize a text-generation pipeline with the preloaded model and tokenizer
# (dtype and device placement are already set above, so they are not repeated here)
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Example prompt
prompt = "Describe the significance of AI ethics in modern technology."

# Generate output
outputs = text_generator(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

print(outputs[0]["generated_text"])
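
Because the merge inherits its tokenizer from L3.1-Athena-d-8B (a Llama 3.1 Instruct derivative), chat-style prompting should work through the tokenizer's built-in chat template. The sketch below assumes the merged tokenizer ships with that template:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful, thoughtful assistant."},
    {"role": "user", "content": "Outline a one-month plan for learning Python."},
]

# Render the conversation with the tokenizer's chat template
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)

# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))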

Model Alignment & Ethics

⚠️ Uncensored Use: This model does not apply strict moderation. Users should implement appropriate safety filters before deployment.

⚠️ Responsibility Notice: You are responsible for the outputs generated by this model. It is recommended to apply ethical safeguards and content moderation when integrating this model into applications.
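
As one illustration of such a safeguard, generated text can be passed through a lightweight post-generation filter before it reaches users. This is only a sketch with a hypothetical blocklist; a production system should rely on a dedicated moderation model or service:

# Hypothetical blocklist for illustration only; replace with a real
# moderation model or API before deploying.
BLOCKED_TERMS = {"example_disallowed_phrase"}

def moderate(text: str) -> str:
    """Return the text unchanged, or a placeholder if it trips the filter."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "[response withheld by content filter]"
    return text

# Example: wrap the pipeline output from the Transformers example above
# print(moderate(outputs[0]["generated_text"]))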

📜 License: Governed by the Meta Llama 3.1 Community License Agreement.

Feedback & Contributions

We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.


**ZeroXClem Team | 2025**

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|------:|
| Avg.                | 26.79 |
| IFEval (0-Shot)     | 63.01 |
| BBH (3-Shot)        | 31.39 |
| MATH Lvl 5 (4-Shot) | 27.95 |
| GPQA (0-shot)       |  3.69 |
| MuSR (0-shot)       |  6.90 |
| MMLU-PRO (5-shot)   | 27.82 |