ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
Overview
ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix is a powerful AI model built through model stock merging using MergeKit. It brings together some of the best models available on Hugging Face, ensuring strong performance in a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction-following.
This model was created by merging high-quality foundational and fine-tuned models to create an optimized blended architecture that retains the strengths of each contributing model.
Merge Details
- Merge Method:
model_stock
- Base Model:
mergekit-community/L3.1-Athena-d-8B
- Dtype:
bfloat16
- Tokenizer Source:
mergekit-community/L3.1-Athena-d-8B
Models Merged
The following models contributed to this fusion:
Pedro13543/mega_blend_model
- A well-balanced blend of roleplay and instruction-tuned Llama-3.1 variants.Skywork/Skywork-o1-Open-Llama-3.1-8B
- Optimized for reasoning and slow-thinking capabilities.Undi95/Meta-Llama-3.1-8B-Claude
- Fine-tuned on Claude Opus/Sonnet data, improving response depth and conversational engagement.mergekit-community/good_mix_model_Stock
- A diverse mixture including RP-focused and knowledge-heavy datasets.
Configuration
name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model: mergekit-community/L3.1-Athena-d-8B
dtype: bfloat16
merge_method: model_stock
models:
- model: Pedro13543/mega_blend_model
- model: Skywork/Skywork-o1-Open-Llama-3.1-8B
- model: Undi95/Meta-Llama-3.1-8B-Claude
- model: mergekit-community/good_mix_model_Stock
tokenizer_source: mergekit-community/L3.1-Athena-d-8B
Features & Improvements
🔹 Advanced Reasoning & Thoughtfulness - Thanks to Skywork-o1
integration, this model excels in logical thinking and problem-solving.
🔹 Enhanced Conversational Depth - The inclusion of Meta-Llama-3.1-8B-Claude
adds better response structuring, making it more engaging in dialogue.
🔹 Versatile Roleplay & Creativity - Leveraging mega_blend_model
and good_mix_model_Stock
, the model supports immersive roleplaying and storytelling.
🔹 Strong Instruction Following - Trained on various instruction datasets to provide clear, informative, and helpful responses.
Use Cases
- Chat & Roleplay - Supports natural, engaging, and dynamic conversational flow.
- Programming & Code Generation - Provides reliable code completions and debugging suggestions.
- Creative Writing - Generates compelling stories, character dialogues, and immersive text.
- Educational Assistance - Helps explain complex topics and answer academic questions.
- Logic & Problem-Solving - Can handle reasoning-based and structured thought processes.
🛠 How to Use
🔥 Ollama (Quick Inference)
You can run the model using Ollama for direct testing:
ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
🤗 Hugging Face Transformers (Python)
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"
# Load tokenizer & model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Initialize text generation pipeline
text_generator = pipeline(
"text-generation",
model=model,
tokenizer=tokenizer,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Example prompt
prompt = "Describe the significance of AI ethics in modern technology."
# Generate output
outputs = text_generator(
prompt,
max_new_tokens=200,
do_sample=True,
temperature=0.7,
top_k=50,
top_p=0.95
)
print(outputs[0]["generated_text"])
Model Alignment & Ethics
⚠️ Uncensored Use: This model does not apply strict moderation. Users should implement appropriate safety filters before deployment.
⚠️ Responsibility Notice: You are responsible for the outputs generated by this model. It is recommended to apply ethical safeguards and content moderation when integrating this model into applications.
📜 License: Governed by the Meta Llama 3.1 Community License Agreement.
Feedback & Contributions
We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 26.79 |
IFEval (0-Shot) | 63.01 |
BBH (3-Shot) | 31.39 |
MATH Lvl 5 (4-Shot) | 27.95 |
GPQA (0-shot) | 3.69 |
MuSR (0-shot) | 6.90 |
MMLU-PRO (5-shot) | 27.82 |
- Downloads last month
- 19
Model tree for ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
Evaluation results
- strict accuracy on IFEval (0-Shot)Open LLM Leaderboard63.010
- normalized accuracy on BBH (3-Shot)Open LLM Leaderboard31.390
- exact match on MATH Lvl 5 (4-Shot)Open LLM Leaderboard27.950
- acc_norm on GPQA (0-shot)Open LLM Leaderboard3.690
- acc_norm on MuSR (0-shot)Open LLM Leaderboard6.900
- accuracy on MMLU-PRO (5-shot)test set Open LLM Leaderboard27.820