AetherDrake-SFT / README.md
Daemontatox's picture
Adding Evaluation Results (#2)
a34d9b7 verified
---
base_model:
- prithivMLmods/Llama-3.1-8B-Open-SFT
tags:
- text-generation-inference
- transformers
- unsloth
- Llama3
- trl
- COT
- Reasoning
license: apache-2.0
language:
- en
datasets:
- Daemontatox/LongCOT-Reason
metrics:
- accuracy
- character
- competition_math
- code_eval
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: AetherDrake-SFT
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 48.13
name: strict accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 27.14
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 14.65
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 9.4
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 9.97
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 27.77
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT
name: Open LLM Leaderboard
---
![image](./image.webp)
# AetherDrake-SFT
- **Developed by:** Daemontatox
- **License:** Apache 2.0
- **Finetuned Using:** [Unsloth](https://github.com/unslothai/unsloth), Hugging Face Transformers, and TRL Library
## Model Overview
The **AetherDrake-SFT Model** is an advanced AI system optimized for logical reasoning, multi-step problem-solving, and decision-making tasks. Designed with efficiency and accuracy in mind, it employs a structured system prompt to ensure high-quality answers through a transparent and iterative thought process.
### System Prompt and Workflow
This model operates using an innovative reasoning framework structured around the following steps:
1. **Initial Thought:**
The model uses `<Thinking>` tags to reason step-by-step and craft its best possible response.
Example:
2. **Self-Critique:**
It evaluates its initial response within `<Critique>` tags, focusing on:
- **Accuracy:** Is it factually correct and verifiable?
- **Clarity:** Is it clear and free of ambiguity?
- **Completeness:** Does it fully address the request?
- **Improvement:** What can be enhanced?
Example:
3. **Revision:**
Based on the critique, the model refines its response within `<Revising>` tags.
Example:
4. **Final Response:**
The revised response is presented clearly within `<Final>` tags.
Example:
5. **Tag Innovation:**
When needed, the model creates and defines new tags for better structuring or clarity, ensuring consistent usage.
Example:
### Key Features
- **Structured Reasoning:** Transparent, multi-step approach for generating and refining answers.
- **Self-Improvement:** Built-in critique and revision ensure continuous response enhancement.
- **Clarity and Adaptability:** Tagging system provides organized, adaptable responses tailored to user needs.
- **Creative Flexibility:** Supports dynamic problem-solving with the ability to introduce new tags and concepts.
---
## Use Cases
The model is designed for various domains, including:
1. **Research and Analysis:** Extracting insights and providing structured explanations.
2. **Education:** Assisting with tutoring by breaking down complex problems step-by-step.
3. **Problem-Solving:** Offering logical and actionable solutions for multi-step challenges.
4. **Content Generation:** Producing clear, well-organized creative or professional content.
---
## Training Details
- **Frameworks:**
- [Unsloth](https://github.com/unslothai/unsloth) for accelerated training.
- Hugging Face Transformers and the TRL library for reinforcement learning with human feedback (RLHF).
- **Dataset:** Finetuned on diverse reasoning-focused tasks, including logical puzzles, mathematical problems, and commonsense reasoning scenarios.
- **Hardware Efficiency:**
- Trained with bnb-4bit precision for reduced memory usage.
- Optimized training pipeline achieving 2x faster development cycles.
---
## Limitations
- **Arithmetic Equations** Model might hallucinate in the middle of thinking and using Arithmetic Equations as it wasn't trained on latex equations.
- **Very Complex problems** Model has a tendency to get side tracked when asked long and complex problems and might answer with uncertainty.
---
## Ethical Considerations
- **Transparency:** Responses are structured for verifiability through tagging.
- **Bias Mitigation:** Includes self-critique to minimize biases and ensure fairness.
- **Safe Deployment:** Users are encouraged to evaluate outputs to prevent harm or misinformation.
---
## License
This model is distributed under the Apache 2.0 license, allowing users to use, modify, and share it in compliance with the license terms.
---
## Acknowledgments
Special thanks to:
- [Unsloth](https://github.com/unslothai/unsloth) for accelerated training workflows.
- Hugging Face for their powerful tools and libraries.
---
Experience the **AetherDrake-SFT**, leveraging its structured reasoning and self-improvement capabilities for any task requiring advanced AI reasoning.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__AetherDrake-SFT-details)!
Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox/AetherDrake-SFT)!
| Metric |% Value|
|-------------------|------:|
|Avg. | 22.84|
|IFEval (0-Shot) | 48.13|
|BBH (3-Shot) | 27.14|
|MATH Lvl 5 (4-Shot)| 14.65|
|GPQA (0-shot) | 9.40|
|MuSR (0-shot) | 9.97|
|MMLU-PRO (5-shot) | 27.77|