--- base_model: - prithivMLmods/Llama-3.1-8B-Open-SFT tags: - text-generation-inference - transformers - unsloth - Llama3 - trl - COT - Reasoning license: apache-2.0 language: - en datasets: - Daemontatox/LongCOT-Reason metrics: - accuracy - character - competition_math - code_eval library_name: transformers pipeline_tag: text-generation model-index: - name: AetherDrake-SFT results: - task: type: text-generation name: Text Generation dataset: name: IFEval (0-Shot) type: HuggingFaceH4/ifeval args: num_few_shot: 0 metrics: - type: inst_level_strict_acc and prompt_level_strict_acc value: 48.13 name: strict accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: BBH (3-Shot) type: BBH args: num_few_shot: 3 metrics: - type: acc_norm value: 27.14 name: normalized accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MATH Lvl 5 (4-Shot) type: hendrycks/competition_math args: num_few_shot: 4 metrics: - type: exact_match value: 14.65 name: exact match source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GPQA (0-shot) type: Idavidrein/gpqa args: num_few_shot: 0 metrics: - type: acc_norm value: 9.4 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MuSR (0-shot) type: TAUR-Lab/MuSR args: num_few_shot: 0 metrics: - type: acc_norm value: 9.97 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU-PRO (5-shot) type: TIGER-Lab/MMLU-Pro config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 27.77 name: accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/AetherDrake-SFT name: Open LLM Leaderboard --- ![image](./image.webp) # AetherDrake-SFT - **Developed by:** Daemontatox - **License:** Apache 2.0 - **Finetuned Using:** [Unsloth](https://github.com/unslothai/unsloth), Hugging Face Transformers, and TRL Library ## Model Overview The **AetherDrake-SFT Model** is an advanced AI system optimized for logical reasoning, multi-step problem-solving, and decision-making tasks. Designed with efficiency and accuracy in mind, it employs a structured system prompt to ensure high-quality answers through a transparent and iterative thought process. ### System Prompt and Workflow This model operates using an innovative reasoning framework structured around the following steps: 1. **Initial Thought:** The model uses `` tags to reason step-by-step and craft its best possible response. Example: 2. **Self-Critique:** It evaluates its initial response within `` tags, focusing on: - **Accuracy:** Is it factually correct and verifiable? - **Clarity:** Is it clear and free of ambiguity? - **Completeness:** Does it fully address the request? - **Improvement:** What can be enhanced? Example: 3. **Revision:** Based on the critique, the model refines its response within `` tags. Example: 4. **Final Response:** The revised response is presented clearly within `` tags. Example: 5. **Tag Innovation:** When needed, the model creates and defines new tags for better structuring or clarity, ensuring consistent usage. Example: ### Key Features - **Structured Reasoning:** Transparent, multi-step approach for generating and refining answers. - **Self-Improvement:** Built-in critique and revision ensure continuous response enhancement. - **Clarity and Adaptability:** Tagging system provides organized, adaptable responses tailored to user needs. - **Creative Flexibility:** Supports dynamic problem-solving with the ability to introduce new tags and concepts. --- ## Use Cases The model is designed for various domains, including: 1. **Research and Analysis:** Extracting insights and providing structured explanations. 2. **Education:** Assisting with tutoring by breaking down complex problems step-by-step. 3. **Problem-Solving:** Offering logical and actionable solutions for multi-step challenges. 4. **Content Generation:** Producing clear, well-organized creative or professional content. --- ## Training Details - **Frameworks:** - [Unsloth](https://github.com/unslothai/unsloth) for accelerated training. - Hugging Face Transformers and the TRL library for reinforcement learning with human feedback (RLHF). - **Dataset:** Finetuned on diverse reasoning-focused tasks, including logical puzzles, mathematical problems, and commonsense reasoning scenarios. - **Hardware Efficiency:** - Trained with bnb-4bit precision for reduced memory usage. - Optimized training pipeline achieving 2x faster development cycles. --- ## Limitations - **Arithmetic Equations** Model might hallucinate in the middle of thinking and using Arithmetic Equations as it wasn't trained on latex equations. - **Very Complex problems** Model has a tendency to get side tracked when asked long and complex problems and might answer with uncertainty. --- ## Ethical Considerations - **Transparency:** Responses are structured for verifiability through tagging. - **Bias Mitigation:** Includes self-critique to minimize biases and ensure fairness. - **Safe Deployment:** Users are encouraged to evaluate outputs to prevent harm or misinformation. --- ## License This model is distributed under the Apache 2.0 license, allowing users to use, modify, and share it in compliance with the license terms. --- ## Acknowledgments Special thanks to: - [Unsloth](https://github.com/unslothai/unsloth) for accelerated training workflows. - Hugging Face for their powerful tools and libraries. --- Experience the **AetherDrake-SFT**, leveraging its structured reasoning and self-improvement capabilities for any task requiring advanced AI reasoning. # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__AetherDrake-SFT-details)! Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox/AetherDrake-SFT)! | Metric |% Value| |-------------------|------:| |Avg. | 22.84| |IFEval (0-Shot) | 48.13| |BBH (3-Shot) | 27.14| |MATH Lvl 5 (4-Shot)| 14.65| |GPQA (0-shot) | 9.40| |MuSR (0-shot) | 9.97| |MMLU-PRO (5-shot) | 27.77|