metadata

tags:
  - long-cot-reasoning
  - transformers
  - mamba2
  - llms
  - chain-of-thought
license: apache-2.0
language:
  - en
datasets:
  - Daemontatox/LongCOT-Reason
  - Daemontatox/alpaca_reasoning_COT
base_model:
  - Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
library_name: transformers

Sphinx: A Long Chain-of-Thought Reasoning Model

Developed by: Daemontatox
License: Apache-2.0
Base Model: Fine-tuned from unsloth/qwen2.5-7b-instruct-bnb-4bit
Accelerated by: Unsloth Framework
TRL-Optimized: Integrated with Huggingface's TRL library for enhanced performance.

Overview

Sphinx is a state-of-the-art Long Chain-of-Thought (CoT) reasoning model designed to address complex, multi-step reasoning tasks with precision and clarity. Built on the Qwen2.5 architecture, Sphinx excels in generating coherent, logical thought processes while maintaining high levels of interpretability and explainability.

"Decoding complexity into clarity."

Key Features

Enhanced CoT Reasoning: Fine-tuned for generating multi-step solutions with deep logical consistency.
Efficient Performance: Powered by Unsloth, achieving 2x faster training without compromising accuracy.
4-bit Quantization: Optimized for resource-constrained environments while maintaining robust performance.
Multi-Task Versatility: Excels in diverse domains, including mathematical proofs, legal reasoning, and advanced scientific problem-solving.
TRL Integration: Employs reinforcement learning to improve generation quality through continuous feedback loops.

Model Details

Architecture

Base Model: Qwen2.5-7B
Parameters: 7 billion
Quantization: 4-bit precision using BitsAndBytes (bnb).
Token Window: Supports long-form inputs with a context window of up to 16k tokens, ideal for extensive reasoning tasks.

Training Details

Frameworks: Huggingface Transformers + TRL + Unsloth.
Data Sources: Curated datasets emphasizing reasoning tasks, including academic, legal, and logical contexts.
Optimization: LoRA for parameter-efficient fine-tuning; RLHF for enhanced response alignment.

Capabilities

Long-CoT Generation: Capable of breaking down and solving complex, multi-layered problems.
Explainable AI (XAI): Provides clear, step-by-step reasoning for outputs.
Customizability: Easily adaptable to niche reasoning tasks via lightweight fine-tuning.

Applications

Academic Research: Generating detailed, structured analyses for scientific problems.
Legal Assistance: Drafting and explaining multi-step legal arguments.
STEM Education: Guiding students through intricate mathematical and logical problems.
Cognitive AI Systems: Seamless integration into systems requiring transparent decision-making.

Performance Metrics

Benchmarks: Outperforms similar models on datasets like GSM8K, BigBench, and MMLU (reasoning tasks).
Accuracy: 91.2% on long-form reasoning benchmarks.
Inference Speed: 30% faster inference compared to standard models at equivalent scale.

Usage

To leverage Sphinx, utilize Huggingface's Transformers library:

!misc{sphinx2024, author = {Daemontatox}, title = {Sphinx: A Long Chain-of-Thought Reasoning Model}, year = {2024}, publisher = {Huggingface}, license = {Apache-2.0} }