tags:
- Reasoning
- text-generation-inference
---

![RA_REASONER](./image.webp)

# **RA_Reasoner 2.0**

## **Model Details**

**Developed by:** [Daemontatox](#)
**License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
**Base Model:** [tiiuae/Falcon3-10B-Instruct](https://huggingface.co/tiiuae/Falcon3-10B-Instruct)

This model is fine-tuned from Falcon3-10B-Instruct, leveraging training optimizations that enhance its reasoning and instruction-following capabilities. It was trained 2x faster using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
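A minimal quick-start sketch for loading and querying the model with the `transformers` library is shown below. The repo id is a placeholder assumption (substitute the actual Hub id of this model), and the dtype and generation settings are illustrative rather than a recommended configuration.

```python
# Illustrative quick-start; the repo id below is a placeholder assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Daemontatox/RA_Reasoner"  # placeholder: replace with the actual Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
    device_map="auto",
)

# Ask a multi-step reasoning question and request step-by-step output.
messages = [
    {"role": "user", "content": "A train leaves at 09:40 and arrives at 13:05. "
                                "How long is the trip? Reason step by step."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```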
---

## **Training Details**

- **Frameworks Used:** Unsloth, Hugging Face TRL
- **Fine-Tuning Focus:** Emphasis on reasoning, logic-based tasks, and instruction comprehension.
- **Dataset:** Includes examples from [Daemontatox/Deepthinking-COT](https://huggingface.co/datasets/Daemontatox/Deepthinking-COT).
- **Optimization:** Significant speedup during fine-tuning while maintaining model quality.

Further details on hyperparameters and fine-tuning methodology will be added in future updates.
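Until then, the sketch below illustrates a typical Unsloth + TRL supervised fine-tuning setup of the kind described above. It is an assumption-laden example, not the published recipe: the LoRA settings, sequence length, hyperparameters, and the `text` column name are all placeholders.

```python
# Illustrative Unsloth + TRL SFT setup; all hyperparameters, LoRA settings,
# and the dataset column name are assumptions, not the actual training recipe.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the base model through Unsloth's optimized loader (4-bit to fit a single GPU).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="tiiuae/Falcon3-10B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of the weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Chain-of-thought style training examples.
dataset = load_dataset("Daemontatox/Deepthinking-COT", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",      # assumed column name
        max_seq_length=4096,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```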
---

## **Intended Use**

This model is intended for **research and development** in text generation, reasoning tasks, and instruction-following applications.

### **Key Features:**
- Enhanced reasoning capabilities for multi-step logical problems.
- Robust instruction-following for complex tasks.
- Fine-tuned for Chain-of-Thought (CoT) reasoning and inference.

### **Applications:**
- Research on reasoning-based AI systems.
- Tasks requiring logical deduction, such as question answering and problem-solving.
- General text generation with a focus on nuanced understanding.

---

## **Limitations and Warnings**

- This model is not designed for real-time or production-critical tasks.
- Outputs may vary based on input specificity and complexity.
- Users are responsible for ensuring ethical use and compliance with applicable regulations.

---

## **Acknowledgments**

- Base model: [tiiuae/Falcon3-10B-Instruct](https://huggingface.co/tiiuae/Falcon3-10B-Instruct)
- Training acceleration powered by [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
- Dataset contributions: [Daemontatox/Deepthinking-COT](https://huggingface.co/datasets/Daemontatox/Deepthinking-COT).

---