Daemontatox commited on
Commit
d41acf4
1 Parent(s): 14dbb63

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +118 -6
README.md CHANGED
@@ -9,14 +9,126 @@ tags:
9
  license: apache-2.0
10
  language:
11
  - en
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
  ---
 
 
13
 
14
- # Uploaded model
15
 
16
- - **Developed by:** Daemontatox
17
- - **License:** apache-2.0
18
- - **Finetuned from model :** Daemontatox/RA_Reasoner
19
 
20
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9
  license: apache-2.0
10
  language:
11
  - en
12
+ pipeline_tag: text-generation
13
+ library_name: transformers
14
+ model-index:
15
+ - name: RA_Reasoner2.0
16
+ results:
17
+ - task:
18
+ type: text-generation
19
+ name: Text Generation
20
+ dataset:
21
+ name: IFEval (0-Shot)
22
+ type: HuggingFaceH4/ifeval
23
+ args:
24
+ num_few_shot: 0
25
+ metrics:
26
+ - type: inst_level_strict_acc and prompt_level_strict_acc
27
+ value: 55.92
28
+ name: strict accuracy
29
+ source:
30
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
31
+ name: Open LLM Leaderboard
32
+ - task:
33
+ type: text-generation
34
+ name: Text Generation
35
+ dataset:
36
+ name: BBH (3-Shot)
37
+ type: BBH
38
+ args:
39
+ num_few_shot: 3
40
+ metrics:
41
+ - type: acc_norm
42
+ value: 43.07
43
+ name: normalized accuracy
44
+ source:
45
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
46
+ name: Open LLM Leaderboard
47
+ - task:
48
+ type: text-generation
49
+ name: Text Generation
50
+ dataset:
51
+ name: MATH Lvl 5 (4-Shot)
52
+ type: hendrycks/competition_math
53
+ args:
54
+ num_few_shot: 4
55
+ metrics:
56
+ - type: exact_match
57
+ value: 20.09
58
+ name: exact match
59
+ source:
60
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
61
+ name: Open LLM Leaderboard
62
+ - task:
63
+ type: text-generation
64
+ name: Text Generation
65
+ dataset:
66
+ name: GPQA (0-shot)
67
+ type: Idavidrein/gpqa
68
+ args:
69
+ num_few_shot: 0
70
+ metrics:
71
+ - type: acc_norm
72
+ value: 10.85
73
+ name: acc_norm
74
+ source:
75
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
76
+ name: Open LLM Leaderboard
77
+ - task:
78
+ type: text-generation
79
+ name: Text Generation
80
+ dataset:
81
+ name: MuSR (0-shot)
82
+ type: TAUR-Lab/MuSR
83
+ args:
84
+ num_few_shot: 0
85
+ metrics:
86
+ - type: acc_norm
87
+ value: 7.51
88
+ name: acc_norm
89
+ source:
90
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
91
+ name: Open LLM Leaderboard
92
+ - task:
93
+ type: text-generation
94
+ name: Text Generation
95
+ dataset:
96
+ name: MMLU-PRO (5-shot)
97
+ type: TIGER-Lab/MMLU-Pro
98
+ config: main
99
+ split: test
100
+ args:
101
+ num_few_shot: 5
102
+ metrics:
103
+ - type: acc
104
+ value: 36.67
105
+ name: accuracy
106
+ source:
107
+ url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Daemontatox/RA_Reasoner
108
+ name: Open LLM Leaderboard
109
  ---
110
+ ![RA_REASONER](./image.webp)
111
+ # Uploaded Model
112
 
113
+ **Developed by:** Daemontatox
114
 
115
+ **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
 
116
 
117
+ **Finetuned from model:** [tiiuae/Falcon3-10B-Instruct](https://huggingface.co/tiiuae/Falcon3-10B-Instruct)
118
 
119
+ This model was fine-tuned from the Falcon-10B-Instruct model. It was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
120
+
121
+ This model is intended for text generation tasks, with a focus on reasoning capabilities and instruction following, similar to capabilities demonstrated by the ChatGPT-O1-Mini model.
122
+
123
+ ## Training Details
124
+
125
+ This model was fine-tuned with Unsloth and TRL, resulting in significant speed improvements during the training process. Details on specific fine-tuning data, parameters and methods will be added soon. The fine-tuning process has prioritized improving the model's reasoning abilities on various benchmarks.
126
+
127
+ ## Intended Use
128
+
129
+ This model is intended for research and development purposes related to text generation, instruction following, and complex reasoning tasks. It is suitable for applications that require a model capable of handling multi-step logical problems and understanding nuanced instructions.
130
+
131
+ **Focus on Reasoning:** The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks.
132
+
133
+
134
+ ---