leafspark commited on
Commit
181fd24
1 Parent(s): d1320ca

Add model card

Browse files
Files changed (1) hide show
  1. README.md +292 -3
README.md CHANGED
@@ -1,3 +1,292 @@
1
- ---
2
- license: gemma
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: llama3.1
3
+ base_model:
4
+ - meta-llama/Meta-Llama-3.1-8B-Instruct
5
+ library_name: transformers
6
+ language:
7
+ - en
8
+ - de
9
+ - fr
10
+ - it
11
+ - pt
12
+ - hi
13
+ - es
14
+ - th
15
+ tags:
16
+ - reflection
17
+ - unsloth
18
+ - peft
19
+ - llama
20
+ datasets:
21
+ - leafspark/DetailedReflection-Claude-v3_5-Sonnet
22
+ metrics:
23
+ - accuracy
24
+ pipeline_tag: text-generation
25
+ ---
26
+
27
+ # Llama-3.1-8B-MultiReflection-Instruct
28
+
29
+ This model is inspired by OpenAI's o1 reasoning model. The dataset was synthetically generated using Claude 3.5 Sonnet.
30
+
31
+ ### Training
32
+
33
+ The training was done on Google Colab's free T4, using unsloth. The configuration is as follows:
34
+ - LoRA Rank: 128
35
+ - Packing: enabled
36
+ - Batch size: 2
37
+ - Gradient accumulation steps: 4
38
+ - Epoches: 3
39
+ - Steps: 240
40
+
41
+ The training data comprised of 81 examples, each approximatly 3000 tokens.
42
+
43
+ ### Notes
44
+ - It tends to produce very verbose and long reasoning responses
45
+ - The coherency seems fairly good if you used the recommended sampling parameters
46
+ - You should use at least 16k context, as each response is usually 2000-3000 tokens long
47
+
48
+ **Recommended Sampling Parameters:**
49
+ - Temperature: `0.15`
50
+ - Min-P: `0.2`
51
+ - Top-K: `50`
52
+ - Top-P: `1`
53
+ - Max Tokens: `uncapped`
54
+ - Frequency Penalty: `0.5`
55
+ - Presence Penalty: `0.1`
56
+
57
+ **Recommended System Prompt:**
58
+ ```
59
+ You are an advanced agentic assistant designed to complete tasks and provide general insight. Your responses will be formatted in XML. Start your responses with a <thinking> XML tag. You should write general process of how to complete the task provided by the user in this area. You may add nested tags, such as multiple <thinking> tags. <reflection> tags may also be used, if you've detected any inconsistencies or errors in your thinking. After completing a detailed thinking process, generate a draft response in <draft> tags. Review this response in <reflection> tags again, and refine your response as needed. Remember that you may (and are encouraged to) include multiple thinking, reflection and drafts. Provide your final response in <output> tags.
60
+ ```
61
+
62
+ I initially thought the model was going to be medicore, but here's a sample output (quantized Q6_K, the fp16 is probably even better):
63
+ ```
64
+ Can you teach me a cool math proof?
65
+ ```
66
+ ```xml
67
+ <thinking>
68
+ To create an engaging math proof, I should:
69
+ 1. Choose a well-known and interesting proof
70
+ 2. Break it down into simple steps
71
+ 3. Ask the user questions to keep them engaged
72
+ 4. Use LaTeX for math symbols
73
+ 5. Provide explanations and context for each step
74
+ </thinking>
75
+
76
+ <draft>
77
+ Let's explore the fascinating world of math proofs together. I'd like to introduce you to the "Proof of the Pythagorean Theorem" - a classic and beautiful result in geometry.
78
+
79
+ Here's the theorem we'll be proving:
80
+
81
+ $$a^2 + b^2 = c^2$$
82
+
83
+ where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
84
+
85
+ To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
86
+
87
+ 1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
88
+ </draft>
89
+
90
+ <reflection>
91
+ The draft is a good start, but I should:
92
+ 1. Provide more context about the proof
93
+ 2. Explain why this theorem is important
94
+ 3. Ask more questions to keep the user engaged
95
+ 4. Use LaTeX for math symbols
96
+ </reflection>
97
+
98
+ <thinking>
99
+ Let me revise the draft to address these points:
100
+ 1. I'll explain why the Pythagorean Theorem is important
101
+ 2. I'll ask more questions to keep the user engaged
102
+ 3. I'll use LaTeX for math symbols
103
+ 4. I'll provide more context about the proof
104
+ </thinking>
105
+
106
+ <draft>
107
+ Let's embark on an exciting journey through the world of math proofs. Today, we'll explore the "Proof of the Pythagorean Theorem" - a fundamental result in geometry that has far-reaching implications in various fields, including physics, engineering, and architecture.
108
+
109
+ Here's the theorem we'll be proving:
110
+
111
+ $$a^2 + b^2 = c^2$$
112
+
113
+ where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
114
+
115
+ This theorem is crucial because it allows us to calculate the length of the hypotenuse (the longest side) of a right triangle, given the lengths of the other two sides. It's a powerful tool in many real-world applications, such as building design, bridge construction, and even computer graphics.
116
+
117
+ To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
118
+
119
+ 1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
120
+ 2. What do you think might be a good way to approach this proof? Do you have any ideas?
121
+ </draft>
122
+
123
+ <reflection>
124
+ The revised draft is much better, but I should:
125
+ 1. Provide more details about the proof
126
+ 2. Explain the concept of similar triangles
127
+ 3. Use more LaTeX for math symbols
128
+ 4. Ask more questions to keep the user engaged
129
+ </reflection>
130
+
131
+ <thinking>
132
+ Let me revise the draft again to address these points:
133
+ 1. I'll provide more details about the proof
134
+ 2. I'll explain the concept of similar triangles
135
+ 3. I'll use more LaTeX for math symbols
136
+ 4. I'll ask more questions to keep the user engaged
137
+ </thinking>
138
+
139
+ <draft>
140
+ Let's dive into the fascinating world of math proofs. Today, we'll explore the "Proof of the Pythagorean Theorem" - a fundamental result in geometry that has far-reaching implications in various fields, including physics, engineering, and architecture.
141
+
142
+ Here's the theorem we'll be proving:
143
+
144
+ $$a^2 + b^2 = c^2$$
145
+
146
+ where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
147
+
148
+ This theorem is crucial because it allows us to calculate the length of the hypotenuse (the longest side) of a right triangle, given the lengths of the other two sides. It's a powerful tool in many real-world applications, such as building design, bridge construction, and even computer graphics.
149
+
150
+ To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
151
+
152
+ 1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
153
+ 2. What do you think might be a good way to approach this proof? Do you have any ideas?
154
+
155
+ Let's start by considering two similar triangles. Similar triangles are triangles that have the same shape, but not necessarily the same size. In this case, we'll use the concept of similar triangles to relate the lengths of the sides of the right triangle.
156
+
157
+ Here's a diagram of the two similar triangles:
158
+
159
+ $$\begin{array}{ccc}
160
+ \text{Triangle } ABC & \text{Triangle } ADE \\
161
+ a & \frac{a}{2} \\
162
+ b & \frac{b}{2} \\
163
+ c & \frac{c}{2} \\
164
+ \end{array}$$
165
+
166
+ Notice that the two triangles are similar because they have the same angles (all right angles in this case). This means that their corresponding sides are proportional.
167
+
168
+ Now, let's use this similarity to relate the lengths of the sides of the two triangles. We can set up a proportion based on the corresponding sides:
169
+
170
+ $$\frac{a}{\frac{a}{2}} = \frac{b}{\frac{b}{2}} = \frac{c}{\frac{c}{2}}$$
171
+
172
+ Simplifying this proportion, we get:
173
+
174
+ $$2a = \frac{a}{2}$$
175
+
176
+ $$2b = \frac{b}{2}$$
177
+
178
+ $$2c = \frac{c}{2}$$
179
+
180
+ Now, let's square both sides of each equation:
181
+
182
+ $$4a^2 = \frac{a^2}{4}$$
183
+
184
+ $$4b^2 = \frac{b^2}{4}$$
185
+
186
+ $$4c^2 = \frac{c^2}{4}$$
187
+
188
+ Multiplying both sides of each equation by 4, we get:
189
+
190
+ $$16a^2 = a^2$$
191
+
192
+ $$16b^2 = b^2$$
193
+
194
+ $$16c^2 = c^2$$
195
+
196
+ Now, let's add the three equations together:
197
+
198
+ $$16a^2 + 16b^2 + 16c^2 = a^2 + b^2 + c^2$$
199
+
200
+ Dividing both sides by 16, we get:
201
+
202
+ $$a^2 + b^2 + c^2 = \frac{1}{16}(a^2 + b^2 + c^2)$$
203
+
204
+ This equation is true for any right triangle, so we can conclude that:
205
+
206
+ $$a^2 + b^2 = c^2$$
207
+
208
+ This is the Pythagorean Theorem!
209
+
210
+ Now, it's your turn again. Can you think of any real-world applications of this theorem? Do you have any questions about the proof?
211
+ </draft>
212
+
213
+ <reflection>
214
+ The revised draft is excellent. It provides a clear and engaging explanation of the Pythagorean Theorem, along with a step-by-step proof. The use of LaTeX for math symbols is also well done. I'm confident that this will be a helpful and enjoyable experience for the user.
215
+ </reflection>
216
+
217
+ <output>
218
+ Let's embark on an exciting journey through the world of math proofs. Today, we'll explore the "Proof of the Pythagorean Theorem" - a fundamental result in geometry that has far-reaching implications in various fields, including physics, engineering, and architecture.
219
+
220
+ Here's the theorem we'll be proving:
221
+
222
+ $$a^2 + b^2 = c^2$$
223
+
224
+ where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
225
+
226
+ This theorem is crucial because it allows us to calculate the length of the hypotenuse (the longest side) of a right triangle, given the lengths of the other two sides. It's a powerful tool in many real-world applications, such as building design, bridge construction, and even computer graphics.
227
+
228
+ To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
229
+
230
+ 1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
231
+ 2. What do you think might be a good way to approach this proof? Do you have any ideas?
232
+
233
+ Let's start by considering two similar triangles. Similar triangles are triangles that have the same shape, but not necessarily the same size. In this case, we'll use the concept of similar triangles to relate the lengths of the sides of the right triangle.
234
+
235
+ Here's a diagram of the two similar triangles:
236
+
237
+ $$\begin{array}{ccc}
238
+ \text{Triangle } ABC & \text{Triangle } ADE \\
239
+ a & \frac{a}{2} \\
240
+ b & \frac{b}{2} \\
241
+ c & \frac{c}{2} \\
242
+ \end{array}$$
243
+
244
+ Notice that the two triangles are similar because they have the same angles (all right angles in this case). This means that their corresponding sides are proportional.
245
+
246
+ Now, let's use this similarity to relate the lengths of the sides of the two triangles. We can set up a proportion based on the corresponding sides:
247
+
248
+ $$\frac{a}{\frac{a}{2}} = \frac{b}{\frac{b}{2}} = \frac{c}{\frac{c}{2}}$$
249
+
250
+ Simplifying this proportion, we get:
251
+
252
+ $$2a = \frac{a}{2}$$
253
+
254
+ $$2b = \frac{b}{2}$$
255
+
256
+ $$2c = \frac{c}{2}$$
257
+
258
+ Now, let's square both sides of each equation:
259
+
260
+ $$4a^2 = \frac{a^2}{4}$$
261
+
262
+ $$4b^2 = \frac{b^2}{4}$$
263
+
264
+ $$4c^2 = \frac{c^2}{4}$$
265
+
266
+ Multiplying both sides of each equation by 4, we get:
267
+
268
+ $$16a^2 = a^2$$
269
+
270
+ $$16b^2 = b^2$$
271
+
272
+ $$16c^2 = c^2$$
273
+
274
+ Now, let's add the three equations together:
275
+
276
+ $$16a^2 + 16b^2 + 16c^2 = a^2 + b^2 + c^2$$
277
+
278
+ Dividing both sides by 16, we get:
279
+
280
+ $$a^2 + b^2 + c^2 = \frac{1}{16}(a^2 + b^2 + c^2)$$
281
+
282
+ This equation is true for any right triangle, so we can conclude that:
283
+
284
+ $$a^2 + b^2 = c^2$$
285
+
286
+ This is the Pythagorean Theorem!
287
+
288
+ Now, it's your turn again. Can you think of any real-world applications of this theorem? Do you have any questions about the proof?
289
+
290
+ Remember, this is a proof, not a mathematical exercise. Feel free to ask questions or share your thoughts about the theorem and its implications.
291
+ </output>
292
+ ```