Pinkstack committed · verified · Commit 921b7e4 · 1 Parent(s): 7ceb94f

Update README.md



Files changed (1)
  1. README.md +24 -5
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- base_model: Pinkstack/Superthoughts-lite-1.7B-QwQ-o1
  tags:
  - text-generation-inference
  - transformers
@@ -7,17 +7,36 @@ tags:
  - llama
  - trl
  - sft
  license: apache-2.0
  language:
  - en
  ---

  # Uploaded model

  - **Developed by:** Pinkstack
  - **License:** apache-2.0
- - **Finetuned from model :** Pinkstack/Superthoughts-lite-1.7B-QwQ-o1
-
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
  ---
+ base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
  tags:
  - text-generation-inference
  - transformers
  - llama
  - trl
  - sft
+ - code
+ - superthoughts
+ - cot
+ - reasoning
  license: apache-2.0
  language:
  - en
+ pipeline_tag: text-generation
  ---
+ # Information
+ Advanced, high-quality, lite reasoning at a tiny size that you can run locally in Q8 on your phone! 😲
+ ⚠️ This is an experimental version: it may enter reasoning loops, and it may not always answer your question properly or correctly. An updated version will be released later on. Currently, reasoning may only work in single-turn conversations, as we've trained it on single-turn conversations only.
+ ![superthoughtslight.png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/2LuPB_ZPCGni3-PyCkL0-.png)
+ We continually pre-trained SmolLM2-1.7B-Instruct on advanced reasoning patterns, then SFT fine-tuned that continually pre-trained version on reasoning once again.
+ # Examples:
+ All responses below were generated with no system prompt, a maximum of 400 tokens, and a temperature of 0.7 (not recommended; 0.3 - 0.5 is better).
+ Generated inside the Android application PocketPal via GGUF Q8.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/wh33o-vjxIePfPqoN3q1z.png)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/7JeF3YNNhrlY2tED4rpFJ.png)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/Y8optw73kTgqMnZKj3wKj.png)
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/6lywy3IYEIgzPnUIJ5RvF.png)

  # Uploaded model

  - **Developed by:** Pinkstack
  - **License:** apache-2.0
+ - **Finetuned from model:** HuggingFaceTB/SmolLM2-1.7B-Instruct

+ This SmolLM2 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
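
The updated card recommends single-turn prompts, no system prompt, a temperature of 0.3 - 0.5, and a 400-token limit. A minimal sketch of those settings, assuming a SmolLM2-style ChatML chat template (the template details and the example question are assumptions, not taken from the card):

```python
# Sketch: build a single-turn prompt plus the sampling settings the model
# card recommends. The ChatML-style template below is an assumption (it is
# the format SmolLM2-Instruct uses); in practice, verify it against the
# tokenizer's own chat template.

def format_single_turn(user_message: str) -> str:
    """Wrap one user turn in a ChatML-style prompt, with no system
    prompt, per the card's recommendation."""
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Card-recommended sampling settings: temperature 0.3-0.5, up to 400 tokens.
GENERATION_KWARGS = {
    "max_new_tokens": 400,
    "temperature": 0.4,
    "do_sample": True,
}

prompt = format_single_turn("Which is larger, 9.9 or 9.11?")
print(prompt)
```

With `transformers`, the prompt and `GENERATION_KWARGS` would be passed to a `text-generation` pipeline loaded from this model's repository; the repo id is not stated in the diff, so it is left out here.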