Doctor-Shotgun committed
Commit: 4488107
Parent(s): 9f6103d
Update README.md

README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 
 Experimental model meant to serve as a long-context speculative decoding model. This one is specifically for models trained on the LimaRP prompt format.
 
-Created using [Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft](https://huggingface.co/Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft) and finetuning at 32768 context length on
+Created using [Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft](https://huggingface.co/Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft) and finetuning at 32768 context length on the LimaRP dataset.
 
 This variant uses the rope theta (rope frequency base) method for context extension.
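The README describes the model as a draft model for speculative decoding: a small model proposes tokens cheaply and a large target model verifies them. A toy greedy sketch of that accept/reject loop, with hypothetical stand-in "models" (not this model's actual inference code), might look like:

```python
# Toy sketch of greedy speculative decoding. `draft_next` and
# `target_next` are hypothetical placeholder models, not real LLMs;
# both follow the same rule here, so every draft token is accepted.
def draft_next(ctx):
    # placeholder draft model: next token is last token + 1 (mod 100)
    return (ctx[-1] + 1) % 100

def target_next(ctx):
    # placeholder target model: same rule as the draft
    return (ctx[-1] + 1) % 100

def speculative_decode(ctx, k=4, steps=8):
    ctx = list(ctx)
    for _ in range(steps):
        # 1) the draft model proposes k tokens autoregressively
        proposal, tmp = [], list(ctx)
        for _ in range(k):
            t = draft_next(tmp)
            proposal.append(t)
            tmp.append(t)
        # 2) the target model verifies each proposed token in order;
        #    on the first mismatch it substitutes its own token and stops
        accepted = []
        for t in proposal:
            if target_next(ctx + accepted) == t:
                accepted.append(t)
            else:
                accepted.append(target_next(ctx + accepted))
                break
        ctx.extend(accepted)
    return ctx

print(speculative_decode([1], k=4, steps=2))  # → [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

When draft and target agree (as they do here by construction), all k proposed tokens are accepted per step, which is the speedup a well-matched draft model provides.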
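The "rope theta (rope frequency base)" method mentioned in the README extends the usable context by raising the base of RoPE's inverse-frequency schedule, so positional rotations vary more slowly and remain distinguishable over longer sequences. A minimal sketch of the frequency schedule (illustrative values; the actual training configuration is not shown in this commit):

```python
def rope_inv_freq(dim, theta):
    # RoPE inverse frequencies: theta^(-2i/dim) for i = 0 .. dim/2 - 1
    return [theta ** (-(2 * i) / dim) for i in range(dim // 2)]

# Llama's conventional default base is 10000; context extension via
# the rope-theta method raises it (1e6 here is only an example value).
base = rope_inv_freq(64, 10_000.0)
extended = rope_inv_freq(64, 1_000_000.0)

# The highest-frequency component is unchanged (exponent 0)...
print(base[0], extended[0])  # → 1.0 1.0
# ...while the lowest-frequency components rotate much more slowly,
# stretching the range of positions the embedding can distinguish.
print(extended[-1] < base[-1])  # → True
```

With a larger theta, a given pair of distant positions receives rotation angles that would, under the original base, only occur at much shorter distances, which is why raising the base is a simple way to target 32768-token contexts.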